The best dictation software in 2024

These speech-to-text apps will save you time without sacrificing accuracy.

The early days of dictation software were like your friend that mishears lyrics: lots of enthusiasm but little accuracy. Now, AI is out of Pandora's box, both in the news and in the apps we use, and dictation apps are getting better and better because of it. They're still not perfect, but you'll definitely feel more in control when using your voice to type.

I took to the internet to find the best speech-to-text software out there right now, and after monologuing at length in front of dozens of dictation apps, these are my picks for the best.

The best dictation software

Apple Dictation for free dictation software on Apple devices

Windows 11 Speech Recognition for free dictation software on Windows

Dragon by Nuance for a customizable dictation app

Google Docs voice typing for dictating in Google Docs

Gboard for a free mobile dictation app

Otter for collaboration

What is dictation software?

When searching for dictation software online, you'll come across a wide range of options. The ones I'm focusing on here are apps or services that you can quickly open, start talking to, and see the results on your screen in (near) real time. This is great for taking quick notes, writing emails without typing, or talking out an entire novel while you walk in your favorite park, because why not.

Beyond these productivity uses, people with disabilities or with carpal tunnel syndrome can use this software to type more easily. It makes technology more accessible to everyone.

If this isn't what you're looking for, here's what else is out there:

AI assistants, such as Apple's Siri, Amazon's Alexa, and Microsoft's Cortana, can help you interact with each of these ecosystems to send texts, buy products, or schedule events on your calendar.

AI meeting assistants will join your meetings and transcribe everything, generating meeting notes to share with your team.

AI transcription platforms can process your video and audio files into neat text.

Transcription services that use a combination of dictation software, AI, and human proofreaders can achieve above 99% accuracy.

There are also advanced platforms for enterprise, like Amazon Transcribe and Microsoft Azure's speech-to-text services.

What makes a great dictation app?

How we evaluate and test apps.

Our best apps roundups are written by humans who've spent much of their careers using, testing, and writing about software. Unless explicitly stated, we spend dozens of hours researching and testing apps, using each app as it's intended to be used and evaluating it against the criteria we set for the category. We're never paid for placement in our articles from any app or for links to any site. We value the trust readers put in us to offer authentic evaluations of the categories and apps we review. For more details on our process, read the full rundown of how we select apps to feature on the Zapier blog.

Dictation software comes in different shapes and sizes. Some are integrated in products you already use. Others are separate apps that offer a range of extra features. While each can vary in look and feel, here's what I looked for to find the best:

High accuracy. Staying true to what you're saying is the most important feature here. The lowest score on this list is at 92% accuracy.

Ease of use. This isn't a high hurdle, as most options are basic enough that anyone can figure them out in seconds.

Availability of voice commands. These let you add "instructions" while you're dictating, such as adding punctuation, starting a new paragraph, or more complex commands like capitalizing all the words in a sentence.

Language support. Most of the picks here support a decent (or impressive) number of languages.

Versatility. I paid attention to how well the software could adapt to different circumstances, apps, and systems.

I tested these apps by reading a 200-word script containing numbers, compound words, and a few tricky terms. I read the script three times for each app: the accuracy scores are an average of all attempts. Finally, I used the voice commands to delete and format text and to control the app's features where available.
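The article doesn't spell out how each accuracy score was calculated, so the following is only a minimal sketch of one common approach: count word-level errors (substitutions, insertions, and deletions) between the script and the transcript, convert that to a percentage of the script's length, and average across attempts. The function names and the normalization rule (lowercasing and ignoring punctuation) are assumptions made for illustration.

```python
import re


def normalize(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_errors(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from transcript
                            curr[j - 1] + 1,      # extra word in transcript
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def accuracy(script, transcript):
    """Share of the script's words transcribed correctly, between 0 and 1."""
    reference = normalize(script)
    hypothesis = normalize(transcript)
    if not reference:
        raise ValueError("The script must contain at least one word.")
    return max(0.0, 1 - word_errors(reference, hypothesis) / len(reference))


def average_accuracy(script, transcripts):
    """Average the accuracy over several attempts at the same script."""
    return sum(accuracy(script, t) for t in transcripts) / len(transcripts)


if __name__ == "__main__":
    script = "The quick brown fox jumps"
    attempts = [
        "The quick brown fox jumps.",  # perfect
        "the quick brown box jumps",   # one substitution
        "the quick fox",               # two missing words
    ]
    print(f"{average_accuracy(script, attempts):.0%}")
```

Scoring this way penalizes extra words as well as missed ones, which is why the result is clamped at zero for very poor transcripts.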

I used my laptop's or smartphone's microphone to test these apps in a quiet room without background noise. For occasional dictation, an equivalent microphone on your own computer or smartphone should do the job well. If you're doing a lot of dictation every day, it's probably worth investing in an external microphone, like the Jabra Evolve.

What about AI?

Before the ChatGPT boom, AI wasn't as hot a keyword, but it already existed. The apps on this list use a combination of technologies that may include AI, machine learning and natural language processing (NLP) in particular. While they could rebrand themselves to keep up with the hype, they may use pipelines or models that aren't as bleeding-edge when compared to what's going on in Hugging Face or under OpenAI Whisper's hood, for example.

Also, since this isn't a hot AI software category, these apps may prefer to focus on their core offering and product quality instead, not ride the trendy wave by slapping "AI-powered" on every web page.

Tips for using voice recognition software

Though dictation software is pretty good at recognizing different voices, it's not perfect. Here are some tips to make it work as best as possible.

Speak naturally (with caveats). Dictation apps learn your voice and speech patterns over time. And if you're going to spend any time with them, you want to be comfortable. Speak naturally. If you're not getting 90% accuracy initially, try enunciating more.  

Punctuate. When you dictate, you have to say each period, comma, question mark, and so forth. The software isn't always smart enough to figure it out on its own.

Learn a few commands. Take the time to learn a few simple commands, such as "new line" to enter a line break. There are different commands for composing, editing, and operating your device. Commands may differ from app to app, so learn the ones that apply to the tool you choose.

Know your limits. Especially on mobile devices, some tools have a time limit for how long they can listen—sometimes for as little as 10 seconds. Glance at the screen from time to time to make sure you haven't blown past the mark. 

Practice. It takes time to adjust to voice recognition software, but it gets easier the more you practice. Some of the more sophisticated apps invite you to train by reading passages or doing other short drills. Don't shy away from tutorials, help menus, and on-screen cheat sheets.

Best free dictation software for Apple devices

Apple Dictation (iOS, iPadOS, macOS)

The interface for Apple Dictation, our pick for the best free dictation app for Apple users

Look no further than your Mac, iPhone, or iPad for one of the best dictation tools. Apple's built-in dictation feature, powered by Siri (I wouldn't be surprised if the two merged one day), ships as part of Apple's desktop and mobile operating systems. On iOS devices, you use it by pressing the microphone icon on the stock keyboard. On your desktop, you turn it on by going to System Preferences > Keyboard > Dictation, and then use a keyboard shortcut to activate it in your app.

If you want the ability to navigate your Mac with your voice and use dictation, try Voice Control. By default, Voice Control requires the internet to work and has a time limit of about 30 seconds for each smattering of speech. To remove those limits for a Mac, enable Enhanced Dictation, and follow the directions here for your OS (you can also enable it for iPhones and iPads). Enhanced Dictation adds a local file to your device so that you can dictate offline.

You can format and edit your text using simple commands, such as "new paragraph" or "select previous word." Tip: you can view available commands in a small window, like a little cheat sheet, while learning the ropes. Apple also offers a number of advanced commands for things like math, currency, and formatting. 

Apple Dictation price: Included with macOS, iOS, iPadOS, and Apple Watch.

Apple Dictation accuracy: 96%. I tested this on an iPhone SE 3rd Gen using the dictation feature on the keyboard.

Recommendation: For the occasional dictation, I'd recommend the standard Dictation feature available with all Apple systems. But if you need more custom voice features (e.g., medical terms), opt for Voice Control with Enhanced Dictation. You can create and import both custom vocabulary and custom commands and work while offline.

Apple Dictation supported languages: 59 languages and dialects.

While Apple Dictation is available natively on the Apple Watch, if you're serious about recording plenty of voice notes and memos, check out the Just Press Record app. It runs on the same engine and keeps all your recordings synced and organized across your Apple devices.

Best free dictation software for Windows

Windows 11 Speech Recognition (Windows)

The interface for Windows Speech Recognition, our pick for the best free dictation app for Windows

Windows 11 Speech Recognition (also known as Voice Typing) is a strong dictation tool, both for writing documents and controlling your Windows PC. Since it's part of your system, you can use it in any app you have installed.

To start, check that online speech recognition is on by going to Settings > Time and Language > Speech. To begin dictating, open an app, and on your keyboard, press the Windows logo key + H. A microphone icon and gray box will appear at the top of your screen. Make sure your cursor is in the space where you want to dictate.

When it's ready for your dictation, it will say Listening. You have about 10 seconds to start talking before the microphone turns off. If that happens, just click it again and wait for Listening to pop up. To stop the dictation, click the microphone icon again or say "stop talking."

As I dictated into a Word document, the gray box reminded me to "hang on, we need a moment to catch up." If you're speaking too fast, you'll also notice your transcribed words aren't keeping up. This never posed an issue with accuracy, but it's a nice reminder to keep it slow and steady.

To activate the computer control features, you'll have to go to Settings > Accessibility > Speech instead. While there, tick on Windows Speech Recognition. This unlocks a range of new voice commands that can fully replace a mouse and keyboard. Your voice becomes the main way of interacting with your system.

While you can use this tool anywhere inside your computer, if you're a Microsoft 365 subscriber, you'll be able to use the dictation features there too. The best app to use it on is, of course, Microsoft Word: it even offers file transcription, so you can upload a WAV or MP3 file and turn it into text. The engine is the same, provided by Microsoft Speech Services.

Windows 11 Speech Recognition price: Included with Windows 11. Also available as part of the Microsoft 365 subscription.

Windows 11 Speech Recognition accuracy: 95%. I tested it in Windows 11 while using Microsoft Word. 

Windows 11 Speech Recognition supported languages: 11 languages and dialects.

Best customizable dictation software

Dragon by Nuance (Android, iOS, macOS, Windows)

The interface for Dragon, our pick for the best customizable dictation software

In 1990, Dragon Dictate emerged as the first dictation software. Over three decades later, we have Dragon by Nuance, a leader in the industry and a distant cousin of that first iteration. With a variety of software packages and mobile apps for different use cases (e.g., legal, medical, law enforcement), Dragon can handle specialized industry vocabulary, and it comes with excellent features, such as the ability to transcribe text from an audio file you upload. 

For this test, I used Dragon Anywhere, Nuance's mobile app, as it's the only version—among otherwise expensive packages—available with a free trial. It includes lots of features not found in the others, like Words, which lets you add words that would be difficult to recognize and spell out. For example, in the script, the word "Litmus'" (with the possessive) gave every app trouble. To avoid this, I added it to Words, trained it a few times with my voice, and was then able to transcribe it accurately.

It also provides shortcuts. If you want to shorten your entire address to one word, go to Auto-Text, give it a name ("address"), type in your address (1000 Eichhorn St., Davenport, IA 52722), and hit Save. The next time you dictate and say "address," you'll get the entire thing. Press the comment bubble icon to see text commands while you're dictating, or say "What can I say?" and the command menu pops up.
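To make the shortcut idea concrete, here's a rough sketch of how text expansion of this kind can work in general: a saved name is swapped for its longer text whenever it appears as a whole word in the transcript. This is not Dragon's implementation; the function name and the matching rules (case-insensitive, whole words only) are assumptions for illustration.

```python
import re


def expand_auto_text(transcript, shortcuts):
    """Replace each whole-word shortcut name with its saved expansion."""
    if not shortcuts:
        return transcript
    # Try longer names first so a short name never shadows a longer one.
    names = sorted(shortcuts, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    lookup = {name.lower(): text for name, text in shortcuts.items()}
    return pattern.sub(lambda match: lookup[match.group(1).lower()], transcript)


if __name__ == "__main__":
    shortcuts = {"address": "1000 Eichhorn St., Davenport, IA 52722"}
    print(expand_auto_text("Please ship it to address by Friday", shortcuts))
```

Matching whole words only means a word like "addresses" is left alone, though a real product would also need a way to say the shortcut word itself without triggering the expansion.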

Once you complete a dictation, you can email, share (e.g., Google Drive, Dropbox), open in Word, or save to Evernote. You can perform these actions manually or by voice command (e.g., "save to Evernote"). Once you name it, it automatically saves in Documents for later review or sharing.

Accuracy is good and improves with use, showing that you can definitely train your dragon. It's a great choice if you're serious about dictation and plan to use it every day, but may be a bit too much if you're just using it occasionally.

Dragon by Nuance price: $15/month for Dragon Anywhere (iOS and Android); from $200 to $500 for desktop packages

Dragon by Nuance accuracy: 97%. I tested it in the Dragon Anywhere iOS app.

Dragon by Nuance supported languages: 6 languages and dialects in Dragon Anywhere and 8 languages and dialects in Dragon Desktop.  

Best free mobile dictation software

Gboard (Android, iOS)

The interface for Gboard, our pick for the best mobile dictation software

Gboard, also known as Google Keyboard, is a free keyboard native to Android phones. It's also available for iOS: go to the App Store, download the Gboard app, and then activate the keyboard in the settings. In addition to typing, it lets you search the web, translate text, or run a quick Google Maps search.

Back to the topic: it has an excellent dictation feature. To start, press the microphone icon on the top-right of the keyboard. An overlay appears on the screen, filling itself with the words you're saying. It's very quick and accurate, which will feel great for fast-talkers but probably intimidating for the more thoughtful among us. If you stop talking for a few seconds, the overlay disappears, and Gboard pastes what it heard into the app you're using. When this happens, tap the microphone icon again to continue talking.

Wherever you can open a keyboard while using your phone, you can have Gboard supporting you there. You can write emails or notes or use any other app with an input field.

The writer who handled the previous update of this list had been using Gboard for seven years, so it had plenty of training data to adapt to his particular enunciation, landing the accuracy at an amazing 98%. I hadn't used it much before, so the best I got was 92% overall. It's still a great score. More than that, it's proof of how dictation apps improve the more you use them.

Gboard price: Free

Gboard accuracy: 92%. With training, it can go up to 98%. I tested it using the iOS app while writing a new email.

Gboard supported languages: 916 languages and dialects.

Best dictation software for typing in Google Docs

Google Docs voice typing (Web on Chrome)

The interface for Google Docs voice typing, our pick for the best dictation software for Google Docs

Just like Microsoft offers dictation in their Office products, Google does the same for their Workspace suite. The best place to use the voice typing feature is in Google Docs, but you can also dictate speaker notes in Google Slides as a way to prepare for your presentation.

To get started, make sure you're using Chrome and have a Google Docs file open. Go to Tools > Voice typing, and press the microphone icon to start. As you talk, the text will jitter into existence in the document.

You can change the language in the dropdown on top of the microphone icon. If you need help, hover over that icon, and click the ? on the bottom-right. That will show everything from turning on the mic to the voice commands for dictation and moving around the document.

It's unclear whether Google's voice typing here is connected to the same engine in Gboard. I wasn't able to confirm whether the training data for the mobile keyboard and this tool are connected in any way. Still, the engines feel very similar and turned out the same accuracy at 92%. If you start using it more often, it may adapt to your particular enunciation and be more accurate in the long run.

Google Docs voice typing price: Free

Google Docs voice typing accuracy: 92%. Tested in a new Google Docs file in Chrome.

Google Docs voice typing supported languages: 118 languages and dialects; voice commands only available in English.

Google Docs integrates with Zapier, which means you can automatically do things like save form entries to Google Docs, create new documents whenever something happens in your other apps, or create project management tasks for each new document.

Best dictation software for collaboration

Otter (Web, Android, iOS)

Otter, our pick for the best dictation software for collaboration

Most of the time, you're dictating for yourself: your notes, emails, or documents. But there may be situations in which sharing and collaboration are more important. For those moments, Otter is the better option.

It's not as robust in terms of dictation as others on the list, but it compensates with its versatility. It's a meeting assistant, first and foremost, ready to hop on your meetings and transcribe everything it hears. This is great to keep track of what's happening there, making the text available for sharing by generating a link or in the corresponding team workspace.

The reason why it's the best for collaboration is that others can highlight parts of the transcript and leave their comments. It also separates multiple speakers, in case you're recording a conversation, so that's an extra headache-saver if you use dictation software for interviewing people.

When you open the app and click the Record button on the top-right, you can use it as a traditional dictation app. It doesn't support voice commands, but it has decent intuition as to where the commas and periods should go based on the intonation and rhythm of your voice. Once you're done talking, Otter will start processing what you said, extract keywords, and generate action items and notes from the content of the transcription.

If you're going for long recording stretches where you talk about multiple topics, there's an AI chat option, where you can ask Otter questions about the transcript. This is great to summarize the entire talk, extract insights, and get a different angle on everything you said.

Not all meeting assistants offer dictation, so Otter sits here on this fence between software categories, a jack-of-two-trades, quite good at both. If you want something more specialized for meetings, be sure to check out the best AI meeting assistants. But if you want a pure dictation app with plenty of voice commands and great control over the final result, the other options above will serve you better.

Otter price: Free plan available for 300 minutes / month. Pro plan starts at $16.99, adding more collaboration features and monthly minutes.

Otter accuracy: 93%. I tested it in the web app on my computer.

Otter supported languages: Only American and British English for now.

Is voice dictation for you?

Dictation software isn't for everyone. It will likely take practice learning to "write" out loud because it will feel unnatural. But once you get comfortable with it, you'll be able to write from anywhere on any device without the need for a keyboard. 

And by using any of the apps I listed here, you can feel confident that most of what you dictate will be accurately captured on the screen. 

This article was originally published in April 2016 and has also had contributions from Emily Esposito, Jill Duffy, and Chris Hawkins. The most recent update was in November 2023.

Miguel Rebelo

Miguel Rebelo is a freelance writer based in London, UK. He loves technology, video games, and huge forests. Track him down at mirebelo.com.

Top 10 Speech to Text Software in 2024

"Words have power," they say. And now, with the remarkable advancements in speech to text, those words hold even greater significance. Imagine effortlessly converting spoken language into written text with just a few clicks or simple voice commands. It's no longer a far-fetched dream but a tangible reality that has reshaped our relationship with technology.

From capturing the essence of interviews to unleashing the creativity of writers to empowering individuals with hearing impairments, speech to text software has become an indispensable tool in our digital toolbox. This rapidly evolving technology has a plethora of options, making it essential to have an understanding of the market leaders.

This article has you covered. We have curated a list of the best speech to text software based on key features, unique selling propositions, advantages, and limitations to help you make an informed choice that fits your specific needs perfectly.

Top 10 Speech to Text Software of 2024

Here are the best speech to text apps shaping how we convert voice into text.

Otter.ai

Otter.ai, an innovative AI-powered speech to text software, is known for its precise transcription services. It uses ambient voice intelligence (AVI), a unique feature that enhances the tool's learning capabilities, improving accuracy as it is used more.

Key features

Live transcription: Changes voice to text instantly, aids work.

Voice sharing: Enables voiceprint exchange for easy collaboration.

Talk recording: Stores conversations, useful for reference and documents.

However, users should be mindful of a few limitations. Otter.ai has a monthly cap on transcription time and may delay the final text from an audio recording. Despite this, its robust features make it an exceptional choice for accurate speech to text conversions.

IBM Watson Speech to Text

IBM Watson Speech to Text, a cloud-native solution on this list, is a unique AI-powered tool with impressive capabilities. It provides real-time transcription alongside an option for batch conversion of audio files, catering to various languages, audio frequencies, and output preferences.

Key features

Speaker Diarization: Differentiates speakers, currently in beta.

Watson Assistant Integration: Watson can be integrated with the Watson Assistant to process natural language questions directly.

Security and Deployment: Ensures data security, with flexible deployment on cloud or on-premises.

Compared to competitors, IBM Watson's cost may be a deterrent for some. The beta multi-speaker recognition feature's inconsistency could be a concern for users.

Despite its pricing and a few ongoing tweaks, IBM Watson speech to text is the best speech to text software that emphasizes accuracy, flexibility, and a user-friendly interface, making it an outstanding choice for businesses and individuals alike.

Amazon Transcribe

A standout in the speech to text software landscape, Amazon Transcribe is a cloud-based solution developed for app integration. It delivers remarkably accurate transcriptions, even from low-quality audio sources, a key advantage for environments like contact centers.

Key features

Vocabulary editing: Ensures consistent product names, simplifying transcript analysis.

Audio for apps: Facilitates direct integration into custom apps.

Speaker and channel recognition: Differentiates multiple speakers and annotates transcripts accordingly.

However, adding industry-specific vocabulary can be cumbersome, and transcriptions may need careful proofreading for accuracy. Regardless of these, Amazon Transcribe's unique features and applications make it an influential player in the AI speech to text landscape.

Microsoft Azure Speech to Text

Microsoft Azure Speech to Text, part of the Azure cloud service, emerged as an advanced speech recognition platform in 2024. It utilizes deep neural network models to deliver real-time audio transcription and handle multiple speakers.

Key features

Domain-specific recognition: Identifies field-specific terms.

Proper noun adaptation: Adjusts to speech patterns, noises, and specialized vocab.

Microsoft integration: Works smoothly with all Microsoft products, improving convenience.

Azure's complicated setup may challenge users, requiring technical expertise to manage. Ultimately, Microsoft Azure speech to text represents cutting-edge voice recognition platforms, offering an unparalleled service for those seeking a powerful and adaptable speech to text solution.

Nuance Dragon

Dragon Speech Recognition Solutions, owned by Nuance, is an advanced dictation application with powerful AI-based speech recognition capabilities. It offers two products, Dragon Professional and Dragon Anywhere, each designed to cater to different needs. Dragon Professional, intended for professional use, presents robust dictation and document management capabilities.

Key features

High-speed dictation: Takes dictation at 160 words per minute with a 99% accuracy rate.

Custom word list import: Enhances recognition accuracy by incorporating commonly used words.

Audio file transcription: Transcribes audio files sent from a mobile app, facilitating document management.

However, users might find the user interface a tad outdated, and its recording transcription could be better. 

On the other hand, Dragon Anywhere is a fully functional Android and iOS mobile application. It provides a powerful dictation feature powered by cloud technology, syncing with the desktop Dragon software.

Both Dragon tools, despite some limitations, offer high-quality speech recognition and excellent accuracy, making them valuable assets in the speech to text environment.

Braina Pro

Renowned for its exceptional dictation capabilities, Braina Pro is more than just speech to text software. The software shines for its AI-based voice recognition, enabling dictation in over 90 languages with an impressive 99% accuracy.

Key features

Adaptive AI: Software learns from each interaction, enhancing speech understanding.

Multilingual: Braina supports dictation in over 90 languages.

Versatile Assistant: Braina Pro does various tasks, like setting alarms or web searching, not just dictation.

Braina Pro is widely appreciated for its high accuracy and flexible capabilities despite the dated interface and subscription-only model. The software is compatible with Windows, iOS, and Android, and has a companion Android app for remote PC control, further enhancing user convenience.

Verbit

A unique blend of AI and human expertise is what sets Verbit apart from other speech to text software. Specifically designed for enterprise and educational establishments, Verbit uses AI to enhance transcription and captioning.

Key features

Smart AI: Verbit uses speech models and neural networks to reduce noise, identify accents, and deliver accurate transcriptions.

Enterprise focus: Verbit enables collaboration, providing reliable service for businesses and schools.

Fast, Precise Service: High accuracy and speedy results, perfect for situations needing precision.

Verbit may not offer real-time availability or customizable pricing, but their use of AI and human intervention guarantees precise transcriptions. It offers extensive video captioning tools and features real-time status updates, ensuring users can monitor their transcription process conveniently. Given its focus on accuracy and team use, it certainly earns its spot as one of the best speech to text software.

Speechmatics

Speechmatics is a powerful AI-driven speech to text tool that relies on machine learning to convert spoken words into text. It stands out with its automatic speech recognition solution, applicable to both existing audio/video files and live use.

Key features

Accent Support: Speechmatics supports major English accents, versatile for global users.

Media Captioning: Provides captions for videos, useful for multimedia tasks.

Keyword Triggers: Lets users manage specific transcription keywords, adding extra utility.

While the lack of a free version might be a setback, the speech recognition software still shines due to its robust AI performance. It offers one of the most accurate transcriptions in the industry, making it a strong contender for one of the top AI speech to text software.

Gboard

Gboard, a popular keyboard app by Google, is a leading choice for Android users seeking reliable speech to text capabilities. With its hands-free voice typing and swipe functionality, Gboard transforms the typing experience on mobile devices.

Key features

Voice Typing: Gboard enables hands-free text dictation, great for fast messages or notes.

Emoji and GIFs: Integrated emoji and GIF search for interactive chatting.

Multilingual: Supports over 60 languages, reflecting Google's inclusive tech approach.

Gesture Control: Unique typing experience with gesture-based cursor control.

Apart from some drawbacks, such as the lack of shortcut commands and occasional lag in recording audio, Gboard is still lauded for its easy-to-use design and various features. Especially noteworthy is the fact that its voice typing is free, making it accessible to a broad range of users. While it may not fully understand slang or colloquialisms, its overall efficiency as dictation software is undeniable.

Apple Dictation

Apple Dictation, a powerful tool built into Apple's operating systems, shines as a free and convenient speech to text software for Apple devices. Known for its seamless integration and dependable accuracy, Apple Dictation is supported by the technology behind Siri, Apple's voice-controlled assistant.

Key features

Keyboard Dictation: Transforms voice to text in any typing application, boosting productivity.

Audio Sharing: Users can share audio recordings, increasing versatility.

Multi-Language: Though mainly U.S. English-focused, it supports other languages, serving a broad user base.

Although the software is not ideally suited for longer dictations, it excels in transcribing short notes and controlling functions using voice commands. The dictation software remains a powerful tool integrated into Apple's ecosystem, providing an efficient and free solution to transcribe text on Mac devices by activating voice control. 

Tips for Choosing the Right Speech to Text Software

If you're a student, content creator, or executive needing speech to text software, picking the right one is key. Here are some tips for your decision:

Accuracy is paramount when it comes to speech to text software. Look for software that boasts high accuracy rates in transcribing speech to text. User reviews and testimonials can provide valuable insights into the accuracy of different software options.

The software should support a wide range of languages and dialects. It's essential for users who may need to transcribe content in multiple languages or work with a multilingual team.

Users should look for software that allows for the personalization of voice commands and the creation of custom vocabularies. This feature can enhance efficiency and user experience, particularly for users who frequently use industry-specific terminology.

The software should seamlessly integrate with other applications and platforms users already use. This facilitates a smooth workflow and improves productivity.

Pricing plans play a vital role in the selection process. The software should offer competitive pricing without compromising on features and functionality.

Users should explore reviews and testimonials from others to gain insights into user satisfaction and the software's performance in real-world scenarios.

Users should take advantage of free trials or demos to test the software. This can help users assess if the software fits their needs before purchasing.

In the grand symphony of progress, speech to text software has emerged as a brilliant maestro, harmonizing the spoken word with the written, elevating the melody of communication. Each tool, unique in its composition, caters to diverse rhythms and needs. However, remember, the perfect software is the one that orchestrates your voice most harmoniously.

What is speech to text?

Speech to text is a technology that converts spoken words into written text, commonly used for transcription, voice assistants, and accessibility.

What are the benefits of using speech to text software?

Speech to text software enhances productivity, provides accessibility for individuals with hearing impairments, aids in transcribing meetings or interviews, and facilitates the hands-free operation of devices.

Can speech to text software accurately transcribe accents and dialects?

Yes, advanced speech to text software can transcribe accents and dialects with varying degrees of accuracy, improving with machine learning and diverse training data.

Can I use speech to text software on my mobile device?

Yes, many speech to text software options are available on mobile devices, such as Google's Gboard, Apple Dictation, and standalone apps like Otter.ai.

The 9 Best Speech-to-Text Software in 2024 (Ranked)

You talkin' to me? Well, your words just got a whole lot more powerful. 

Today, we're talking about speech-to-text software that's got your back when you want to get those thoughts from your mouth to the page. 

(All without having to use your mammalian digits — what is this, 1985?)

We’ll cover: 

  • What is speech-to-text software?
  • The best 9 in the business
  • What should you look for in speech-to-text
  • Common use cases for speech-to-text
  • Best practices for speech-to-text tools
  • A detailed breakdown of the best 9 tools

Let’s get started!

What is speech-to-text software?

Speech-to-text software is like having your own personal secretary who listens to the words you speak and instantly writes them down. Instead of typing everything out on your keyboard, you can just open your mouth and get talking. 

This type of software uses fancy AI with natural language processing (NLP) to translate your speech into text on the screen.

Pretty neat, huh? With speech recognition software, you can compose emails, write essays, fill out forms, update social media, and much, much more — just by talking. 

The options today are very advanced compared to even a few years ago. Many are over 95% accurate, can translate multiple languages, adapt to your voice and vocabulary over time, and some even come with voice commands so you can edit, punctuate, and format using speech alone. 

The best 9 speech-to-text software tools

Looking for the shortlist version? We’ve got your back: 

  • Lindy: Lindy is an all-purpose AI-powered virtual army with 99%+ accuracy speech-to-text recognition, effortlessly turning your spoken words into text.
  • Otter.ai: Otter Voice Notes is your go-to for effortless transcription of lectures, meetings, or important audio across Android and computers.
  • Apple Dictation: Apple Dictation provides a hands-free way to dictate text for messages, social media, or web searches on your iOS device.
  • Just Press Record: Just Press Record is a no-frills solution for easy recording of lectures, interviews, or meetings, offering offline transcription.
  • Windows 10 Speech Recognition: Control your Windows 10 computer and Cortana with your voice using the built-in speech recognition.
  • IBM Speech to Text: IBM Speech to Text offers powerful and customizable transcription that works seamlessly across multiple devices.
  • Speechnotes Pro: Speechnotes Pro is the perfect note-taking companion for students and professionals, allowing you to type, dictate, record, and sync with OneNote.
  • Transcribe: Transcribe provides a well-rounded speech-to-text experience with timed recordings, transcription tools, and cloud storage for easy access.
  • Braina Pro: Braina Pro delivers versatile voice control across various apps, along with a scheduler, memo manager, and other useful tools.

What should you look for in speech-to-text software? 

When evaluating speech-to-text tools, accuracy is obviously priority numero uno.  

Otherwise, do you really want to end up with a document that says, “Explode my client list” when you actually said, “Export my client list”?

  • Versatility matters. Can your software roll with the punches? We looked for speech-to-text tools that play nicely with different apps, systems, and whatever curveballs life throws at them.
  • Don't make me think too hard. Nobody wants to wrestle with a complicated interface. All the options here are easy to use; even your tech-challenged great-grandma could figure them out.
  • Lost in translation? Not here. Most of these tools offer a decent (or seriously impressive) range of languages, so you can go global with your audio creations.
  • Voice commands are awesome and necessary. Imagine telling your software to throw in some commas or capitalize a whole sentence. Dictation power moves, anyone?
  • Accuracy matters more than you think. Typos are the worst. These tools are all top-notch in the accuracy department, so your words come out just the way you intended.
  • Compliance (but in a good way). Looking for a tool that aligns with your professional needs? You’re going to need HIPAA-compliant (or similar) tools if you’re a doctor or therapist, for example. We threw in one of those.

Common use cases for speech-to-text software

Now you’re probably wondering, “What exactly can I use this for?” 

There are loads of practical use cases for speech-to-text tools:

  • Ditch the keyboard, doc: Medical professionals can streamline note-taking, transcribe patient consultations, and generally save their poor fingers from endless typing.
  • A good time to be a student (except for the debt): No more cramming in frantic note-taking sessions after lectures. You can turn any recording or speech note into text, easy-peasy.
  • Accessibility win: Speech-to-text tools can also help the hearing impaired by neatly transcribing the contents of speech with very few mistakes.
  • Go full multitasking: Emails, grocery lists, random ideas... dictate them all while driving, cooking, or folding laundry.
  • Let your author flag fly: Got a brilliant novel idea? Dictate your first draft while pacing around dramatically; it's the writer's way. The best AI-powered software may also pitch in with a few ideas of its own!

Best practices for speech-to-text tools

So, you’ve decided to give this whole speech-to-text thing a whirl, eh? Before you dive in, there are a few tips to keep in mind to make sure your experience goes as smooth as a Slip N’ Slide.

  • Don’t speak as if you were talking to a robot. It can be tempting to over-enunciate, but resist the urge. Speak clearly, but keep your normal speech rhythm and flow. Take normal pauses, and don’t try to cram it all into one breath.
  • Check before you sign off. Most tools will give you a chance to review and edit the text before saving it. Do a quick scan to make sure everything looks right. If it transcribed “anomaly” as “a llama,” you’ll want to catch that. Make minor corrections as needed. The more you review and correct, the more your program will learn your voice and get better at understanding you.
  • Use shorter voice commands. Many speech-to-text tools offer voice commands to help you navigate and edit your work. Get familiar with options like “start over,” “delete that,” “comma,” “period,” “new paragraph,” and “undo.” Using voice commands will save you time and frustration compared to manually correcting the text.
  • Learn how to punctuate out loud. It can feel silly at first, but say things like “period,” “question mark,” “exclamation point” and “comma” to properly punctuate your work. Your tool may allow for shortcut commands like “period, space” to end a sentence with proper spacing. If you don’t punctuate as you go, you’ll end up with a wall of text and have to go back and edit it all in. The best tools can add punctuation on their own, though you’ll have to review their input.
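
To make the punctuation tip concrete, here is a minimal Python sketch of what a tool might do behind the scenes when you say your punctuation out loud. The command list and the `apply_spoken_punctuation` helper are made up for illustration; real dictation apps define their own commands and handle this far more cleverly.

```python
import re

# Hypothetical mapping from spoken commands to symbols.
# Real dictation tools define their own command sets.
SPOKEN_PUNCTUATION = {
    "exclamation point": "!",
    "question mark": "?",
    "new paragraph": "\n\n",
    "comma": ",",
    "period": ".",
}


def apply_spoken_punctuation(transcript: str) -> str:
    """Replace spoken punctuation words in a raw transcript with symbols."""
    text = transcript
    # Handle the longest phrases first so multi-word commands match whole.
    for phrase in sorted(SPOKEN_PUNCTUATION, key=len, reverse=True):
        symbol = SPOKEN_PUNCTUATION[phrase]
        # Also swallow the space before the command: "hello comma" -> "hello,"
        pattern = r"\s*\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, lambda _match, s=symbol: s, text,
                      flags=re.IGNORECASE)
    # Remove spaces left over at the start of a new paragraph.
    text = re.sub(r"\n\n[ ]+", "\n\n", text)
    return text.strip()


print(apply_spoken_punctuation("hello comma how are you question mark"))
# hello, how are you?
```

The sketch is deliberately naive: it would also turn the “period” in “trial period” into a full stop, and it does not capitalize new sentences, which is exactly why reviewing the text afterward still matters.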

#1 Lindy

Lindy is not just a speech-to-text tool, it’s the overall best AI assistant tool out there.

Whether you're drafting emails, brainstorming ideas, or just need a break from the keyboard, Lindy can take a huge load off your back:

  • Over 99% accuracy: Lindy's AI engine is trained to understand natural language, minimizing those frustrating typos and misheard words, even if you’ve got an accent or speak in complex professional lingo.
  • It plays well with other tools: Works hand-in-hand with your favorite text editors, note-taking apps, and over 3000 productivity tools, no clunky workarounds required.
  • Supports 50+ languages: And you may be thinking “I have a difficult accent.” Not an issue with Lindy.
  • A time-saving miracle: Dictating is often way faster than typing, so you can get your thoughts down quickly and efficiently, potentially getting back hours every day.
  • Learns as you go: Lindy adapts to your unique speech patterns and vocabulary over time, improving accuracy with every use.
  • Safe and secure? Yes! If you’re a medical professional, Lindy has HIPAA and PIPEDA compliance to keep patient information under lock and key.
  • More than just talk-to-text: Lindy can generate summaries of your dictations, helping you quickly grasp the main takeaways without replaying everything.
  • Infinite potential: Lindy is an all-purpose tool that allows you to create “Lindies,” each tailored to a different task. The best part? These Lindies can talk to themselves. Imagine one summarizing your meetings while connecting with a scheduler Lindy, and automatically making a follow-up meeting!
  • Try out the 7-day free trial and then it’s just $49/mo.

Let's be real: This is only just a tiny use-case for Lindy, which excels at creating an army of interconnected AI assistants that can handle… well, just about anything you throw at them, really. 

#2 Otter.ai

Otter Voice Notes shines when you need to record lectures, meetings, or other important audio, then get it transcribed effortlessly.

  • Audio recording and easy transcription
  • Works on Android devices and computers for cross-platform use
  • Basic (Free): Limited minutes and features
  • Pro ($8.33 per month billed annually): Increased minutes, custom vocabulary, and more
  • Business (Contact for quote): Collaboration features for teams

Things to keep in mind:

The free version might have limitations for heavy users.

#3 Apple Dictation

Apple Dictation is the built-in solution for iOS users who want to dictate text for messages, social media, or web searches.

  • Hands-free control of your iOS device
  • Works with Siri for even more voice commands
  • Free (included with iOS devices)
  • Limited to Apple devices only

#4 Just Press Record

Need a no-frills solution for recording lectures, interviews, or meetings? Just Press Record does exactly what it says.

  • Easy one-button recording
  • Offline transcription
  • Adjustable playback speeds for review
  • One-time purchase of $4.99

Might lack features for users needing advanced transcription options.

#5 Windows 10 Speech Recognition

Windows 10 comes with built-in speech recognition, letting you control your computer with your voice.

  • Works with Cortana for extended commands
  • Control your Windows device hands-free
  • No additional software to install
  • Free (included with Windows 10)

Accuracy may vary based on your hardware and accent.

#6 IBM Speech-to-Text

IBM Speech to Text is a powerful solution for those who need accurate and versatile transcription. It boasts features for customization and works seamlessly across devices.

  • Accurate transcription with customizable models
  • Works across multiple devices for flexibility
  • Lite (Free): Limited usage
  • Standard ($0.02 per minute): Increased limits and features
  • Custom plans available for enterprise needs
  • Pricing is usage-based, so costs can vary

#7 Speechnotes Pro

Speechnotes Pro is designed with students and professionals in mind, offering a robust note-taking experience with seamless integration.

  • Type, dictate, and record all within the app
  • Syncs with OneNote for streamlined organization
  • Offers both online and offline functionality
  • One-time purchase (price varies slightly by platform)

Might require some setup for optimal OneNote integration.

#8 Transcribe 

Transcribe is great at providing a well-rounded speech-to-text experience with helpful tools and cloud integration.

  • Timed recordings for easy reference
  • Transcription tools for editing and accuracy
  • Cloud storage for cross-device access
  • Subscription options (weekly, monthly, yearly)
  • May offer a free trial period

Subscription-based pricing could be a factor for some users.

#9 Braina Pro

Braina Pro offers versatile speech recognition, giving you voice control across various apps.

  • Works with text, video, and photo apps
  • Includes a scheduler, memo manager, and other useful tools
  • Lifetime license: $79
  • Annual license: $49

Might have a steeper learning curve than simpler options.

And there you have it, folks — the best speech-to-text software options for 2024.  

Whether you're a student trying to take notes hands-free, a blogger pumping out articles at light speed, or an entrepreneur building a business without lifting a finger, these tools have got you covered. 

AI is rapidly advancing on its way to perfection, and these speech-to-text apps are only getting smarter, faster, and more accurate. 

Take Lindy for a spin with a 7-day free trial.

The Best (Free) Speech-to-Text Software for Windows

Looking for the best free speech-to-text software on Windows? We compare speech recognition options from Dragon, Google, and Microsoft.

The best speech-to-text software is Dragon Naturally Speaking (DNS), but it comes at a price. So how does it compare to the best of the free programs, like Google Docs Voice Typing (GDVT) and Windows Speech Recognition (WSR)?

This article compares Dragon against Google Docs Voice Typing and Windows Speech Recognition for three typical uses:

  • Writing novels.
  • Academic transcription.
  • Writing business documents like memos.

Comparing Speech Recognition Software: Dragon vs. Google vs. Microsoft

We will look at the nuances between the three below, but here's an overview of their pros and cons, which will help you make a decision quickly.

1. Dragon Speech Recognition

Dragon Naturally Speaking beats Microsoft's and Google's software in voice recognition.

DNS scores 10% better on average compared to both programs. But is Dragon Naturally Speaking worth the money?

It depends on what you're using it for. For seamless, high-accuracy writing that will require little proof-reading, DNS is the best speech-to-text software around.

2. Windows Speech Recognition

If you don't mind proofreading your documents, WSR is a great free speech-recognition software.

On the downside, it requires that you use a Windows computer. It's also only about 90% accurate, making it the least accurate out of all the voice recognition software tested in this article.

However, it's integrated into the Windows operating system, which means it can also control the computer itself, such as shutdown and sleep.

3. Google Docs Voice Typing

Google Docs Voice Typing is highly limited in how and where you use it. It only works in Google Docs, in the Chrome Browser, and with an internet connection.

But it offers several options on mobile devices. Android smartphones have the ability to transcribe your voice to text using the same speech-to-text engine that also works with Google Keep or Live Transcribe.

And while Dragon Naturally Speaking offers a mobile app, it's treated as a separate purchase from the desktop client.

Dragon and Microsoft work in any place you can enter text. However, WSR can execute control functions whereas Dragon is mostly limited to text input.

Download: Live Transcribe for Android (Free)

Speech-to-Text Testing Methods

In order to test the accuracy of the dictation with the tools, I read aloud three texts:

  • Charles Darwin's "On the Tendency of Species to Form Varieties"
  • H.P. Lovecraft's "Call of Cthulhu"
  • California Governor Jerry Brown's 2017 State of the State speech

When a program miscapitalized a word, I marked the text in blue in the right column (see graphic below). When it got a word wrong, I marked the wrong word in red. I did not count wrong capitalizations as errors.

I used a Blue Yeti microphone, which is the best microphone for podcasting, and a relatively fast computer. However, you don't need any special hardware. Any laptop or smartphone transcribes speech as well as a more expensive machine.
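
For anyone who wants to repeat this kind of test, a scoring script along the following lines would do the job. This is a minimal sketch, not the exact procedure behind the results in this article: it assumes accuracy is measured as one minus the word error rate (word-level edit distance divided by the number of words in the original text) and, like the test described here, it ignores capitalization. The function names are illustrative.

```python
import re


def words(text: str) -> list[str]:
    """Lowercase and split into words so capitalization is never an error."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_accuracy(reference: str, transcript: str) -> float:
    """Return 1 - (word edit distance / reference length), floored at 0."""
    ref, hyp = words(reference), words(transcript)
    if not ref:
        raise ValueError("reference text has no words")
    # Dynamic-programming edit distance computed over words, row by row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the transcript
                curr[j - 1] + 1,     # extra word in the transcript
                prev[j - 1] + cost,  # match or wrong word
            ))
        prev = curr
    errors = prev[-1]
    return max(0.0, 1 - errors / len(ref))


# One wrong word out of four gives 75% accuracy.
print(word_accuracy("Export my client list", "Explode my client list"))  # 0.75
```

Scoring by edit distance rather than comparing word positions one by one means a single dropped or inserted word counts as one error instead of throwing off everything after it.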

Test 1: Dragon Naturally Speaking Speech-to-Text Accuracy

Dragon scored 100% on accuracy on all three sample texts. While it failed to capitalize the first letter on every text, it otherwise performed beyond my expectations.

While all three transcription suites do a great job of accurately turning spoken words into written text, DNS comes out way ahead of its competitors. It even successfully understood complicated words such as "hitherto" and "therein".

Test 2: Google Docs Voice Typing Speech-to-Text Accuracy

Google Docs Voice Typing made many more errors than Dragon. GDVT got 93.5% right on Lovecraft, 96.5% correct for Brown, and 96.5% for Darwin. Its average accuracy came out to around 95.2% for all three texts.

On the downside, it automatically capitalized a lot of words that didn't need capitalization. It seems the engine also hasn't improved in accuracy since I last tested GDVT three years ago.

Test 3: Microsoft Windows Speech Recognition Speech-to-Text Accuracy

Microsoft's Windows Speech Recognition came in last. Its accuracy on Lovecraft was 84.3%, although, unlike GDVT, it did not miscapitalize any words. For Brown's speech, it got its highest accuracy rating of around 94.8%, putting it roughly on par with GDVT.

For Darwin's book, it managed to get a similarly high score of 93.1%. Its average accuracy across all texts came out to 89%.

Are Free Transcription Services Worth Using?

  • Dragon Naturally Speaking got a perfect 100% accuracy for voice transcription.
  • Microsoft's free voice-to-text service, Windows Speech Recognition, scored 89% accuracy.
  • Google Docs Voice Typing got a total score of 95.2% accuracy.

However, there are some major limitations to free speech-to-text options you should always keep in mind.

GDVT only works in the Chrome browser. On top of that, it only works for Google Docs. If you need to enter something in a spreadsheet or in a word processor other than Google Docs, you are out of luck.

Our test results indicate it is more accurate than WSR, but you have to keep in mind that it only works in Chrome for Google Docs. And you will always need an internet connection.

WSR can make you more productive with its hands-off computer automation features. Plus, it can enter text. Its accuracy is the weakest out of the services that I tested.

That said, you can live with its misses if you are not a heavy transcriber. It's on par with Google Docs Voice Typing but limited to Windows.

For most users, the free options should be good enough. However, for all those who need high levels of transcription accuracy, Dragon Naturally Speaking is the best option around. As an occasional user, if you need a free service, Google Docs Voice Typing is a viable alternative.

These tools prove that your voice can make you more productive. Now, try out Google Voice Assistant, which is the best voice-control assistant you can use right now to manage everyday tasks.

Plus, be sure to check out these free online services to download text to speech as MP3.

The best speech-to-text software for 2022

If you’re looking to take your productivity up a notch (or if you’re just a really slow typist), the best speech-to-text software is a sure way to do it. The idea is pretty simple: You speak, and the software detects your words and converts them into text format. The applications are nearly endless, from dictating thoughts and jotting down notes to creating long-form documents without having to type a word yourself. Yet despite this, not many businesses and professionals are taking full advantage of what speech-to-text software can give them.

The good news is that the best speech-to-text software doesn’t have to cost an arm and a leg — or anything at all, depending on your needs. There’s a handful of noteworthy services out there, though, and selecting the right one is important. That’s where we come in. Below, we’ve rounded up the best speech-to-text software platforms out there, with our picks covering a wide spectrum of platforms, features, and price points.

Dragon Anywhere

  • Price: $15 per month or $150 per year
  • Free Trial: Yes
  • Platforms: iOS, Android
  • Voice editing and formatting
  • Cloud-based storage and file sharing
  • AI learning adapts to your speech

If you’re already somewhat familiar with the best speech-to-text software then there’s a good chance you’ve heard of Dragon. Dragon Anywhere is a dedicated mobile speech-to-text app that delivers a high degree of accuracy thanks to its industry-leading speech recognition software that can adapt to your own speech patterns. In other words, Dragon Anywhere can actually learn how you speak, right down to your sentence cadence and word pronunciation. On the off chance that it does make a mistake, you can edit and format using just your voice. Dragon Anywhere also allows for continuous dictation with no word limits or length cut-offs, and your text documents are stored in the cloud for easy access and sharing with colleagues when you need to.

Dragon Anywhere is by far the best speech-to-text software for mobile users, given that it’s designed entirely for use on iOS and Android devices, making it the ideal choice for translators, lawyers, accountants and other professionals who need to turn spoken dialog into written notes. It’s a bit like having a virtual stenographer. Plus, it’s useful for anybody else who wants to be able to “jot” things down hands-free. Its cloud-based sharing makes Dragon Anywhere great for group work, too.

Dragon Anywhere is a paid service with monthly and yearly subscription plans. You can pay on a monthly basis for $15, although if you like the service, then the $150 annual subscription is a better value (basically getting you two months free each year). If you want to give it a try first, there is a free one-week Dragon Anywhere trial available as well. There are Dragon software suites available for business users on Windows, and Dragon Anywhere syncs with them seamlessly. You also get a Dragon Anywhere subscription at no additional cost — a $150 value — with the Dragon Home and Dragon Professional desktop versions, which might be a better value depending on your needs.

Amazon Transcribe

  • Price: Starts at $0.024 per minute
  • Free Trial: Yes, Free Tier provides 60 audio minutes monthly for the first 12 months
  • Platforms: Most devices with a microphone
  • HIPAA-eligible and compatible with electronic health record systems
  • Integrates with AWS cloud services
  • Call Analytics extracts data and insights from customer interactions
If you need a more enterprise-grade solution, then Amazon Transcribe is one of the best speech-to-text software services for businesses large and small. It’s designed to integrate seamlessly with Amazon Web Services, so if your website and/or company already uses any of these, then setup should be a breeze. You can create text documents, transcribe conversations and videos, translate speech, and more. What really sets Amazon Transcribe apart from other speech-to-text apps (aside from its AWS integration) is its bevy of great features tailored for professional environments.

For instance, its Call Analytics feature can automatically extract useful insights from customer interactions, allowing you to tune and tailor your customer service. It’s also HIPAA-eligible and compatible with electronic health record systems for easy uploading and management of medical transcriptions and other patient data. Amazon Transcribe is purpose-built for businesses, especially larger enterprises (not to mention organizations such as hospitals), which should come as no surprise given its integration with Amazon Web Services.

Compared to other dictating software, Amazon Transcribe’s pricing structure is somewhat unique in that its monthly subscription fee is based on how many audio minutes you use, with plans starting at $0.024 per minute and scaling down in price per minute for the higher tiers. If you’re looking for the best speech-to-text software for professional business applications, Amazon Transcribe is hard to beat.
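
To get a feel for usage-based pricing, the figures above can be plugged into a quick estimate. This Python sketch assumes a flat $0.024 per minute with the 60 free minutes deducted first; it ignores the cheaper higher-volume tiers and the 12-month limit on the free tier, and the function name is just for illustration.

```python
def estimate_monthly_cost(audio_minutes: float,
                          rate_per_minute: float = 0.024,
                          free_minutes: float = 60.0) -> float:
    """Rough monthly bill in dollars: minutes beyond the free allowance times the rate."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes cannot be negative")
    billable_minutes = max(0.0, audio_minutes - free_minutes)
    return round(billable_minutes * rate_per_minute, 2)


# 1,000 minutes of audio: 940 billable minutes at $0.024 each.
print(estimate_monthly_cost(1000))  # 22.56
# Staying within the free allowance costs nothing.
print(estimate_monthly_cost(30))    # 0.0
```

Passing `free_minutes=0` gives the picture once the free tier has run out.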

Braina

  • Price: $79 for yearly subscription, $200 for lifetime
  • Free Trial: Yes, basic free plan available
  • Platforms: Windows; companion app available for iOS and Android
  • Understands more than 100 languages
  • Acts as a virtual assistant for your PC
  • Remote PC control through Android or iOS mobile devices

If Dragon and Amazon Transcribe are overkill for your needs, Braina is one of the best speech-to-text software suites for individual users. We named it the best multipurpose program in our roundup of the best dictation software, as Braina can be considered more of a virtual assistant for your PC rather than a simple speech-to-text app. Think of it as being much like Siri or Alexa, but more focused on productivity (and much more powerful and versatile in this regard) while being also capable of excellent speech-to-text functions thanks to its impressive speech recognition A.I. that understands more than 100 languages.

If you feel like you could use a hand around the office but don’t want to actually hire a personal assistant, Braina might be worth a go. It’s one of the best speech-to-text software choices for small businesses, home offices, and individual users thanks to its excellent speech recognition capabilities and other features. Perform internet searches, dictate documents, translate different languages, record calls and meetings, set alarms and calendar reminders, sort through your files — you name it. Braina’s companion app even lets you do everything remotely via your iOS or Android phone or tablet when you’re away from your computer.

One major drawback of Braina is that the core software only works on Windows, the aforementioned iOS and Android companion app notwithstanding. On the plus side, multiple people can use Braina without having separate accounts or subscriptions, which is a nice change of pace from most subscription-based software suites. There is a basic free plan available as well. If you want to unlock the full set of features, though, such as non-English language compatibility, then Braina will set you back $79 yearly or $200 for a lifetime key.
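
Whether the lifetime key pays off comes down to simple arithmetic. Here is a rough Python sketch, assuming prices stay at the $79 and $200 quoted above (the helper name is hypothetical):

```python
import math


def breakeven_years(annual_price: float, lifetime_price: float) -> int:
    """Smallest whole number of years at which yearly payments reach the lifetime price."""
    if annual_price <= 0:
        raise ValueError("annual_price must be positive")
    return math.ceil(lifetime_price / annual_price)


# Two years of subscription cost $158, three years cost $237.
print(breakeven_years(79, 200))  # 3
```

With those prices, the lifetime key works out cheaper for anyone who expects to use Braina for three years or more.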

Google Docs Voice Typing

  • Price: Free
  • Platforms: Windows, Mac, and Linux (browser-based)
  • If you have a Google account, you already have it
  • Automatically converts text into document format
  • Cloud-based

You might already have access to one of the best speech-to-text software apps without even knowing it, as Google Docs has one built right in. Google’s browser-based word processor (part of the broader Google Drive suite of cloud-based office software) includes a Voice Typing feature, and if you have a Google account and a working mic, then you’re already set up to use it. You don’t have to pay a cent for it, either, and for free software, it’s pretty good, although it naturally lacks many of the advanced features and dictation functions of the best speech-to-text software we outlined above.

Google Docs Voice Typing is very simple: You speak into your microphone, and Google Docs dumps the text into a document. It costs nothing to use, so if you’re on the fence about whether you need speech recognition at all, then Google Docs Voice Typing is a free way to try it out before you shell out any cash for any of the best speech-to-text software suites that you have to pay for. Voice Typing is great for those who just need basic dictating software without the bells and whistles offered by paid services, as well.

Since Google Docs is browser-based, you shouldn’t have to worry about platform compatibility. It’s naturally best for use on a computer rather than a mobile device; that said, you can really use it on any device with a microphone and access to Google Docs. Everything you do with Google Docs Voice Typing is automatically stored on the cloud, too, just like any other document you’d create or edit using Google Docs. The Google Drive cloud also makes it easy to share your transcriptions with friends and colleagues if you want.

Lucas Coll


The Best Speech-to-Text Apps and Tools for Every Type of User

You don't need to use your fingers when you can type by talking with the best dictation software we've tested. It's fast, easy, and helps people who otherwise can't type.

Justin Pot

Typing isn't easy or even possible for everyone, which is why you might prefer to talk. Speech-to-text software, also sometimes called dictation software, makes it possible, by turning what you say into typed text.

Speech-to-text software is different from voice control software, although some apps do both. Voice control is the accessibility feature that lets you open programs, select on-screen options, and otherwise control your device using only your voice. Both macOS and Windows have voice control included. It's called Voice Control on macOS and Speech Recognition in Windows.

Don't confuse speech-to-text software with transcription software, either, even if the categories overlap. Transcription software is typically for transcribing meetings or recordings, sometimes of multiple people, and generally after the fact. Dictation software, meanwhile, is a way to use your voice to type in real time. You talk to your computer or mobile device and immediately see the words on the screen. You can add punctuation by saying the name of the punctuation out loud: for example, "period," "comma," or "open quote" and "end quote."
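To make the spoken-punctuation idea concrete, here is a minimal Python sketch of how a dictation tool might turn punctuation names into symbols. The command table and the function name `apply_spoken_punctuation` are assumptions made up for illustration; each real product defines its own command vocabulary and handles far more cases.

```python
import re

# Hypothetical command table for illustration only; real dictation tools
# each ship their own (much larger) vocabulary of spoken commands.
SPOKEN_PUNCTUATION = {
    "period": ".",
    "comma": ",",
    "question mark": "?",
    "open quote": "\u201c",
    "end quote": "\u201d",
}

def apply_spoken_punctuation(transcript: str) -> str:
    """Replace spoken punctuation names with symbols and tidy the spacing."""
    # Try longer command names first so multi-word commands match as a unit.
    names = sorted(SPOKEN_PUNCTUATION, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    text = pattern.sub(lambda m: SPOKEN_PUNCTUATION[m.group(1).lower()], transcript)
    # Closing marks attach to the word before them.
    text = re.sub(r"\s+([.,?\u201d])", r"\1", text)
    # An opening quote attaches to the word after it.
    text = re.sub(r"(\u201c)\s+", r"\1", text)
    return text

print(apply_spoken_punctuation("hello comma world period"))
```

A naive table like this also rewrites the word "period" when you mean it literally, which is one reason real tools lean on context or on an explicit command mode.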

Speech-to-text features or apps also should not be confused with text-to-speech tools, sometimes known as screen readers, which read text on the screen to you aloud.

Most people don't need to install software to dictate text to their computer or phone. That's because every major operating system has a speech-to-text feature built in, and they work about as well as anything else on the market. Here we point out where to find these features on your device, and talk about a powerful commercial product with more features, should you need to do more with a speech-to-text tool than the built-in options offer.


Speech (Windows)

Windows Speech, often referred to as voice typing, was among the most accurate tools I tested for this article. Both Windows 10 and Windows 11 come with Speech, which you can try out using the keyboard shortcut Windows Key-H anywhere you can type. Up pops a window with a microphone icon. Tap the microphone and start talking. Text shows up more or less in real time. 

You can add punctuation manually using commands, or you can try the experimental auto-punctuation feature. As a writer, I prefer adding punctuation manually (I'm pretty particular about my punctuation), but the automated feature worked fairly well and I could imagine it being good enough for some people. See our complete guide to learn more about using speech recognition and dictation in Windows.


Dictate (Microsoft Office)

You can dictate text in Microsoft Office by clicking the prominent Dictate button in all versions of Word, PowerPoint, OneNote, and Outlook. This brings the excellent engine Microsoft offers all Windows users, complete with the auto-punctuation feature, to just about every major operating system: the web, Android, iOS, and macOS versions of Office all include this dictation feature. It's great news if you use one of those systems and don't love the built-in speech-to-text engine.


Dictation (macOS)

Apple has included Dictation in macOS since 2012. To enable the feature, head to System Settings > Keyboard and scroll down to Dictation, where you can also set a keyboard shortcut. Newer Macs have a dedicated function key that looks like a microphone (F5) to enable and disable dictation in the top row of the keyboard. The speech detection is very accurate and shows up in near real time. You can add punctuation with spoken commands. Potentially incorrect words are underlined in blue after you're done with dictation, and you can right-click or Control-click on them to see other potential options, similar to how spellcheck works. Note that Apple silicon Macs can do dictation for the most common languages offline, whereas Intel Macs send audio to Apple servers for processing.

Dictation (Mobile)


Dictation (iOS)

If you use the default keyboard on the iPhone and iPad, there's a microphone icon to the left of the space bar (as shown in the image) or sometimes below the space bar on the right side, that you can tap to use dictation. It works almost exactly the same as on macOS. Tap that microphone key and a microphone icon will show up next to your cursor. Start talking and your text will appear. You can add punctuation and formatting using spoken commands, just like on the Mac. The text recognition is accurate, the same as on the Mac.


Android's default keyboard, Gboard, also has a built-in dictation feature. Tap the microphone in the top-right corner of the keyboard and start talking. It works in any Android app where you can type text, and the recognition is quite accurate. You can add punctuation with spoken commands, like saying "comma" and "period," just like on other systems.

Google Docs Voice Typing


Voice Typing (Google Docs)

Google Docs has a built-in dictation feature called Voice Typing. Google says it only works if you're using the Chrome browser, but by observation it works in Microsoft Edge and perhaps other Chromium-based browsers. Click Tools > Start voice typing and a large microphone icon appears, which you can click to start talking. Punctuation and formatting are handled by voice commands. Recognition works about as well as Gboard, which makes sense, since they're likely using the exact same engine.

Dragon Professional


Dragon Professional 16

Dragon is one of the most sophisticated speech-to-text tools. You use it not only to type using your voice but also to operate your computer with voice control. Dragon Professional, the most general version, isn't cheap at $699. A mobile-only version, Dragon Professional Anywhere, is a $15-per-month subscription with a one-week free trial. Additional versions of the software are available for use by legal, health care, and law enforcement professionals, with a focus on understanding the specialized language in those sectors. If you need a business-grade speech-to-text tool that's more powerful than the default software that comes with your operating system, Dragon is worth looking into.


About Justin Pot

Justin Pot believes technology is a tool, not a way of life. He writes tutorials and essays that inform and entertain. He loves beer, technology, nature, and people, not necessarily in that order. Learn more at JustinPot.com.


The Best Dictation Software

A person in front of a MacBook computer and a microphone using dictation software.

By Kaitlyn Wells

Dictation software makes it easy to navigate your computer and communicate without typing a single phrase.

This flexibility is great if you simply need a break from your keyboard, but it’s especially important for people with language-processing disorders or physical disabilities. Firing off a quick text or typing a memo can be difficult—or even totally infeasible—if you have limited hand dexterity or chronic pain, but this kind of software can make such tasks a relative breeze.

After considering 18 options, we’ve found that Apple Voice Control and Nuance Dragon Professional v16 are more accurate, efficient, and usable than any other dictation tools we’ve tested.

Everything we recommend


Apple Voice Control

The best dictation tool for Apple devices.

Apple’s Voice Control is easier to use and produces accurate transcriptions more frequently than the competition. It also offers a robust command hub that makes corrections a breeze.


Upgrade pick.


Nuance Dragon Professional v16

The best dictation tool for Windows PCs.

Dragon Professional v16 is the most accurate dictation tool we tested for any operating system—but its hefty price tag is a lot to swallow.

But the technology behind dictation software (also called speech-to-text or voice-recognition software) has some faults. These apps have difficult learning curves, and the inherent bias that humans program into them means that their accuracy can vary, especially for people with various accents, sociolects and dialects like African American Vernacular English, or speech impediments. Still, for those able to work within the technology’s constraints, our picks are the best options available for many people who need assistance using a word-processing tool.

Apple’s Voice Control comes installed with macOS, iOS, and iPadOS, so it’s free to anyone who owns an Apple device. In our testing, it produced accurate transcriptions most of the time, especially for speakers with standard American accents. Competing tools from Google and Microsoft averaged 15 points lower than Apple’s software in our accuracy tests. Among our panel of testers, those with limited hand dexterity loved Voice Control’s assistive-technology features, which made it easy to navigate the OS and edit messages hands-free.

But while the experience that Voice Control provides was the best we found for Apple devices, it often misunderstood words or entire phrases spoken by testers with regional or other American accents or speech impediments such as stutters. Although such accuracy issues are expected for speech-recognition modeling that has historically relied on homogenous data sources, other tools (specifically, Nuance Dragon Professional v16, which is available only for Windows) performed slightly better in this regard. Apple’s tool may also lag slightly if you’re running multiple processor-intensive programs at once, which our panelists said slowed their productivity.

At $700, Nuance Dragon Professional v16 is the most expensive speech-recognition tool we’ve found, but it’s the best option for people who own Windows PCs. Professional v16 replaces our previous Windows PC pick, the now-discontinued Nuance Dragon Home 15. It offers added functionality for those working in finance, healthcare, and human services, and it is probably overkill for most people. (If you need a free PC option, consider Windows Voice Recognition, but know it has significant flaws.)

Like its predecessor, Professional v16 involves a learning curve at first, but the Dragon tutorial does a great job of getting you started. Our panelist with language-processing disabilities said Dragon was one of the most accurate dictation options they tried, and the robust command features made it possible for them to quickly navigate their machine. Like our Apple pick, Dragon had trouble with various American dialects and international accents; it performed better for those testers with “neutral” American accents. It also struggled to eliminate all background noise, though you can mitigate such problems by using an external microphone or headset. Although Dragon produced the fastest transcriptions of any tool we tested, this wasn’t an unqualified positive: Half of our panelists said that they preferred slower real-time transcriptions to Dragon’s sentence-by-sentence transcription method because they found its longer pauses between sentences’ appearance on their screen to be distracting.

The research

  • Why you should trust us
  • Who this is (and isn’t) for
  • How we picked and tested
  • The best dictation tool for Apple devices: Apple Voice Control
  • The best dictation tool for Windows PCs: Nuance Dragon Professional v16
  • Other good dictation software
  • How to use dictation software
  • Should you worry about your privacy when using dictation software?
  • The competition

As a senior staff writer at Wirecutter, I’ve spent five years covering complex topics, writing articles focusing on subjects such as dog DNA tests, blue-light-blocking glasses, email unsubscribe tools, and technology-manipulation tactics used by domestic abusers. I was an early adopter of dictation software back in the early aughts, with a much less polished version of Nuance’s Dragon software. Like other people I interviewed for this guide, I quickly abandoned the software because of its poor performance and difficult learning curve. Since then, I’ve occasionally used dictation and accessibility tools on my devices to send quick messages when my hands are sticky from baking treats or covered in hair product from my morning routine. While writing this guide, I dictated about a third of the text using the tools we recommend.

But I’m not someone who is dependent on dictation tools to communicate, so I consulted a variety of experts in the AI and disability communities to better understand the role that this kind of software plays in making the world more accessible for people with disabilities. I read articles and peer-reviewed studies, I browsed disability forums that I frequent for advice on my chronic pain, and I solicited input from affinity organizations to learn what makes a great dictation tool. And I brushed up on the latest research in AI technology and voice-recognition bias from Harvard Business Review, the Stanford University Human-Centered Artificial Intelligence Institute, and the University of Illinois Urbana-Champaign Speech Accessibility Project, among others.

I also chatted with Meenakshi Das, a disability advocate and software engineer at Microsoft, and Diego Mariscal, CEO of the disabled-founders startup accelerator 2Gether-International, about the limitations of dictation tools for people with various disabilities. I discussed the ethics of artificial intelligence with Princeton University PhD candidate Sayash Kapoor. I attended a lecture by Kapoor’s advisor, Arvind Narayanan, PhD, entitled “The Limits Of The Quantitative Approach To Discrimination.” I spoke with Christopher Manning, co-director of the Stanford Institute for Human-Centered Artificial Intelligence at Stanford University, about the evolution of dictation software. And I consulted with Wirecutter’s editor of accessibility coverage, Claire Perlman, to ensure that my approach to this guide remained accessible, nuanced, and reflective of the disability community’s needs.

Lastly, I assembled a testing panel of nine people with varying degrees of experience using dictation software, including several with disabilities ranging from speech impediments to limited hand dexterity to severe brain trauma. Our testers also self-reported accents ranging from “neutral” American to “vague” Louisianan to “noticeable” Indian.

Assistive technology such as speech-to-text tools can help you do everything from sending hands-free texts while driving to typing up a term paper without ever touching your keyboard.

We wrote this guide with two types of users in mind: people with disabilities who rely on dictation software to communicate, and people with free use of their hands who occasionally use these tools when they need to work untethered from their keyboard. However, we put a stronger focus on people with disabilities because dictation software can better serve that population and can ultimately make it easier for them to access the world and communicate.

Users with limited or no hand dexterity, limb differences, or language-processing challenges may find speech-recognition software useful because it gives them the freedom to communicate in their preferred environment. For example, our panelists with learning disabilities said they liked to mentally wander or “brain dump” while using voice-recognition software to complete projects, and they felt less pressure to write down everything perfectly the first time.

Still, our approach had limits: We focused on each tool’s ability to integrate with and edit text documents, rather than to verbally navigate an entire computer screen, which is a feature that some people with cerebral palsy, Parkinson’s disease, quadriplegia, and other neurological disabilities need—especially if they have no speaking issues and limited or no motor control. Our picks offer some accessibility features, such as grid navigation, text editing, and voice commands, that make using devices easier, but not everyone who tested the software for us used those features extensively, and the majority of voice-recognition software we considered lacks these premium options.

Aside from the absence of accessibility features, there are other potential hindrances to these software programs’ usefulness, such as how well they work with a range of accents.

The biases of dictation software

Speech-recognition software became increasingly available in the 1980s and 1990s, with the introduction of talking typewriters for those with low vision, commercial speech-recognition software, and collect-call processing, according to Christopher Manning, co-director of the Stanford Institute for Human-Centered Artificial Intelligence. But “speech recognition used to be really awful,” he said. “If you were an English-Indian speaker, the chances of it [understanding you] used to be about zero; now it’s not that bad.”

As we found in our tests, an individual’s definition of “bad” can vary widely depending on their accent and their speaking ability. And our AI experts agreed that the limitations of the natural language processing (NLP) technology used in dictation software are laid bare when faced with various accents, dialects, and speech patterns from around the world.

Sayash Kapoor, a second-year PhD candidate studying AI ethics at Princeton University, said that NLP tools are often trained on websites like Reddit and Wikipedia, making them biased against marginalized genders and people from Black, indigenous, and other communities of color. The end result is that most dictation software works best with canonical accents, said Manning, such as British and American English. Our experts told us that some speech-to-text tools don’t have fine-grain modeling for different dialects and sociolects, let alone gender identity, race, and geographic location.

In fact, one study found that speech-to-text tools by Amazon, Apple, Google, IBM, and Microsoft exhibited “substantial racial disparities,” as the average word-error rate for Black speakers was nearly twice that of white speakers. This limitation affects not only how easily speakers can dictate their work but also how effectively they can correct phrases and give formatting commands, which makes all the difference between a seamless and a painful user experience.
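For readers unfamiliar with the metric, word-error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the tool's output into the reference text, divided by the number of words in the reference. The Python sketch below shows that standard calculation. It is illustrative only: the function name is ours, the lowercasing and whitespace splitting are simplifying assumptions, and it is not the scoring code used by the study or by any tool mentioned here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Simplifying assumption: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # a reference word is missing from the output
            insertion = current[j - 1] + 1  # the output has an extra word
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)

# One wrong word out of four gives a word-error rate of 0.25.
print(word_error_rate("the quick brown fox", "the quick brown box"))
```

Because insertions count as errors, the rate can exceed 1.0 when a tool produces many more words than were spoken.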

Inherent bias in speech-recognition tools extends to speech impediments, as well. Wirecutter approached several people with stutters or other types of speech and language disabilities, such as those resulting from cerebral palsy or Parkinson’s disease, about joining our panel of testers. But most declined, citing a history of poor experiences with dictation tools. Disability advocate Meenakshi Das, who has a stutter, said she doesn’t use any speech-to-text tools because more work needs to be done industry-wide to make the software truly accessible. (Das is a software engineer at Microsoft, which owns Nuance, the company that produces our pick for Windows PCs.)

Both Das and Kapoor have noticed a trend of accelerators working to close the bias gap for people with accents, speech impediments, and language-processing disabilities in order to make it possible for those groups to use dictation tools. In October 2022, for example, the University of Illinois announced a partnership with Amazon, Apple, Google, Meta, Microsoft, and nonprofits on the Speech Accessibility Project to improve voice recognition for people with disabilities and diverse speech patterns.

But until truly inclusive speech-to-text tools arrive, people in those underserved groups can check out our advice on how to get the most out of the software that’s currently available.

We solicited insights on speech-to-text tools from our experts and read software reviews, peer-reviewed studies, disability forums, and organization websites to learn what makes a great dictation tool.

We identified 18 dictation software packages and compared their features, platform compatibility, privacy policies, price, and third-party reviews. Among the features we looked for were a wide variety of useful voice commands, ease of navigation, the presence of customizable commands and vocabulary, multi-language support, and built-in hint tools or tutorials. Those programs that ranked highest on our criteria, generally offering a mix of robust features and wide platform availability, made our short list for testing:

  • Apple Dictation (macOS, iOS, iPadOS)
  • Apple Voice Control (macOS, iOS, iPadOS)
  • Google Assistant on Gboard
  • Google Docs Voice Typing
  • Microsoft Word Dictate
  • Nuance Dragon Home 15 (discontinued)
  • Windows Voice Recognition
  • Windows Voice Typing

We defaulted these tools to the American English setting and rotated using each tool for a couple of hours on our computers and mobile devices. Afterward, we graded their performance on accuracy, ease of use, speed, noise interference, and app compatibility. We placed an emphasis on accuracy rates, performing a series of control tests to see how well the dictation tools recognized 150- to 200-word samples of casual speech, the lyrics of Alicia Keys’s song “No One,” and scientific jargon from a peer-reviewed vaccine study. From there, we advanced the dictation tools with the highest marks to our panel-testing round.
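If you want to run a similar control test at home, one simple approach is to read a known passage aloud and then list the places where the tool's output differs from it. The Python sketch below does this with the standard library's `difflib`. It is a rough illustration under our own assumptions (lowercased, whitespace-split words, and a hypothetical function name), not a description of how the accuracy figures in this guide were calculated.

```python
from difflib import SequenceMatcher

def transcription_mismatches(reference: str, transcript: str) -> list[tuple[str, str]]:
    """Return (expected, heard) pairs for each span where the transcript differs."""
    # Simplifying assumption: compare lowercased, whitespace-separated words.
    ref_words = reference.lower().split()
    out_words = transcript.lower().split()
    matcher = SequenceMatcher(a=ref_words, b=out_words, autojunk=False)

    mismatches = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            expected = " ".join(ref_words[i1:i2])  # empty if the tool added words
            heard = " ".join(out_words[j1:j2])     # empty if the tool dropped words
            mismatches.append((expected, heard))
    return mismatches

reference = "please send the report by friday"
transcript = "please send the resort by friday"
for expected, heard in transcription_mismatches(reference, transcript):
    print(f"expected {expected!r}, heard {heard!r}")
```

Seeing which words go wrong, rather than only how many, makes it easier to spot patterns such as contractions, names, or jargon that a particular tool tends to miss.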

Nine panelists tested our semifinalists over the course of three weeks. Our diverse group of testers included those with disabilities ranging from speech impediments to limited hand dexterity to severe brain trauma. They self-reported accents ranging from American to Catalan to Indian. All the panelists had varying degrees of prior experience with dictation software.

Meet our testers:

  • Aum N., 34, who works in quality assurance and has an Indian accent
  • Ben K., 41, an editor with a “moderate” stutter and a “standard” American accent
  • Chandana C., 64, an analyst with a “noticeable” Indian accent
  • Claire P., 31, an editor with a musculoskeletal disability called arthrogryposis
  • Davis L., 27, an audio producer with a “vague” Louisianan accent
  • Franc C. F., 38, a software engineer from Spain
  • Juan R., 52, who survived a car accident that caused severe brain trauma and now has limited short-term memory and limited reading comprehension
  • Polina G., 49, an engineering manager with ADHD
  • Vicki C., 33, a software engineer with a shoulder injury and repetitive stress injury

The panelists sent text messages, drafted emails, and coded software using the various speech-to-text tools, after which they provided extensive notes on their experiences and identified which tools they would feel comfortable using regularly or purchasing on their own.

To arrive at our picks, we combined the panelists’ experiences with the results of our control round, as well as recommendations from our experts.

Screenshot of a Microsoft Word document with text transcribed using Apple Voice Control.

  • Price: free
  • Operating system: macOS, iOS, iPadOS
  • Supported languages: 21 to 64 languages, depending on the settings, including Hindi, Thai, and several dialects of English and Italian

Apple Voice Control is easy to use, outperforms major competitors from Google, Microsoft, and Nuance, and offers dozens of command prompts for a smoother experience, an especially helpful feature for people with limited hand dexterity. Because Voice Control is deeply integrated into the Apple ecosystem, it’s more accessible than many of the other tools we tested. It’s available for free in macOS, iOS, and iPadOS; you can activate it by going to Settings > Accessibility on your preferred device. Once you activate it, you may notice that it works similarly to the Dictation and Siri functions on your phone. That’s because they use the same speech-recognition algorithms. This means the learning curve inherent to all speech-to-text tools is marginally less difficult with Voice Control, particularly if you’ve used Dictation or Siri before, as they’re already familiar with your speech patterns. (If you’re wondering how Dictation and Voice Control differ, Dictation is a speech-to-text tool that omits the various accessibility and navigation functions of Voice Control.)

In our tests, Voice Control routinely produced more accurate transcriptions than the competition, including Nuance Dragon, Google Docs Voice Typing, and Windows Voice Recognition. In our control tests, it was 87% accurate with casual, non-accented speech. Comparatively, Dragon was 82% accurate, while Windows Voice Recognition was only 64% accurate. Google Docs Voice Typing performed on a par with Voice Control, but it failed at transcribing contractions, slang, and symbols much more frequently. Most of the tools we tested, Voice Control included, were about 10% less accurate during our jargon-rich control tests that included scientific words from an immunology study. (One notable exception in this regard was Dragon, which showed no noticeable drop-off with more technical language.)

Chart comparing Apple Voice Control transcriptions with the original lyrics of a song.

Half of our testers agreed that they would regularly use Voice Control, and that they would even pay for it if they relied on dictation software. Specific words they used to describe the software included “accurate,” “good,” and “impressive.” Still, our real-world tests pushed Voice Control to its limits, and the software often misunderstood words or phrases from testers who had diverse accents or stutters. Unfortunately, such accuracy issues are to be expected for speech-recognition modeling that has historically relied on homogenous data sources. But Voice Control’s performance improves the more you use it, so don’t give up immediately if you find inaccuracies frustrating at first.

Apple’s assistive technology was a standout feature for our testers with limited hand dexterity, as it allowed them to navigate their machines and edit their messages hands-free. These command prompts have a challenging learning curve, so you’re unlikely to have a flawless experience out of the gate. But asking “What can I say?” brings up a library that automatically filters contextually relevant commands depending on your actions. For example, selecting a desktop folder produces a short list of prompts related to file access (such as “Open document”), while moving the cursor to a word-processing tool brings up “Type.” The interface allows you to quickly sort through the relevant commands, a feature that some panelists found useful.

Screenshot of Microsoft Word document with Apple Voice Control’s grid over it.

Flaws but not dealbreakers

Our panelists with accents experienced mixed accuracy results using Apple Voice Control. Testers with nonstandard English accents or speech impediments said that the performance of Apple’s software improved when they spoke slowly. “When using it to type, sometimes it got things quite off,” noted panelist Franc, a native Spanish and Catalan speaker who tested the software in English. Similarly, my own experience dictating this guide proved challenging: I found that I had to overenunciate my words to prevent Voice Control from capitalizing random words and mistyping the occasional phrase.

Our panelists agreed that Apple Voice Control was the slowest tool they tested for transcribing text, though that difference in speed was a matter of seconds, not minutes. Sometimes speech-recognition software processes a complete sentence, rather than single words, before displaying the text on the screen, a tendency that about half of our panelists found frustrating. “It was really distracting to wait to see whether [Voice Control] had picked up what I said,” noted tester Vicki, who has a repetitive stress injury that makes typing difficult.

Wirecutter’s editor of accessibility coverage, Claire Perlman, who also served on our panel, echoed this sentiment. She said the lag time was marginal at the start of her session but became noticeably painful the longer she used the software. Claire also noted that her 2019 MacBook Pro, equipped with a 1.4 GHz quad-core Intel Core i5 processor, overheated while running Voice Control for extended periods. “The lag that I’m experiencing now is very distracting and makes me feel like I have to slow my thought process in order to have it typed correctly,” she said. We attempted to replicate this issue with a 2019 MacBook Pro equipped with a 2.6 GHz six-core Intel Core i7 processor, and after an hour of use we found that Apple’s Speech Recognition process fluctuated between 54% and 89% of our CPU and that Apple Dictation’s usage ranged from 1% to 35%, confirming that the robust platform requires a lot of processing power. That said, you may find that the lag disappears when you close other CPU-intensive programs, such as Chrome or a game.

As we previously mentioned, successfully wielding Voice Control’s command prompts requires experience and finesse. Testers who read through the quick-start guide and watched YouTube tutorials reported the easiest experience. “There is a learning curve,” said tester Chandana, who has an Indian accent. But the software’s “What can I say?” screen was a big help, Chandana said: “I was able to use many functions that I wanted to use before but did not know that I could.”

Lastly, Voice Control works best within Apple’s own apps, and some people may find that inherent limitation challenging or annoying. “I found it to be more accurate in Pages and iMessage than Google Docs and WhatsApp,” Claire noted. In just one example, although Voice Control correctly captured dictated commands such as “Select line” or “Delete ” in Pages, it couldn’t execute them in Google Docs.

Screenshot of a Microsoft Word document with text transcribed using Nuance Dragon Home 15.

Price: $700 per license
Operating system: Windows
Supported languages: English, French, Spanish (depending on purchase region)

Nuance Dragon Professional v16 is the best option for Windows PC users because it surpasses the Microsoft Word and Windows dictation tools in accuracy, quickly processes and displays transcriptions, and offers a helpful training module and selection of command prompts to get you swiftly up to speed. Unlike most other dictation software in our tests, it worked well with technical, jargon-heavy language, an advantage that could make it useful for people who work in the sciences. (While we only tested the now-discontinued Nuance Dragon Home 15 for this guide, Professional v16 uses the same technology while making it easier to dictate large amounts of data in a corporate setting. Plus, if you’ve used earlier versions of Dragon in the past, you’ll be happy to know that this version of Dragon represents a significant improvement over previous generations.)

Our panelists said that Dragon was one of the most accurate speech-recognition tools they tried, describing it as “extremely accurate,” “reliable,” and in at least one case, “flawless.” Wirecutter’s Claire Perlman, who has arthrogryposis, said, “I was truly blown away by the accuracy of Dragon. It had only two to three errors the whole time I used it.” Our control tests found similar results. Dragon was 82% accurate in transcribing casual speech (slightly behind Apple Voice Control, which produced 87% accuracy), and in transcribing technical language, it didn’t exhibit the steep decline in accuracy that we saw from other software, including Apple’s Voice Control and Dictation tools.
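This guide doesn’t spell out how its accuracy percentages were calculated, so the following is only a rough sketch of one common approach: compare the dictated text with the original passage word by word and count the fewest substitutions, insertions, and deletions needed to turn one into the other. The function names (`tokenize`, `word_edit_distance`, `word_accuracy`) and the sample sentences are our own illustrations, not part of any tool we tested.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9'’]+", text.lower())


def word_edit_distance(reference, hypothesis):
    """Fewest word substitutions, insertions, and deletions between two word lists."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the transcript
                curr[j - 1] + 1,     # extra word in the transcript
                prev[j - 1] + cost,  # matching or substituted word
            ))
        prev = curr
    return prev[-1]


def word_accuracy(reference_text, transcript_text):
    """Return 1 minus the word error rate, floored at 0 (assumed metric)."""
    reference = tokenize(reference_text)
    hypothesis = tokenize(transcript_text)
    if not reference:
        raise ValueError("reference text must contain at least one word")
    errors = word_edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    # Hypothetical example sentences, not taken from our test passages.
    original = "I found that I had to overenunciate my words"
    dictated = "I found that I had to over annunciate my word"
    print(f"Word accuracy: {word_accuracy(original, dictated):.0%}")
```

A score computed this way penalizes every wrong, missing, or extra word equally, so it won’t capture capitalization or punctuation slips like the ones some panelists noticed.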

Chart comparing Nuance Dragon Home 15 transcriptions with the original lyrics of a song.

Dragon’s transcriptions appeared with minimal lag time on testers’ screens, whereas tools like Otter and Windows Voice Recognition took twice as long to produce phrases or sentences. But panelists found Dragon’s sentence-by-sentence transcription to be a mixed bag. Some testers preferred to see entire phrases or sentences appear simultaneously on the screen. “The speed combined with the accuracy meant that I did not feel like I had to pay constant attention to what was happening on the screen and could instead focus on my thoughts and writing,” Claire said. Other testers preferred real-time, word-by-word transcriptions: “There were definitely moments where I was sitting there drumming my fingers and waiting,” said Wirecutter editor Ben Keough. Dragon lets you adjust for less lag time or better accuracy by going to Options > Miscellaneous > Speed vs. Accuracy. But we didn’t notice a difference in performance when we changed this setting during our control tests.

Like all the dictation software we tested, Dragon requires a bit of know-how to get the most out of its features and achieve the best performance, but its multitude of accessibility voice commands were a favorite feature among our panelists. Unlike most of the options we tested, Dragon launches with a brief tutorial that walks you through how to use it, from setting up the best microphone position to dictating text to using punctuation prompts.

You can revisit the tutorial at any point if you need a refresher, which panelist Juan found helpful with his traumatic brain injury and short-term memory problems. “The tutorial gives you a good start on its functionality,” he said. Wirecutter’s Claire Perlman noted, “I used to use Dragon years ago, and back then, training the system to recognize your voice was an onerous process. This time, I found the whole setup and training process genuinely helpful and very quick. And I felt like I could really operate it hands-free.”

Screenshot of Dragon Home’s interactive tutorial and correction menu.

The biggest drawback to Dragon is that it costs $700 per license. The experts we spoke with said that this barrier to entry may make using this software infeasible for many people who are disabled, including those who are on a limited income because they can’t find remote work that accommodates their disabilities. Additionally, having to download and enable the software can be a hassle that reminds people with disabilities that their situation is an afterthought in the digital age—especially in comparison with Apple Voice Control or even Windows Voice Recognition, which are integrated into device operating systems.

This software is compatible only with the Windows desktop operating system; you can’t install it on Android, Apple’s operating systems, or ChromeOS. (That is, unless you partition your hard drive, but in that case you run the risk of slowing down the operating system, which one panelist with a drive partition experienced.) Users can subscribe to Dragon Anywhere ($150 a year), which works with iOS and Android devices. But because our panelists didn’t test Dragon Anywhere, we can’t comment on its usability or accuracy.

Dragon isn’t a speech-recognition tool that you can use right out of the box—the first time you load the software, it prompts you to complete a series of short tutorials. This means it’s important to set aside some time getting to know the program before rushing to write, say, an overdue memo or term paper. (That said, regardless of the speech-to-text tool you choose, we recommend familiarizing yourself with it before diving into a text-heavy project.)

Although Dragon was the most accessible and accurate Windows-compatible dictation software we tested, it still faltered in its transcriptions at times, especially for testers who didn’t use a dedicated microphone or headset. Nuance recommends buying its Dragon USB headset ($35) or Dragon Bluetooth headset ($150) for the best experience and says that users can improve the program’s accuracy rate by making corrections to text via voice prompt and running its Accuracy Tuning feature to optimize its language models. Judging from our testing, we can say that any high-quality dedicated mic that’s positioned correctly will improve your results. Even so, one panelist who used a wired headset noticed that Dragon could not capture diverse names like “Yeiser” but had no issues with traditionally Anglo names like “Brady.”

Finally, this dictation software is available in only three languages (English, French, and Spanish), a stark reminder that accessibility isn’t always accessible to all. Within those constraints, you can specify a language region to ensure that the spelling matches your preferred region, such as Canadian English versus American English. (The ability to purchase a preferred-language license may vary depending on where you live.)

If you want a free Windows-compatible option: Consider Windows Voice Recognition. In our tests, its accuracy rate was 64% compared with Dragon’s 82%, but as with Dragon, you can train Windows to better understand your voice the more you use it. Other free tools we tested that had subpar accuracy rates, including Google Docs Voice Typing, can’t be trained.

Our panelists agreed that no dictation software is perfect, but for the most part, such programs’ functionality improves the more you use them. Here’s how to get the most out of your speech-to-text tool:

  • Take the tutorial. Seriously. Some of these tools have difficult learning curves, with specialized commands for numerals, punctuation, and formatting. Before dictating your memoir, make sure to review the software’s instruction manual and keep a list of its command shortcuts nearby.
  • Set your primary language. Fewer than half of the tools we tested allow you to set your primary language if it’s outside the country of origin. But if your tool has this option, make sure to use it. This can make the difference between the software transcribing theater or theatre, or even recognizing your accent at all.
  • For immediate accuracy, enunciate. For long-term success, speak naturally. Many dictation tools offer vocabulary builders or claim to learn your speech patterns over time, so don’t force yourself to sound like a machine—unless you want to use that stiff voice every time you dictate.
  • Consider a dedicated microphone. Speech-to-text tools, including our top picks, work better when you keep your mouth close to the microphone and work in a quiet environment. In general, you can cut out the majority of background disturbances and transcription misfires by using a dedicated external USB microphone or a wireless or wired headset that crisply captures your voice.
  • Pay attention to the on/off switch. Some of these tools go into sleep mode after a few seconds of silence, or they may pick up side conversations you don’t want to transcribe. If you pause to collect your thoughts or turn around to answer a colleague’s question, make sure the dictation tool is on the right setting before you speak.

You give up some privacy when you speak into a microphone so that a speech-to-text tool can transcribe your words. As is the case when you’re speaking on the phone, anyone nearby may hear what you say. And many dictation tools feed your audio into their learning algorithms to improve their service or to sell you something. In some cases, a company may even turn over all of your speech-to-text recordings and transcriptions to law enforcement. Ultimately, if you’re dealing with sensitive data and have another means to communicate—which we know isn’t possible for many people who need these tools—it’s best not to share your information with a speech-to-text program. Of course, we could say the same thing about sending unsecured texts or uploading documents into the cloud, too.

Here’s what the makers of our picks do with your data:

Apple’s Voice Control processes dictations and commands only locally, on your device, so no personal data is shared or saved with a third party. But some information that you speak into sibling programs Dictation and Siri may be transmitted to Apple’s servers. (Because many people, including several of our panelists, use Dictation and Siri, we concluded that the differences are worth calling attention to.)

Typically, Apple can’t access Dictation and Siri audio recordings that you compose on your device unless you’re dictating into a search box or the service requires third-party app access. Apple may collect transcripts of Siri requests, dictation transcripts, IP addresses, and contact information to perform app tasks, improve its services, and market its products. And anytime Apple interacts with a third-party app, such as a transcription service for meeting notes, that voice data may be sent to Apple, or you could be subject to that app’s separate terms and conditions and privacy policy. When you opt in to Apple’s “Improve Siri and Dictation,” the audio recordings and transcripts that Apple saves are accessible to its employees, and data is retained for two years, though the company may extend retention beyond that period at its discretion.

Apple also uses your audio and transcripts to market products and services. You can opt out of allowing Apple to review your audio files under System Settings (Settings on mobile devices) > Privacy & Security > Analytics & Improvements; you can delete your six-month history by going through System Settings (Settings on mobile devices) > Siri & Search > Siri & Dictation History. With iOS 14.6, however, according to Gizmodo, Apple may still collect some analytics data even if you opt out.

As for information shared with third parties, certain providers must delete personal information at the end of the transaction or take undisclosed steps to protect your data. And Apple may disclose your information to law enforcement agencies as required by law.

Nuance, which owns Dragon software, routinely collects dictation data. The service can access any sensitive information you dictate, including medical records or proprietary information, and doesn’t always require your direct consent to do so. For example, in its privacy policy, Nuance says, “If we are processing personal data on behalf of a third party that has direct patient access, it will be the third party’s responsibility to obtain the consent.” And “snippets” of audio recordings are reviewed by people who manually transcribe the data in order to improve Nuance’s services. Nuance retains data for three years after you stop using the services, and you can request that the company delete your data record.

Additionally, although Nuance collects electronic data such as your IP address and registration information to market its products, the company says it doesn’t sell customer data to third parties. However, Nuance affiliates and partners may have access to the data through its sales division or customer service division. And like Apple, Nuance may share personal data to comply with the law.

Beyond considering dictation software in particular, be sure to examine the data-retention policies of any software you’re dictating into (whether that’s Microsoft Word, Google Docs, or whatever else), which fall under the maker’s own privacy practices.

Apple Dictation (macOS, iOS, iPadOS) performs similarly to our pick, Apple Voice Control, but it lacks the robust features that many people want in a speech-to-text tool, including key command functions.

We can’t recommend Microsoft Word Dictate or Otter due to their transcription lag times and subpar accuracy rates, which ranged from 54% to 76%, far behind Apple Voice Control’s 87% and Dragon’s 82%. Additionally, Otter’s platform is not a great choice for document dictation, as it doesn’t integrate well with word-processing tools; it’s better suited for live-event closed captioning.

The Braina Pro tool was popular in the mid-aughts, but its website is outdated, and it hasn’t had any user reviews in years.

The Google Assistant on Gboard interface works only with Gboard-compatible mobile devices, which means it’s useless to desktop users and anyone who doesn’t own an Android or iOS smartphone.

In our tests, Google Docs Voice Typing failed to accurately capture sociolects and casual speech. It also doesn’t work well for people with speech impediments, has poor formatting features, and is nearly impossible to use for anyone who can’t access a mouse and keyboard.

IBM’s Watson Speech to Text is a transcription service that charges by the minute after the first 500 minutes. And the free plan deletes your transcription history after a month of inactivity. We think those shortcomings are enough to disqualify it.

Windows Voice Typing isn’t as robust a tool as Windows Voice Recognition, and we found its accessibility commands to be limiting.

We considered several Chrome-specific apps, including Chromebook Dictation, Speechnotes, and SpeechTexter, but we skipped testing them because of their limited features and usage restrictions that made them inaccessible to most people.

We also considered the following options but quickly learned that they’re designed for specific commercial uses:

  • Amazon Transcribe is built for commercial products.
  • Speechmatics is designed for commercial products, such as live transcription for video conferences, so it’s too expensive and inaccessible for the average person.
  • Suki Assistant is designed for medical dictation.
  • Verbit offers transcription services for businesses.

This article was edited by Ben Keough and Erica Ogg.

Meenakshi Das, disability advocate and software engineer, Microsoft, text interview, September 30, 2022

Sayash Kapoor, PhD candidate, Center for Information Technology Policy, Princeton University, phone interview, October 6, 2022

Christopher Manning, co-director, Stanford Institute for Human-Centered Artificial Intelligence, Stanford University, Zoom interview, October 5, 2022

Diego Mariscal, founder, CEO, and chief disabled officer, 2Gether-International, Zoom interview, October 26, 2022

Steve Dent, Amazon, Apple, Microsoft, Meta and Google to improve speech recognition for people with disabilities, Engadget, October 3, 2022

Su Lin Blodgett, Lisa Green, Brendan O’Connor, Demographic Dialectal Variation in Social Media: A Case Study of African-American English (PDF), Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, November 1, 2016

Prabha Kannan, Is It My Turn Yet? Teaching a Voice Assistant When to Speak, Stanford Institute for Human-Centered Artificial Intelligence, Stanford University, October 10, 2022

Allison Koenecke, Andrew Nam, Emily Lake, Sharad Goel, Racial disparities in automated speech recognition, Proceedings of the National Academy of Sciences, March 23, 2020

Speech Recognition for Learning, LD OnLine, “Tech Works” brief from the National Center for Technology Innovation (NCTI), August 1, 2010

Arvind Narayanan, The Limits Of The Quantitative Approach To Discrimination, James Baldwin Lecture Series, Department of African American Studies, Princeton University, October 11, 2022

Meet your guide

Kaitlyn Wells

Kaitlyn Wells is a senior staff writer who advocates for greater work flexibility by showing you how to work smarter remotely without losing yourself. Previously, she covered pets and style for Wirecutter. She's never met a pet she didn’t like, although she can’t say the same thing about productivity apps. Her first picture book, A Family Looks Like Love, follows a pup who learns that love, rather than how you look, is what makes a family.

Boost your productivity with the latest voice-to-text apps and software

Find the perfect voice-to-text tool to boost your productivity with this comprehensive look at the latest apps and software.

Are you tired of typing? Voice-to-text software and services are here to save the day! With the right tools, you can easily convert your – or anyone else’s – voice into text on both desktop and mobile devices.

Voice-to-text apps and software are used for everything from transcribing meetings and providing accurate records of interviews to logging medical observations and creating YouTube video descriptions for SEO purposes. The possibilities are huge.

Before deciding which speech-to-text tool to choose, it's important to consider your specific needs. Free and budget options may provide the basic features, but if you require more advanced tools, a paid platform may be the better solution. Some programs use machine learning to continually improve accuracy, while others may only be as good as their latest update.

Whether you're a busy professional or someone who would simply prefer dictating over typing, there's a speech-to-text program out there for you. So, here, in no particular order, are some of the best voice-to-text tools currently available.

Full audio and video transcription solutions

Voice-to-text apps for iPhone and Android

Alrite

Alrite is an AI-powered app that provides accurate automated audio transcriptions with a 95% accuracy rate for spelling and punctuation. It differentiates between speakers in the same audio or video and can recognise different accents and languages. Alrite transcriptions can be integrated into videos or presentations, and users have complete control over fine-tuning the transcriptions with caption editing. It is available on popular browsers and has a mobile app for your on-the-go transcription needs. Alrite offers various packages for personal and professional use, making it a valuable tool for anyone who needs efficient audio transcriptions.

Dragon Anywhere

Dragon Anywhere is a cloud-based mobile app that offers full dictation capabilities on Android and iOS devices. The app supports boilerplate text insertion and custom vocabularies, with documents shared across devices via Evernote or cloud services. The app has a slight delay due to cloud processing but offers the same speech recognition as the desktop software. However, users cannot dictate directly into another app, and it requires an internet connection to work. The app is available through a subscription model, and Nuance Communications offers a 7-day free trial for users to test the app. Despite these limitations, the Dragon Anywhere app offers powerful voice recognition of the same quality as its desktop software, making it a valuable tool for dictation on the go.

Otter

Otter is a cloud-based voice-to-text app that provides real-time transcription for meetings, interviews and lectures. It offers keyword summaries, a wordcloud feature and 600 minutes of free service with the ability to search, edit, play and organise transcriptions. It assigns different speaker IDs for better understanding. Otter has three payment plans, including Premium, which offers advanced features such as bulk export, syncing with Dropbox, and up to 6,000 minutes of speech-to-text. Its Teams plan offers user management, two-factor authentication, centralised billing and live captioning. Otter is user-friendly, accurate and accessible to individuals and teams with different needs. It provides collaboration tools, making it a powerful app for anyone who needs rich notes during meetings, lectures, or interviews.

Verbit

Verbit is an AI-powered transcription and captioning service for enterprises and educational institutions. The app uses neural networks and algorithms to reduce background noise, differentiate between speakers and provide contextual accuracy. It offers a live transcription feature with human editors to ensure full accuracy and quick turnaround time. Verbit has multiple pricing plans, including API access and custom models, making it a valuable tool for businesses with unique requirements. Its integration with other systems and automation of workflows make it an efficient and effective tool for teams.

Amazon Transcribe

Amazon Transcribe is a cloud-based speech recognition platform that can convert audio to text with high accuracy. The platform uses deep learning algorithms to add punctuation and formatting to the transcribed text, and can handle low-fi and noisy recordings. It offers livestream and batch processing options, and time stamping for individual words to make searching easy. The platform can identify different speakers and channels, and annotate documents accordingly. Amazon Transcribe provides features for editing and managing transcribed texts, including vocabulary filtering and replacement words. It is aimed primarily at businesses and enterprises, but can also be used by individuals. Overall, Amazon Transcribe is a powerful platform with comprehensive capabilities, making it a top choice for accurate speech-to-text transcription services.

Microsoft Azure Speech to Text

Microsoft's Azure cloud service offers Azure Speech to Text, an advanced speech recognition feature that creates text from various audio sources. It uses deep neural network models to recognise multiple speakers and can be customised to handle different speech patterns and background noise. Azure Speech to Text provides a free container with a single concurrent request for up to five hours of free audio per month, specialist vocabularies and integration with other Azure services, such as Azure Cognitive Services and Azure Media Services. It is available in the cloud, on-premises, or in edge computing, making it a versatile solution for different uses. Azure Speech to Text is a powerful and customisable speech recognition service that can help businesses and developers create more sophisticated and efficient applications that can analyse and process audio and video content.

IBM Watson Speech to Text

IBM's Watson Speech to Text is a cloud-based solution that uses AI and machine learning for real-time and batch audio conversion to text. It offers customisation options for language, audio frequency and output, as well as speaker labels, timestamps and smart formatting. The solution is easily deployable on-premises or in the cloud and can be integrated with other IBM Watson services such as Natural Language Processing. Watson Speech to Text is also known for its enterprise-level security, ensuring data privacy and security. The solution offers competitive pricing, including a free trial for up to 500 minutes of transcription per month and affordable monthly subscription plans based on usage. IBM's Watson Speech to Text is a customisable and accurate solution for businesses looking to convert audio to text.

Google Gboard

Google Gboard is a free voice-to-text app available for Android mobile devices that offers accurate and speedy transcription capabilities with its speech input option. It also offers a range of additional features, such as swiping for input, voice command image insertion and integration with Google Translate, supporting over 60 languages. Although not a dedicated transcription tool, it offers all the basic transcription functionality needed and works seamlessly with any software on Android devices. Its straightforward user experience and easy integration with other Android applications make it a powerful yet basic voice-to-text app, without any advertisements.

Just Press Record

Just Press Record is a user-friendly mobile app that offers one-tap recording, unlimited recording time and iCloud syncing across devices. It has a powerful transcription service that supports over 30 languages and punctuation command recognition. The app also allows for in-app editing of transcribed files and provides comprehensive file views for organising recordings. Users can share audio and text files to other iOS apps, making it easy to work with transcriptions across multiple applications. Just Press Record is an excellent option for users who require a dedicated dictation app with powerful transcription capabilities and cloud syncing.

Speechnotes

Speechnotes is a user-friendly dictation app that uses Google voice recognition technology and requires no account creation or setup. Users can dictate punctuation marks through voice commands or a built-in punctuation keyboard. The app includes custom keys on the built-in keyboard for adding frequently used text, and automatically capitalises words. Changes to notes are saved to the cloud and users can customise notes with a range of fonts and text sizes. Speechnotes is available as a free download from the Google Play Store, with premium features available as in-app purchases. There is also a browser version of the app for Google Chrome. Overall, Speechnotes is a simple and intuitive dictation app that is ideal for users who need to take quick notes on the go with easy-to-use features.

Transcribe

Transcribe is an AI-powered dictation app for converting videos and voice memos into text files. The app offers high-quality transcription capabilities with support for over 80 languages and the ability to import files from Dropbox. Users can export raw text to a word processor for editing after transcription. Transcribe is free to download, with a 15-minute free transcription time trial available. The app is only available on iOS. Overall, Transcribe is a versatile tool for users who need to transcribe videos or voice memos, and its free trial option allows users to test the app's capabilities before committing to a purchase.

Voice-to-text software

Rev.ai

Rev.ai is a suite of speech-to-text APIs that businesses can use to create downstream applications. Its speech engine has been trained to transcribe content on a variety of topics with a variety of accents across various industries. Rev is one of the most accurate AI transcription services available and it can be used by businesses of any size to maximise the value of content and grow their audience. Rev has trained its speech models on over 5.6 million hours of transcribed data, delivering the most accurate speech recognition engine. Users can scale up to 31 languages to meet a global audience. Rev offers a wide range of services such as human and automated transcription, video captions and subtitles and more.

Rev's documentation is easy to follow and many users report that the API works flawlessly. The process is straightforward, making it useful for every type of user. The tool offers various features like global translate subtitles, live Zoom captions and the ability to transcribe in 31 languages. Rev has been used by some of the biggest names in the game, such as Spotify. To sum up, Rev.ai is a powerful tool for businesses looking to optimise their content and improve accessibility for their audience.

Fireflies.ai 

Fireflies is an AI voice assistant that provides powerful transcription capabilities to help users take notes and complete actions during online meetings. It offers user-friendly software that allows for easy uploading of live meetings or audio files for transcription. Fireflies includes a collaborative feature that lets users add comments or highlight specific parts of calls for their team, and it provides integrations and APIs, a Chrome extension and an intuitive dashboard to facilitate collaboration. The tool also has a meeting bot that can automatically join calls, as well as features such as instant meeting recording and skimming transcripts while listening to audio. Fireflies is ideal for businesses, teams and individuals who want to boost productivity and save time. A free trial version is available and users can upgrade to the paid version for more advanced features.

Dragon Professional

Dragon Professional is a dictation application designed for professionals who prefer to dictate documents, create spreadsheets and browse the web using their voice. With a 99% accuracy rate and a typing speed of 160 words per minute, Dragon Professional's speech recognition works out of the box and does not require prior training to adapt to the user's voice. The software includes an intuitive interface, custom word lists and a mobile app that allows users to transcribe audio files. Dragon Professional is available for a one-time fee and is comparable to paid-for subscription transcription services. The software is ideal for professionals and freelancers due to its speed, flexibility and ease of use. Nuance is currently offering 12 months' access to Dragon Anywhere at no extra cost with any purchase of Dragon Home or Dragon Professional Individual.

Speak

Speak is an AI transcription service that helps collect audio and video data by building custom recorders, recording in-app or uploading files. It automatically transcribes and identifies important keywords, topics and sentiment trends to ensure valuable information is not lost. Speak offers features such as custom shareable media repositories, named entity recognition, deep search, APIs and integrations, media management, dashboard reports and audio capture. It's useful for qualitative, academic and marketing research, digital marketing and other crucial functions of an organisation. Speak can help streamline data collection and analysis, improve collaboration and save time and effort. It's an effective tool for anyone who needs to transcribe, analyse and share audio and video data.

Speechmatics

Speechmatics is an advanced speech-to-text transcription tool that can transcribe audio and video files with high accuracy in real time. It can recognise and transcribe various British accents without extra charges. The software can convert call centre phone recordings into searchable text or Word documents and work with videos and other media for captioning purposes. It also allows keyword triggers for efficient management. Speechmatics offers a flexible and comprehensive speech-to-text service that is cost-effective and competitive compared to other providers. It is suitable for businesses that need to transcribe audio or video content, particularly those with international clientele or employees with diverse accents. With Speechmatics, users can be confident in the accuracy of the transcriptions and the ease of use of the software.

Beey

Beey is an automatic voice-to-text software that converts audio and video files into text, with the added ability to create high-quality captions and subtitles for videos. The platform supports more than 20 languages and includes a machine translation tool for multi-lingual content creation. Beey's automatic speech recognition solution is highly accurate and can handle large volumes of content, with manual editing available to correct any errors. The software is intuitive, well designed and fast, making it a useful tool for businesses and individuals who need to transcribe audio and video content quickly and accurately. Beey's support for multiple languages and the ability to create professional-quality captions and subtitles make it ideal for content creators looking to reach a global audience.

Braina Pro

Braina Pro is speech recognition software that doubles as a digital assistant to help users perform tasks on their PC. It supports dictation in almost 90 languages and has customisable commands. Its Android app allows remote control of a PC via a local Wi-Fi network. While there is a free version with limited functionality, the speech recognition function can be tried for seven days before subscribing. However, Braina Pro is only available through a subscription model and Google's Chrome browser needs to be installed for speech recognition to work. Braina Pro is a powerful and versatile tool for users looking for a combination of speech recognition software and virtual assistant.

Sonix

Sonix is an AI-based transcription service for businesses to transcribe and organise video and audio files. The software features fast transcription, with 30 minutes of audio or video transcribed in three to four minutes. Users can review and edit transcripts for accuracy using an online editor that highlights low-confidence words. Sonix supports drag-and-drop functionality, multi-user collaboration and text and audio synchronisation. The software automatically identifies speakers and separates exchanges into different paragraphs. The platform is ideal for industries that require quick and accurate transcription. In summary, Sonix is a powerful and versatile transcription service that offers speed, accuracy and a range of features to ensure efficient and high-quality transcription.

NOVA AI

NOVA AI is an online tool that automatically generates captions and subtitles for videos, as well as providing video translation services. The software supports both open and closed captions, which can be hardcoded into the video or downloaded as a separate file. It also allows for manual captioning and supports a range of subtitle formats. In addition, NOVA AI offers basic video editing functionalities, including trimming, cutting and merging video clips. The platform is easy to use and accessible through any web browser, without the need for installation. NOVA AI is an ideal choice for content creators looking for a fast and efficient solution to create engaging captions for their videos.

Google Docs Voice Typing

Google Docs offers free built-in speech-to-text software that allows users to work more efficiently without typing. With over 100 voice commands, users can easily make edits and formatting changes. Simply go to Google Docs, click on Tools and select Voice Typing to start. This software is perfect for individuals who want to save time or have difficulty typing. It can recognise a wide range of accents and transcribe up to 120 languages, including English, Spanish, Chinese and Arabic. Overall, the Google speech-to-text software is an excellent tool that can increase productivity and is a must-have for those who rely on voice recognition technology.

NaturalReader

NaturalReader is a versatile text-to-speech software available in both online and downloadable versions, with support for a wide range of text and document formats. It converts text to audio files and allows users to modify the pronunciation of individual words. While a free version is available with limited features, users can upgrade to the paid version for access to tools such as text highlighting and note-taking. NaturalReader is an excellent tool for those who prefer an audio-based approach to reading and need to convert text into audio.

Sobolsoft

Sobolsoft is speech-to-text software that provides a simple and efficient way to convert audio files to text. The software allows users to upload multiple audio files and convert them into text files simultaneously. Sobolsoft offers a free version that allows users to convert up to 500 minutes of audio every month. After installing the software, users can upload their audio files and start the conversion process by clicking on the convert button. Once the transcription process is completed, the text can be edited and saved. It is important to note that only MP3 files can be converted using Sobolsoft. To sum up, Sobolsoft is a user-friendly and effective tool for those who need to convert audio files to text on a regular basis, but it has limited features compared to some of its competitors.

Scribie.com

Scribie is a transcription software that offers AI-powered accuracy and various services such as confidential access and add-ons. The four-step transcription process achieves a 99% accuracy rate and the online editor allows for quick review and changes to transcripts. Add-ons include SRT/VTT files and audio time coding. Users upload files, choose automated or manual transcription services and use the online editor to check and download transcripts. Scribie boasts a low error rate (<1%), fast service and confidentiality, and it has been used by well known business and tech brands, such as Oracle, Google, Airbnb, Stripe and Netflix.
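To give a sense of what an SRT add-on file contains, here is a minimal Python sketch that formats transcript segments as SubRip (SRT) cues. The segment data and the function names are made up for illustration; this is not Scribie's code or output.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Turn (start, end, text) tuples into the text of an SRT file."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue is: a running number, a time range, then the caption text.
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


# Hypothetical transcript segments: (start seconds, end seconds, text)
example_segments = [
    (0.0, 2.5, "Welcome to the show."),
    (2.5, 6.04, "Today we're talking about transcription."),
]
print(to_srt(example_segments))
```

A VTT file carries the same kind of cues, but it starts with a WEBVTT header line and writes the milliseconds after a full stop rather than a comma.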

Technology + humans: The ultimate in voice-to-text services

Service providers can offer clients the benefits of cutting-edge voice-to-text software alongside the benefits of experienced linguistic experts. A voice-to-text service provider can leverage the strengths of both technology and humans by using the software to produce a first draft transcription and then having an expert linguist review and edit the document. While voice-to-text software can provide fast and accurate transcriptions, it may not capture nuances in language or cultural references that a linguistic expert can recognise.

This approach can save time and reduce costs while providing businesses with high-quality transcriptions that accurately capture the intended message. Additionally, the use of both resources can ensure that transcriptions are culturally sensitive and appropriate for the intended audience.

By offering the benefits of both voice-to-text software and linguistic expertise, Semantix can meet your diverse and changing needs with bespoke solutions. Contact Semantix now for the very best in transcription services.

Nutshell

The 5 Best Speech-to-Text Software Options for Your Business

You may have already used some form of speech-to-text software before without even realizing it. Have you ever written a text to someone by speaking it out loud to your phone instead of typing it out with your fingers? That’s speech-to-text in action.

Of course, your texting app isn’t designed to help you write longer content, so it’s not a great option if you’re looking for some dictation software for your business. That means the search is on. What’s the best speech-to-text software option available to you?

That’s just what we’ll cover on this page, as well as discussing what exactly dictation software is. All in all, we’ll cover the following:

Table of Contents

  • What is dictation software?
  • When can you use dictation software?
  • 5 best speech-to-text software options
  • How to choose the best dictation tool for your business

Keep reading to learn more!

What is dictation software?

Dictation (or speech-to-text) software is a type of tool that transcribes spoken words.

To use a dictation tool, you open it and start speaking aloud. The tool then records the words you say in text form. That means you can write messages, notes, and even whole articles without ever touching a keyboard.

When can you use dictation software?

So, when might dictation tools be useful? There are a few different cases where you might want to have this type of tool.

Firstly, you may just want it for convenience. Maybe you find you write better when you can sound out your thoughts aloud rather than typing them silently in a document. Or maybe you just want to be able to write a page while simultaneously doing something else with your hands.

For many people, though, it comes down to more than just convenience. If you have a hand injury—either a minor one or something more serious like carpal tunnel syndrome—you may not be able to type anything. The same goes for many disabled people. In that case, a speech-to-text tool is a must.

5 best speech-to-text software options

Now let’s get to why you’re here: to find the best speech-to-text tool for you. There are several different options out there, so we’ll list some of the top tools below and give a bit of info about them. Those tools include:

  • Gboard
  • Windows Speech
  • Apple Dictation
  • Amazon Transcribe
  • Dragon by Nuance

Here’s a summary of how these tools stack up:

Now keep reading to learn a bit more about the best speech-to-text software!

1. Gboard

Price: Free

First on our list of the best dictation software is Gboard. This tool is offered by Google, and it actually comes natively installed on all Android devices (though it’s available for iPhones as well).

In addition to typing out text for you, Gboard can perform searches on Google or Maps and even translate text from other languages. (It supports 916 different languages and dialects, by the way.)

Using it is super simple. When you pull up the keyboard in any app, you just press the microphone button in the top corner, and away you go.

2. Windows Speech

Price: Free (with Windows or Microsoft 365)

Windows Speech (also called Windows 11 Speech Recognition or Voice Typing) is a dictation tool for Windows computers. It comes automatically installed on all Windows devices, so you don’t have to pay to use it.

To use this tool, all you do is press Win + H. That’s it! You should see a little dialogue box open that lets you press a microphone button to start recording. Then you just start speaking, and it will transcribe what you say.

3. Apple Dictation

Price: Free (with Apple device)

Apple Dictation is basically just the Apple version of the previous two tools. It’s available on iPhones, iPads, and Macs. To use it on a mobile device, you just do exactly what you do with Gboard—pull up the keyboard and press the microphone button.

Meanwhile, on Mac, you have to enable it in System Preferences. Then, you can use a keyboard shortcut (the default one is usually to press Fn twice) to start dictating.

4. Amazon Transcribe

Price: Pay-as-you-go

Amazon Transcribe is unquestionably one of the best dictation tools out there. It uses machine learning to enhance its functionality, giving it the ability to do things other tools can’t. For example, it can register multiple speakers and transcribe text separately for each one.

Pricing for Amazon Transcribe is very complicated. It’s a pay-as-you-go plan, where you pay a certain amount for each minute you use it. The price per minute changes as your total number of minutes increases, and it varies based on things like location and type of batch.

To really understand Transcribe’s pricing, you’ll want to check it out on Amazon’s site.
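To see how this kind of tiered, per-minute billing adds up, here is a minimal Python sketch. The tier sizes and rates are invented purely for illustration and are not Amazon's actual prices; the function name is also just an assumption for this example.

```python
# Hypothetical tiers: (minutes in this tier, price per minute in US cents).
# These numbers are made up for illustration and are NOT Amazon's real rates.
EXAMPLE_TIERS = [
    (1_000, 3),         # first 1,000 minutes at 3 cents each
    (4_000, 2),         # next 4,000 minutes at 2 cents each
    (float("inf"), 1),  # everything beyond that at 1 cent each
]


def tiered_cost_cents(minutes, tiers=EXAMPLE_TIERS):
    """Bill each block of minutes at the rate of the tier it falls into."""
    remaining = minutes
    total = 0
    for tier_size, rate in tiers:
        billed = min(remaining, tier_size)  # minutes charged at this tier's rate
        total += billed * rate
        remaining -= billed
        if remaining <= 0:
            break
    return total


for usage in (500, 1_500, 6_000):
    print(f"{usage} minutes -> ${tiered_cost_cents(usage) / 100:,.2f}")
```

With these made-up rates, the first 1,000 minutes are always billed at the top rate, and only the minutes beyond each threshold get the cheaper price.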

5. Dragon by Nuance

Price: $200+ for desktop, $15 per month for mobile

Dragon, a tool offered by Nuance, is another of the best dictation software options out there. It, like Amazon Transcribe, uses machine learning to improve over time as it learns to understand your voice better. It can also transcribe prerecorded audio files.

One nice thing about Dragon is that it’s available on both mobile and desktop devices. The mobile version, called Dragon Anywhere, costs $15 per month. For the desktop version, the price depends on which package you get, but the cheapest option is a flat fee of $200.

How to choose the best dictation tool for your business

So, with all the options given above (not to mention all the dictation tools not included on this list), how can you make sure you choose the best speech-to-text tool for your business?

There are a few big considerations you should focus on. Firstly, what can you afford? You may be able to rule out several options based solely on price. Then, think about what features you need. Are you happy as long as the basic dictation feature is there, or are there specialized functionalities you want?

Finally, take a look at some reviews. Maybe a particular tool seems good, but then you read some reviews that reveal it’s rather glitchy and hard to work with. After considering all these things, you should be able to pinpoint the best speech-to-text software option for you.

Learn more about the best digital tools with Nutshell

Now that you know about the best dictation software, you may want to track down some other types of software for your business as well.

One of the best tools you can get for your business is a customer relationship management (CRM) platform like Nutshell. Nutshell has the essential CRM features businesses need to close more deals, like contact management and sales automation, plus more advanced features to accelerate growth.

Interested in seeing what Nutshell can do for you? Just check out our 14-day free trial today!

The 6 best free speech-to-text apps for creators

Discover the best free speech-to-text apps for seamless transcription! Enhance productivity with accurate and efficient voice recognition.

If you're an online creator who works with video and audio (say, a podcaster or YouTuber), chances are you spend a lot of time or money writing scripts and transcribing your content. Or, you let YouTube automatically caption your videos and hope for the best, often with colorful results.

But it doesn't have to be that way.

You don't have to spend hours manually transcribing or a ton of money for per-minute transcription services. Instead, you can use free speech-to-text software, some of which include artificial intelligence (AI) tools designed for creators, to help you get your words onto the page in minutes.

6 best free speech-to-text apps for creators

  • Descript
  • oTranscribe
  • Dictanote
  • Apple Dictation
  • Google Docs Voice Typing
  • Otter.ai

What is a speech-to-text app?

A speech-to-text app, or dictation app, is software that lets you record your voice (or upload an audio/video file) and transcribes it into text within the app.

These apps are built on speech recognition software, which takes a recording and breaks it down into bits it can interpret, converting them into digital text. It's worth noting that speech recognition technology and voice recognition aren't the same; the latter only looks to identify a spoken voice (and often specific voice commands) rather than transcribe what’s being said.

One of the most common use cases for speech-to-text is transcribing interviews and meetings, which makes them more accessible for those with hearing difficulties and better for SEO purposes.

However, you can also use these apps for transcribing voiceover videos, vlogs, audio-only podcasts, and more.

How to choose the best free speech-to-text software

In this section, we'll cover a few core features you should look out for when choosing free speech-to-text software for creating content. If the software you're looking at doesn't have these, you'll most likely need to look elsewhere.

Transcription minutes

Of course, you need your speech-to-text app to transcribe. However, not every app or tool will transcribe pre-recorded audio or video and offer 'live' transcription. For apps that do both (and if this feature is what you need), you'll want to pay attention to the amount of transcription you get for free.

On the other hand, if you only want to use speech-to-text for script planning (e.g., voicing your ideas out loud), you may only need a dictation tool that'll put your spoken words into a document. We'll be showing you tools that cater to these different needs in our comparison section below.

Format compatibility and export

If you need software or tools to help you use speech-to-text for transcribing videos and podcasts, you'll need to keep an eye out for import and export format compatibility.

If the software you're considering only accepts .wav audio files, you'll need to convert to that format if your recording is in another. On the other end of the workflow, if you need your transcription to be able to export as a Microsoft Word document, you'll need to make sure your software exports Word docs before you waste your time.

Storage and organization

Whether you're only using a dictation tool or full speech-to-text software, you'll want your words to be easily accessible. Some software (if not all) will have storage limits, so if you record a lot of content, look for one with a generous amount of storage.

You'll also want to consider the organization of your files — granted, this point is entirely subjective and depends on what kind of user interface you like to use. Since we're specifically looking at free options (or software with free plans), it won't hurt to try a few out to see which you like best.

Automatic speaker labels

If you record a podcast or other video content with guests, you'll need to be able to separate who's who in your transcription. You can manually separate speakers in your transcription, but the best way to save time here is to use software that automatically adds speaker labels.

Usually, this means the software will ask you to identify the speakers first; then, it'll handle the rest of the transcription (typically with AI).

An easy-to-use editor

The final feature you want to consider is editing. No transcription software is 100% accurate, so you'll want to use one that has a smooth and easy editor to help you get the job done faster and more easily.

6 best speech-to-text apps for creators

With all of the above in mind, let's get into the details of some of the best speech-to-text software tools currently available that are most suitable for creators.

We make this distinction because some speech-to-text software tools are specifically designed for professional industry use (e.g., medical and legal) and are costly because of that specialization.

1. Descript

Key features:

  • Automatic high-quality transcription (up to an hour free) with up to 95% accuracy
  • Automatically remove filler words and periods of silence with Descript AI tools
  • Easy document-style editing, which adjusts both the script and media
  • Highlights potential errors to help you proofread and review
  • Easily add subtitles to your video with the transcription
  • Descript supports 23+ different languages 

Upgrade options: The Creator plan (from $12/month) includes 10 transcription hours, and the Pro plan (from $24/month) includes 30 transcription hours. Each comes with even more features besides more hours.

Platforms: Web app, Windows 10 (or newer), Mac OS High Sierra (or newer).

Descript's speech-to-text transcription tool is embedded within its editor software and is one of the best free options specifically for creators. You can create a project for either an existing video to upload or record a new one straight into the software, and the audio-text feature will add the words to your script.

When I added a video of one of my virtual academic conference presentations (originally 12:53 in duration), it transcribed my words in about a minute and a half with surprising accuracy, given that I was using some highbrow academic language.

Using filler-word and word-gap removal, I cut my video down to 11:29 in just a few seconds and made the video a lot more presentable (unfortunately for me, I didn't have Descript when I initially presented at that conference).
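To put those two durations in perspective, the quick Python calculation below works out how much footage was trimmed. It only uses the timestamps quoted above; the helper name is just something made up for this example.

```python
def to_seconds(timestamp: str) -> int:
    """Convert an 'MM:SS' timestamp into a number of seconds."""
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + int(seconds)


original = to_seconds("12:53")  # length before editing
edited = to_seconds("11:29")    # length after removing filler words and gaps

saved = original - edited
print(f"Removed {saved} seconds ({saved / original:.1%} of the video)")
# prints "Removed 84 seconds (10.9% of the video)"
```

In other words, roughly a tenth of the original presentation was filler or silence.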

Descript also lets you use Studio Sound to improve the overall sound quality—it’s free for files up to 10 minutes on the free plan, and unlimited on paid plans.

2. oTranscribe

Key features:

  • A simple HTML web app means good cross-platform accessibility
  • Keyboard shortcuts for easy playback, rewind, and fast-forward
  • Integrated video player to stop tab/software switching
  • Interactive timestamps
  • Automatic saving to your browser's storage every second
  • Export to Markdown, Plain Text, and Google Docs

Upgrade options: Completely free, no plans or upgrade options.

Platforms: Web app (worked in Chrome and Safari at the time of writing).

This one, admittedly, is cheating a little. oTranscribe is a manual transcription tool, so there's no speech-recognition tech involved. But it's a great tool if you want to work on your video or audio manually. For example, if you're using a lot of niche vocabulary (fantasy names, industry-specific terms, etc.), you can sometimes spend more time fixing a generated transcript than you would spend typing an accurate one yourself.

It has a simple HTML interface with a familiar-looking document editor and immediately tells you the most important keyboard shortcuts to use. Using it on the same conference video made manual transcription much easier than I remember from previous projects.

While this is fine for creating a standalone transcript, it doesn't help you add captions or do anything else (e.g., text summaries, repurposing your script, etc.).

3. Dictanote

Key features:

  • Familiar notebook-style file organization of your notes
  • Basic text editing, which is easy to pick up
  • You can install its dedicated app instead of using the web
  • Decent speech-to-text accuracy
  • Dictation is completely free

Upgrade options: You can pay 10 cents per minute for AI transcription of existing audio files.

Platforms: Web app, Chrome app (when it asked me to install, it installed on my MacBook as a Chrome app).

If you want to use a tool to help you type as you speak, Dictanote is a great option. It's packaged as a note-taking app, where you can easily store and organize notes you've made. You can type notes as usual, but its key feature is its speech-to-text function and voice commands.

If you've never dictated before, it takes some getting used to, i.e., voicing punctuation and new lines. However, once you get the hang of it, speaking your thoughts can be much faster than typing them by hand.

This option is mainly for creators who want to get their creative ideas out of their heads and onto the page, with a dedicated space to keep them.

For the downsides, while testing the app, it didn't seem to like my AirPods when dictating (it didn't register my voice at all, even after granting permissions), and I had to switch to my MacBook Air microphone. That might be down to me not having the correct settings, but it's worth mentioning. Also, not having any free transcription options for existing media can be a deal-breaker for creators who primarily record content on the fly.

4. Apple Dictation

Key features:

  • No internet connection required (with Apple Silicon devices)
  • Setting up Voice Control can add even more functionality to dictation
  • User-friendly; use it anywhere you’d usually type
  • Up to 96% accuracy

Upgrade options: Comes free with Apple devices.

Platforms: Apple Mac and iOS devices only.

To test Apple dictation, I've decided to use it to write this section of the article using the Apple Notes app, then copy and paste what I've written into my draft (with a bit of editing).

It's a great tool to help you write as you speak; what’s more, it’s entirely free because it comes embedded within Apple products, including iPhones, iPads, and MacBooks.

Another great benefit of using Apple dictation is that you can easily swap between using your voice and typing, making editing easy for simple mistakes (such as capitalizing brand names). However, when you set it up with voice commands, you can also use dictation to edit instead. Apple dictation also switches off if it doesn’t detect your voice after about 15 seconds or so.

Of course, if you're not an Apple user, Apple dictation is not the tool for you. However, Microsoft has an equivalent dictation tool with an equally reasonable accuracy rate. If you're the type of creator who likes to think out loud and can get used to voicing punctuation and new lines quickly, then Apple dictation is the right tool to help you get thoughts on the page.

As a downside, I found that Apple dictation works best with other Apple software products, such as the Notes app. The dictation keyboard shortcut doesn't work at all in Google Docs, which is likely because Google Docs has its own dictation tool, which we’ll be looking at next.

5. Google Docs Voice Typing

Key features:

  • Google Docs is an extremely widely used, cross-platform tool for professionals and creators, making collaboration easy.
  • Activate voice typing with a keyboard shortcut no matter where you are on the page
  • Clear, large icon indicates you've started voice typing

Upgrade options: It comes as a free feature of Google Docs; there's no upgraded version.

Platforms: Web (I'd recommend Chrome specifically for Google Docs, but other browsers may work just as well). It may also work on the Docs app using the Gboard keyboard, but it doesn't work with the default iOS keyboard.

I've used Google Docs as the main deliverable format in my career for years, and I'd never thought to use the native Google speech-to-text feature. However, as a speech-to-text option, it works in the same way as Apple Dictation and Dictanote.

The main difference between these dictation options is the software platform and UI. If you're a creator who uses Google Docs for your ideas, transcripts, collaboration opportunities, and Google Drive for storage, then voice typing directly into Google Docs could be a great option.

However, as with the other dictation tools we've covered, it doesn't help you with existing media; it's only for live speech. Having no way to transcribe recordings can add to your work rather than make your workflow smoother.

6. Otter.ai

Key features:

  • AI meeting assistant that keeps audio recordings, transcribes, captures slides, and generates summaries in real time.
  • Automatically integrates with Zoom, Google Meet, and MS Teams to write and share notes
  • 300 transcription minutes and up to 30 minutes per conversation on the free plan
  • You can import up to 3 audio or video files for transcription (period). You get a monthly limit if you upgrade.

Upgrade options: Pro from $10/month, Business from $20/month (gets you 1,200 and 6,000 transcription minutes, respectively).

Platforms: Web, iOS app, Android app

My personal experience with Otter.ai started when a client of mine would send me interview transcripts she'd made with it. While they helped create content based on the interviews, the transcripts were never super accurate (I'd say roughly 75%).

However, using my conference presentation video, the accuracy is more within the 90% range. I imagine this huge difference comes from the fact that with more than one person speaking, it can be difficult for the AI to keep speakers separated — and on top of that, neither my client nor the interviewees ever seemed to use dedicated microphones.
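Accuracy figures like the ones above are rough impressions rather than measurements. If you want to put a number on it yourself, one common approach is word error rate (WER): the word-level edit distance between a correct reference transcript and the app's output, divided by the number of words in the reference. The Python sketch below is a minimal illustration with made-up sentences; it is not how Otter.ai or any other app reports its accuracy.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one wrong word ("box") and one dropped word ("the")
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown box jumps over lazy dog"

wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}, word accuracy: {1 - wer:.1%}")
```

Here two errors across nine reference words give a word accuracy of about 78%, which is in the same ballpark as the multi-speaker transcripts described above.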

For creators who post a lot of videos or audio content online, Otter.ai can be a time saver for transcribing podcast interviews you've recorded on Zoom, Google Meet, or MS Teams.

On the other hand, while you can edit the transcript within the Otter.ai software, you can't edit the media the transcript came from. So, if you need a tool to do both, Otter.ai can't help you. Otter.ai also only works in English, so if you need to use another language, you'll need to look elsewhere.

Honorable mention: Just Press Record

If you're a creator with an iPhone or Apple Watch who finds yourself coming up with content ideas in the most random places, and you typically make voice notes with the Voice Memo mobile app to record your ideas, Just Press Record is a great on-the-go speech-to-text service. It's an honorable mention here because it has a one-time purchase fee from the app store ($/£4.99).

With the iPhone app, you can record pro-level audio (if you've got a plug-in microphone), transcribe every word with high accuracy (no limits), edit the transcript in-app, sync across iCloud, and organize your notes by folder.

You can also cut/trim the audio to better match an edited transcript, though you have to do this manually.

Another option often cited as a great choice is Nuance's Dragon Professional, along with its Dragon Anywhere mobile app. However, upon researching, I discovered that the app has a lot of poor reviews (it's sitting at 2.4/5 on the app store at the time of writing). So, I decided not to include it in this list.

Quick tip for the best speech-to-text results

No matter which type of speech-to-text tool you use, to get the best results, you'll want to use a good-quality microphone so that the audio is as clear as possible.

If you still have trouble with inaccurate dictation or transcription, try speaking more clearly and making sure you don't have too much background noise.

Best free speech-to-text app FAQs

Is there a free app for voice-to-text transcription?

Yes. There are several free voice-to-text transcription apps available. Descript is one of the best options for creators. However, many people can use their device's onboard dictation solution with a note-taking app.

What is the best AI speech-to-text tool?

Descript is the best transcription option for creators who want to use speech-to-text alongside media editing — editing the transcript also edits the media.

On the other hand, if you don't need to edit media, Otter.ai is another great option for transcribing personal meetings and internal interviews.

What are the benefits of using a speech-to-text app?

  • Saves time. People often speak much faster than they can type, so a speech-to-text tool can help you get words onto a page more quickly.
  • Saves money. Many speech-to-text apps are reasonably accurate and free, which saves you from needing to pay for professional transcriptions (unless you really need human transcription services).
  • Greater accessibility. People with specific disabilities find it difficult, if not impossible, to type by hand, so speech-to-text is a critical tool for those who need it.


Best text-to-speech software of 2024

Boosting accessibility and productivity

  • Best overall
  • Best realism
  • Best for developers
  • Best for podcasting
  • How we test

The best text-to-speech software makes it simple and easy to convert text to voice for accessibility or for productivity applications.

Woman on a Mac and using earbuds


Finding the best text-to-speech software is key for anyone looking to transform written text into spoken words, whether for accessibility purposes, productivity enhancement, or creative applications like voice-overs in videos. 

Text-to-speech (TTS) technology relies on sophisticated algorithms to model natural language to bring written words to life, making it easier to catch typos or nuances in written content when it's read aloud. So, unlike the best speech-to-text apps and best dictation software, which focus on converting spoken words into text, TTS software specializes in the reverse process: turning text documents into audio. This technology is not only efficient but also comes with a variety of tools and features. For those creating content for platforms like YouTube, the ability to download audio files is a particularly valuable feature of the best text-to-speech software.

While some standard office programs like Microsoft Word and Google Docs offer basic TTS tools, they often lack the comprehensive functionalities found in dedicated TTS software. These basic tools may provide decent accuracy and basic options like different accents and languages, but they fall short in delivering the full spectrum of capabilities available in specialized TTS software.

To help you find the best text-to-speech software for your specific needs, TechRadar Pro has rigorously tested various software options, evaluating them based on user experience, performance, output quality, and pricing. This includes examining the best free text-to-speech software as well, since many free options are perfect for most users. We've brought together our picks below to help you choose the most suitable tool for your specific needs, whether for personal use, professional projects, or accessibility requirements.

The best text-to-speech software of 2024 in full:


Below you'll find full write-ups for each of the entries on our best text-to-speech software list. We've tested each one extensively, so you can be sure that our recommendations can be trusted.

The best text-to-speech software overall

NaturalReader website screenshot

1. NaturalReader


If you’re looking for a cloud-based speech synthesis application, you should definitely check out NaturalReader. Aimed more at personal use, the solution allows you to convert written text such as Word and PDF documents, ebooks and web pages into human-like speech.  

Because the software is underpinned by cloud technology, you’re able to access it from wherever you go via a smartphone, tablet or computer. And just like Capti Voice, you can upload documents from cloud storage lockers such as Google Drive, Dropbox and OneDrive.  

Currently, you can access 56 natural-sounding voices in nine different languages, including American English, British English, French, Spanish, German, Swedish, Italian, Portuguese and Dutch. The software supports PDF, TXT, DOC(X), ODT, PNG, JPG, plus non-DRM EPUB files and much more, along with MP3 audio streams. 

There are three different products: online, software, and commercial. Both the online and software products have a free tier.

Read our full NaturalReader review.


The best text-to-speech software for realistic voices

Murf website screenshot

2. Murf

Specializing in voice synthesis technology, Murf uses AI to generate realistic voiceovers for a range of uses, from e-learning to corporate presentations. 

Murf comes with a comprehensive suite of AI tools that are easy to use and straightforward to locate and access. There's even a Voice Changer feature that lets you record something and have it transformed into an AI-generated voice, which is perfect if you don't think you have the right tone or accent for a piece of audio content but would rather not enlist the help of a voice actor. Other features include Voice Editing, Time Syncing, and a Grammar Assistant.

The solution comes with three pricing plans to choose from: Basic, Pro and Enterprise. The last of these may be pricey, but it comes with added collaboration and account management features that larger companies may need. The Basic plan starts at around $19 / £17 / AU$28 per month, but if you set up a yearly plan that drops to around $13 / £12 / AU$20 per month. You can also try the service out for free for up to 10 minutes, without downloads.

The best text-to-speech software for developers

Amazon Polly website screenshot

3. Amazon Polly

Alexa isn’t the only artificial intelligence tool created by tech giant Amazon as it also offers an intelligent text-to-speech system called Amazon Polly. Employing advanced deep learning techniques, the software turns text into lifelike speech. Developers can use the software to create speech-enabled products and apps. 

It sports an API that lets you easily integrate speech synthesis capabilities into ebooks, articles and other media. What’s great is that Polly is so easy to use. To get text converted into speech, you just have to send it through the API, and it’ll send an audio stream straight back to your application. 
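As a rough illustration of the send-text, get-audio-back flow described above, here is a minimal sketch using boto3, the AWS SDK for Python. The article itself does not name an SDK, so treat that choice as an assumption. The helper names `build_polly_request` and `synthesize_to_file`, the `Joanna` voice, and the output path are illustrative, and the second function assumes boto3 is installed and AWS credentials and a region are already configured.

```python
# Formats Polly can return as audio; these match the MP3, Vorbis and PCM
# options mentioned in this review.
AUDIO_FORMATS = {"mp3", "ogg_vorbis", "pcm"}


def build_polly_request(text, voice_id="Joanna", output_format="mp3"):
    """Assemble the keyword arguments for Polly's SynthesizeSpeech call.

    `voice_id` and `output_format` defaults are illustrative choices.
    """
    if output_format not in AUDIO_FORMATS:
        raise ValueError(f"unsupported audio format: {output_format!r}")
    return {"Text": text, "VoiceId": voice_id, "OutputFormat": output_format}


def synthesize_to_file(text, path, **options):
    """Send text to Polly and write the returned audio stream to `path`.

    Needs network access plus configured AWS credentials, so it is not
    called anywhere in this sketch.
    """
    import boto3  # third-party AWS SDK, imported lazily

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_polly_request(text, **options))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


# Example (hypothetical file name):
# synthesize_to_file("Hello from Polly.", "hello.mp3")
```

Keeping the request-building step separate from the network call makes it easy to check your parameters before you spend any of your character allowance.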

You can also store audio streams as MP3, Vorbis and PCM file formats, and there’s support for a range of international languages and dialects. These include British English, American English, Australian English, French, German, Italian, Spanish, Dutch, Danish and Russian. 

Polly is available as an API on its own, as well as a feature of the AWS Management Console and command-line interface. In terms of pricing, you’re charged based on the number of text characters you convert into speech. This is charged at approximately $16 per 1 million characters, but there is a free tier for the first year.
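To get a feel for what character-based billing means in practice, here is a small back-of-the-envelope sketch. It assumes the approximate rate of $16 per 1 million characters quoted above and ignores the free tier; the function name `estimate_polly_cost` and the sample text are made up for illustration.

```python
def estimate_polly_cost(num_characters, usd_per_million=16.0):
    """Estimate the cost in USD of converting `num_characters` to speech.

    The default rate is the approximate figure quoted in this review,
    not an official price list.
    """
    if num_characters < 0:
        raise ValueError("num_characters must be non-negative")
    return num_characters / 1_000_000 * usd_per_million


# A stand-in for a 10,000-character article.
article = "word " * 2000
print(f"{len(article)} characters -> ${estimate_polly_cost(len(article)):.2f}")
```

At this rate, a quarter of a million characters works out to about $4, so occasional use costs very little while high-volume apps should budget per million characters.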

The best text-to-speech software for podcasting

Play.ht website screenshot

4. Play.ht

In terms of its library of voice options, it's hard to beat Play.ht as one of the best text-to-speech software tools. With almost 600 AI-generated voices available in over 60 languages, it's likely you'll be able to find a voice to suit your needs. 

Although the platform isn't the easiest to use, there is a detailed video tutorial to help users if they encounter any difficulties. All the usual features are available, including Voice Generation and Audio Analytics. 

In terms of pricing, Play.ht comes with four plans: Personal, Professional, Growth, and Business. These range widely in price depending on whether you need things like commercial rights, and the plan you choose also affects the number of words you can generate each month.

The best text-to-speech software for Mac and iOS

Voice Dream Reader website screenshot

5. Voice Dream Reader

There are also plenty of great text-to-speech applications available for mobile devices, and Voice Dream Reader is an excellent example. It can convert documents, web articles and ebooks into natural-sounding speech. 

The app comes with 186 built-in voices across 30 languages, including English, Arabic, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Japanese and Korean. 

You can get the software to read a list of articles while you drive, work or exercise, and there are auto-scrolling, full-screen and distraction-free modes to help you focus. Voice Dream Reader can be used with cloud solutions like Dropbox, Google Drive, iCloud Drive, Pocket, Instapaper and Evernote. 

The best text-to-speech software: FAQs

What is the best text-to-speech software for YouTube?

If you're looking for the best text-to-speech software for YouTube videos or other social media platforms, you need a tool that lets you extract the audio file once your text document has been processed. Thankfully, that's most of them. So, the real trick is to select a TTS app that features a bountiful choice of natural-sounding voices that match the personality of your channel. 

What’s the difference between web TTS services and TTS software?

Web TTS services are hosted on a company or developer website. You’ll only be able to access the service for as long as the provider keeps it available and it isn’t facing an outage.

TTS software refers to downloadable desktop applications that typically won’t rely on connection to a server, meaning that so long as you preserve the installer, you should be able to use the software long after it stops being provided. 

Do I need a text-to-speech subscription?

Subscriptions are by far the most common pricing model for top text-to-speech software. By offering subscription models, companies and developers benefit from a more sustainable revenue stream than they do from simply offering a one-time purchase model. Subscription models are also attractive to text-to-speech software providers as they tend to be more effective at defeating piracy.

Free software options are very rarely absolutely free. In some cases, individual voices may be priced and sold individually once the application has been installed or an account has been created on the web service.

How can I incorporate text-to-speech as part of my business tech stack?

Some of the text-to-speech software that we’ve chosen come with business plans, offering features such as additional usage allowances and the ability to have a shared workspace for documents. Other than that, services such as Amazon Polly are available as an API for more direct integration with business workflows.

Small businesses may find consumer-level subscription plans for text-to-speech software to be adequate, but it’s worth mentioning that usually only business plans come with the universal right to use any files or audio created for commercial use.

How to choose the best text-to-speech software

When deciding which text-to-speech software is best for you, it depends on a number of factors and preferences. For example, whether you’re happy to join the ecosystem of big companies like Amazon in exchange for quality assurance, if you prefer realistic voices, and how much budget you’re playing with. It’s worth noting that the paid services we recommend, while reliable, are often subscription services, with software hosted via websites, rather than one-time purchase desktop apps. 

Also, remember that the latest versions of Microsoft Word and Google Docs feature basic text-to-speech as standard, as do most popular browsers. So, if you have access to that software and all you’re looking for is a quick fix, that may suit your needs well enough. 

How we test the best text-to-speech software

We test for various use cases, including suitability for use with accessibility issues, such as visual impairment, and for multi-tasking. Both of these require easy access and near instantaneous processing. Where possible, we look for integration across the entirety of an operating system, and for fair usage allowances across free and paid subscription models.

At a minimum, we expect an intuitive interface and intuitive software. We like bells and whistles such as realistic voices, but we also appreciate that there is a place for products that simply get the job done. Here, the question that we ask can be as simple as “does this piece of software do what it's expected to do when asked?”

Read more on how we test, rate, and review products on TechRadar.


John Loeffler

John (He/Him) is the Components Editor here at TechRadar and he is also a programmer, gamer, activist, and Brooklyn College alum currently living in Brooklyn, NY. 

Named by the CTA as a CES 2020 Media Trailblazer for his science and technology reporting, John specializes in all areas of computer science, including industry news, hardware reviews, PC gaming, as well as general science writing and the social impact of the tech industry.

You can find him online on Threads @johnloeffler.

Currently playing: Baldur's Gate 3 (just like everyone else).

  • Luke Hughes Staff Writer
  • Steve Clark B2B Editor - Creative & Hardware


The Best Dictation Software for Writers (to Use in 2023)

Table of Contents

  • Why Use Speech Recognition Software?
  • Dictation vs. Transcription
  • Why Use Dictation?
  • Why Use Transcription?
  • Do You Need Special Recording Equipment?
  • The Best Transcription Services
  • The 5 Best Dictation Software Options


A lot of Authors give up on their books before they even start writing.

I see it all the time. Authors sit down to write and end up staring at a blank page. They might get a few words down, but they hate what they’ve written, harshly judge themselves, and quit.

Or they get intimidated by the prospect of writing more and give up. They may come back, but if so, it’s with less and less enthusiasm, until they eventually just stop.

In order to break the pattern, you have to get out of your own head. And the best way to do that is to talk it out.

I’m serious. Who ever said that you have to write your book? Why not speak it?

Authors don’t need to be professional writers. You’re publishing a book because you have knowledge to share with the world.

If you’re more comfortable speaking than writing, there’s no shame in dictating your book.

Sure, at some point, you’ll have to put the words on a page and make them readable.

But for your first draft, you can stop focusing on being a perfect writer and instead focus on getting your ideas out in the world.

In this post, I’ll cover why dictation software is such a great tool, the difference between dictation and transcription, and the best options in each category.

Why Use Speech Recognition Software?

When Authors experience writer’s block, it’s not usually because they have bad ideas or because they’re unorganized. The number 1 cause of writer’s block is fear.

So, how do you get rid of that fear?

phone recording voice memo

The easiest solution is to stop staring at the screen and talk instead.

Many Authors can talk clearly and comfortably about their ideas when they aren’t put on the spot. Just think of how easy it is to sit down with colleagues over coffee or how excited you get explaining your work to a friend.

There’s a lot less pressure in those situations. It’s much easier than thinking, “I’m writing something that thousands of people are going to read and judge.”

When that thought is in your head, of course you’re going to freeze.

Your best bet is to ignore all those thoughts and really focus on your reader. Imagine you’re speaking to a specific person, maybe your ideal client or a close friend. What do they want to know? What can you help them with? What tone do you use when you talk to them?

When you keep your attention on the reader you’re trying to serve, it helps quiet your fear and anxiety. And when you speak, rather than write, it can help you keep a relaxed, confident, and personable tone.

Readers relate to Authors’ authentic voices far more than overly-crafted, hyper-intellectual writing styles.

Speaking will also help you finish your first draft faster because it helps you resist the desire to edit as you go.

We always tell Scribe Authors that their first draft should be a “vomit draft.”

You should spew words onto a page without worrying whether they’re good, how they can be better, or whether you’ve said the right thing.

Your vomit draft can be—and possibly will be—absolute garbage.

But that’s okay. As the Author of 4 New York Times bestsellers, I can tell you: first drafts are often garbage. In the end, they still go on to become highly successful books.

It’s a lot easier to edit words that are already on the page than to agonize over every single thing you’re writing.

That’s why speech recognition software is the perfect workaround. When you talk, you don’t have time to agonize. Your ideas can flow without your brain working overtime on grammar, clarity, and all those other things we expect from the written word.

Of course, your spoken words won’t be the same as a book. You’ll have to edit out all the “uh”s and the places you went on tangents. You might even have to overhaul the organization of the sections.

But remember, the goal of a first draft is never perfection. The goal is to have a text you can work with.

What’s the Difference between Dictation & Transcription?

If you know you want to talk out your first draft, you have 2 options:

  • Use dictation software
  • Use a transcription service

1. Dictation Software

With dictation software, you speak, and the software transcribes your words in real-time.

For example, when you give Siri a voice command on your iPhone, the words pop up across the top of the screen. That’s how dictation software works.

Although, I should point out that we aren’t really talking about Apple’s Siri, Amazon’s Alexa, or Microsoft’s Cortana here. Those are AI virtual assistants that use voice recognition software, but they aren’t true dictation apps. In other words, they’re good at transcribing a shopping list, but they won’t help you write a book.

Some dictation software comes as a standalone app you use exclusively for converting speech to text. Other dictation software comes embedded in a word processor, like Apple’s built-in dictation in Pages or Google Docs’ built-in voice tool.

If you’re a fast speaker, most live dictation software won’t be able to keep up with you. You have to speak slowly and clearly for it to work.

For many people, trying to use dictation software slows them down, which can interrupt their train of thought.

2. Transcription Services

In contrast, transcription services convert your words to text after the fact. You record yourself talking and send the completed audio files to the service for transcription.

Some transcription services use human transcription, which is exactly what it sounds like: a human listens to your audio and transcribes the content. This kind of transcription is typically slower and more expensive, but it’s also more accurate.

Other transcription services rely on computer transcription. Using artificial intelligence and advanced voice recognition technology, these services can turn around a full transcript in a matter of minutes. You’ll find some mistakes, but unless you have a strong accent or there’s a lot of background noise in the recording, they’re fairly accurate.

Why Use Dictation?

Dictation is the way to go if you want to sit in front of your computer and type, but maybe just type a little faster. It’s especially useful for people who want to switch between talking and typing.

It’s probably not your best option if you want to speak your entire first draft. Voice recognition software still requires you to speak slowly and clearly. You might lose your train of thought if you’re constantly stopping to let the software catch up.

With dictation software, you may also be tempted to stop and read what the software is typing. That’s an easy way to get sucked into editing, which you should never do when you’re writing your first draft.

I recommend using dictation as a way to shake up your writing process, not to replace typing entirely.

Why Use Transcription?

If you want to get your vomit draft out by speaking at your own natural pace, we recommend making actual recordings and sending them to a transcription service.

Transcription is also preferable if you’re being interviewed or if you have a co-author because it can recognize multiple voices. It’s also a lot more flexible in terms of location. People can interview you over Zoom or in any other conferencing system, and as long as you can record the conversation, it will work.

Transcription is also relatively cheap and works for you while you do other things. You can record your content at your own pace and choose when you want a computer (or person) to transcribe it. You could record your whole book before you send the audio files for transcription, or you could do a chapter at a time.

Transcription may not work well for you if you are a visual person who needs to see text in order to stay on track. Without a clear outline in front of you, sometimes the temptation to verbally wander or jump around can be too great, and you’ll waste a lot of time sorting through the transcripts later.

Do You Need Any Special Recording Equipment?

No. Most people don’t need anything special.

Whether you’re using transcription or dictation, don’t waste your money on fancy audio equipment. The microphone that comes with your computer or smartphone is more than adequate.

Some people find headsets useful because they can move around while they’re speaking. But you don’t want to multitask too much. If you’re trying to dictate your book while you’re cooking, you’ll be distracted, and the ambient noise could mess up the recording.

The Best Transcription Services

Scribe recommends 2 transcription services:

1. Temi

Temi works well for automated transcription (i.e., transcribed by a computer, not a human).

They charge $0.25 per audio minute, and their turnaround only takes a few minutes.

Their transcripts are easy to read with clear timestamps and labels for different speakers. They also provide an online editing tool that allows you to easily clean up your transcripts. For example, you can easily search for all the “um”s and remove them with the touch of a button.

You can also listen to your audio alongside the transcript, and you can adjust the playback speed. This is very useful if you’re a fast talker.

If you prefer to work on the go, Temi also offers a mobile app.

2. Rev

Rev offers many of the same features as Temi for automated transcripts. They call this option “Rough Draft” transcription, and it also costs $0.25 per audio minute. The average turnaround time for a transcript is 5 minutes.

What sets Rev apart is that they also offer human transcription. This service costs $1.25 per minute, and Rev guarantees 99% accuracy. The average turnaround time is 12 hours.

Human transcription is a great option if your audio file has a lot of background noise. It’s also great if you have a strong accent that automatic transcription software has trouble recognizing.
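To see how the per-minute prices above add up over a real recording, here is a quick sketch. It uses only the rates quoted in this post ($0.25 per audio minute for automated transcripts, $1.25 per minute for Rev’s human transcription); the names `RATES_USD_PER_MINUTE` and `transcription_cost`, and the 90-minute recording, are hypothetical.

```python
# Per-minute rates quoted in this post, in US dollars.
RATES_USD_PER_MINUTE = {
    "automated": 0.25,  # Temi, or Rev's "Rough Draft" option
    "human": 1.25,      # Rev's human transcription
}


def transcription_cost(audio_minutes, service):
    """Return the cost in USD of transcribing `audio_minutes` of audio."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return audio_minutes * RATES_USD_PER_MINUTE[service]


# Compare both options for a hypothetical 90-minute chapter recording.
for service in RATES_USD_PER_MINUTE:
    print(f"{service}: ${transcription_cost(90, service):.2f}")
```

For that 90-minute recording, automated transcription comes to $22.50 and human transcription to $112.50, so it can make sense to reserve the human option for noisy or heavily accented audio.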

The 5 Best Dictation Software Options

1. Google Docs Voice Typing

This is currently the best voice typing software, by far. It’s driven by Google’s AI software, which applies Google’s deep learning algorithms to accurately recognize speech. It also supports 125 different languages.

One of the best aspects of Voice Typing is that you don’t need to use a specific operating system or install any extra software to use it. You just need the Chrome web browser and a Google account.

It’s also easy to use. Just log into your account and open a Google Doc. Go to “Tools” and select “Voice Typing.”

How to sign up for Google Voice Typing

A microphone icon will pop up on your screen.

Microphone icon pops up on the Voice Typing screen

Click it, and it will turn red. That’s when you can start dictating.

Red mic pops up and you can start dictating in Voice Typing

Click the microphone again to stop the dictation.

Voice Typing is highly accurate, with the typical caveats that you have to speak clearly and at a relatively slow pace.

It’s free, and because it’s embedded in the Docs software, it’s easy to integrate into your pre-existing workflow. The only potential downside is that you need a high-quality internet connection for Voice Typing, so you won’t be able to use it offline.

2. Apple Dictation

Apple Dictation is voice dictation software that’s built into macOS and iOS. It comes preloaded with every Mac, and it works great with Apple software.

If you’re on an iPhone or iPad, you can access Apple Dictation by pressing the microphone icon on the keyboard. Many people use this feature to dictate texts, but it also works in Pages for iPhone. It can be a useful option for taking notes or dictating content while you’re away from your desktop.

If you’re on a laptop or desktop, you can enable dictation by going to System Preferences > Keyboard.

Apple system preferences screen

Apple Dictation typically requires an internet connection, but you can select a feature in Settings called “Enhanced Dictation” that allows you to continuously dictate text when you’re offline.

Apple Dictation options (Under Keyboard)

Apple Dictation is great because it’s free, it works well with Apple software across multiple devices, and it generates fairly accurate text.

It’s not quite as high-powered as some “professional” grade dictation programs, but it would work well for most Authors who already own Apple products.

3. Windows Speech Recognition

The current Windows operating system comes with a built-in voice dictation system. You can train the system to recognize your voice, which means that the more you use it, the more accurate it becomes.

Unfortunately, that training can take a long time, so you’ll have to live with some inaccuracies until the system is calibrated.

On Windows 10, you can access dictation by pressing the Windows logo key + H. You can turn the microphone off by pressing Windows logo key + H again or by resuming typing.

Windows Speech Recognition is a good option if you don’t own a Mac or don’t use Google Docs, but overall, I’d still recommend one of the other options.

4. Otter.ai

Otter allows you to “live transcribe” or create real-time streaming transcripts with synced audio, text, and images. You can record conversations on your phone or web browser, or you can import audio files from other services. You can also integrate Otter with Zoom.

Otter is powered by Ambient Voice Intelligence, which means it’s always learning. You can train Otter to recognize specific voices or learn certain terminology. It’s fast, accurate, and user-friendly.

Otter is based on a subscription plan with basic, premium, and team options. I’ll only mention the basic and premium plans since most Authors won’t need the team features.

The free basic plan allows 600 minutes of transcription per month, which should be plenty—but the maximum length of each file is only 40 minutes. You also can’t import audio and video, and you can only export your transcripts as txt files, not pdf or docx files.

The premium plan is $8.33 per user per month, and it grants you access to a whopping 6,000 monthly minutes, with a max speech length of 4 hours. More importantly, you can import recordings from other apps and export your files in multiple formats (which will make your writing process much smoother).

5. Dragon

Dragon is one of the most commonly recommended programs for standalone dictation software. It has high-quality voice recognition, but that high quality comes with a hefty price tag. The latest version, Dragon Home 15, costs $150, but it’s not compatible with Apple’s operating system. Mac users have to upgrade to the Professional version ($300).

With all the solid free options available—several of which are better than Dragon—I don’t recommend buying Dragon.

The top free Speech-to-Text APIs, AI Models, and Open Source Engines

This post compares the best free Speech-to-Text APIs and AI models on the market today, including APIs that have a free tier. We’ll also look at several free open-source Speech-to-Text engines and explore why you might choose an API vs. an open-source library, or vice versa.

Growth at AssemblyAI

Choosing the best Speech-to-Text API, AI model, or open-source engine to build with can be challenging. You need to compare accuracy, model design, features, support options, documentation, security, and more.
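
One common way to compare accuracy across candidates is word error rate (WER): the word-level edit distance between a trusted reference transcript and the engine's output, divided by the number of reference words. The post does not name a metric, so treat WER here as an assumption about how the comparison might be run; the function name and the simple lowercase-and-split normalization are illustrative choices, not part of any product's tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Running the same audio through each candidate and scoring the outputs against one human-checked transcript gives a like-for-like number; lower is better.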

This post examines the best free Speech-to-Text APIs and AI models on the market today, including ones that have a free tier, to help you make an informed decision. We’ll also look at several free open-source Speech-to-Text engines and explore why you might choose an API or AI model vs. an open-source library, or vice versa.

Free Speech-to-Text APIs and AI Models

APIs and AI models are more accurate, easier to integrate, and come with more out-of-the-box features than open-source options. However, large-scale use of APIs and AI models can come with a higher cost than open-source options.

If you’re looking to use an API or AI model for a small project or a trial run, many of today’s Speech-to-Text APIs and AI models have a free tier. This means that the API or model is free for anyone to use up to a certain volume per day, per month, or per year.

Let’s compare three of the most popular Speech-to-Text APIs and AI models with a free tier: AssemblyAI, Google, and AWS Transcribe.

AssemblyAI is an API platform that offers AI models that accurately transcribe and understand speech, and enable users to extract insights from voice data. AssemblyAI offers cutting-edge AI models such as Speaker Diarization, Topic Detection, Entity Detection, Automated Punctuation and Casing, Content Moderation, Sentiment Analysis, Text Summarization, and more. These AI models help users get more out of voice data, with continuous improvements being made to accuracy.

AssemblyAI also offers LeMUR, which enables users to leverage Large Language Models (LLMs) to pull valuable information from their voice data, including answering questions, generating summaries and action items, and more.

The company offers up to 100 free transcription hours for audio files or video streams, with a concurrency limit of 5, before transitioning to an affordable paid tier.
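
A concurrency limit means only so many jobs can be in flight at once, so a batch script may want to cap its own parallelism on the client side. The sketch below does that with a thread pool sized to the limit of 5 quoted above. `fake_transcribe` is a hypothetical stand-in for a real API call, and the `example.com` URLs are placeholders.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_JOBS = 5  # the free-tier concurrency limit quoted above

_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of simultaneous calls observed


def fake_transcribe(audio_url: str) -> str:
    """Hypothetical stand-in for a real transcription request."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # simulate waiting on the API
    with _lock:
        _active -= 1
    return f"transcript of {audio_url}"


def transcribe_all(audio_urls):
    # The pool size caps how many requests run at the same time.
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_JOBS) as pool:
        return list(pool.map(fake_transcribe, audio_urls))


results = transcribe_all([f"https://example.com/audio-{i}.mp3" for i in range(20)])
```

`pool.map` returns results in input order, so the transcripts line up with the URLs even though the calls overlap.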

Its high accuracy and diverse collection of AI models built by AI experts make AssemblyAI a sound option for developers looking for a free Speech-to-Text API. The API also supports virtually every audio and video file format out-of-the-box for easier transcription.

AssemblyAI has expanded the languages it supports to include English, Spanish, French, German, Japanese, Korean, and many more, with additional languages being released monthly. See the full list here.

AssemblyAI’s easy-to-use models also allow for quick set-up and transcription in any programming language. You can copy/paste code examples in your preferred language directly from the AssemblyAI Docs or use the AssemblyAI Python SDK or another one of its ready-to-use integrations.
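
As a rough illustration of what a request might look like without any SDK, here is a minimal sketch using only Python's standard library. It assumes the REST endpoint `https://api.assemblyai.com/v2/transcript`, an `authorization` header carrying the API key, and an `audio_url` field in the JSON body; check these against the current AssemblyAI Docs before relying on them. The key and audio URL are placeholders, and the request is built but not sent.

```python
import json
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"  # assumed base URL; verify in the docs


def build_transcript_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) a request that submits audio for transcription."""
    body = json.dumps({"audio_url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/transcript",
        data=body,
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


request = build_transcript_request("YOUR_API_KEY", "https://example.com/meeting.mp3")
# To submit the job, pass `request` to urllib.request.urlopen(), read the job
# id from the JSON response, then poll the job until its status is
# "completed" or "error".
```

Separating request construction from sending keeps the example testable offline and makes it easy to swap in the official SDK later.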

Pricing

  • Free to test in the AI playground, plus 100 free hours of asynchronous transcription with an API sign-up
  • Speech-to-Text – $0.37 per hour
  • Real-time Transcription – $0.47 per hour
  • Audio Intelligence – varies, $0.01 to $0.15 per hour
  • LeMUR – varies
  • Enterprise pricing is also available

See the full pricing list here.
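
To see how the listed rates translate into a bill, here is a small estimate using only the figures quoted above. It assumes the 100 free hours offset asynchronous usage only and that Audio Intelligence and LeMUR charges are excluded; the function name is hypothetical.

```python
from decimal import Decimal

FREE_HOURS = Decimal("100")      # free asynchronous hours quoted above
ASYNC_RATE = Decimal("0.37")     # USD per hour, Speech-to-Text
REALTIME_RATE = Decimal("0.47")  # USD per hour, Real-time Transcription


def estimate_cost(async_hours, realtime_hours=0) -> Decimal:
    """Estimate spend in USD; assumes free hours apply to async usage only."""
    billable_async = max(Decimal(str(async_hours)) - FREE_HOURS, Decimal("0"))
    total = billable_async * ASYNC_RATE
    total += Decimal(str(realtime_hours)) * REALTIME_RATE
    return total.quantize(Decimal("0.01"))
```

For example, `estimate_cost(250)` bills 150 hours at the asynchronous rate, while `estimate_cost(80)` stays within the free allowance.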

Pros

  • High accuracy
  • Breadth of AI models available, built by AI experts
  • Continuous model iteration and improvement
  • Developer-friendly documentation and SDKs
  • Enterprise-grade support and security

Cons

  • Models are not open-source

Google Speech-to-Text is a well-known speech transcription API. Google gives users 60 minutes of free transcription, with $300 in free credits for Google Cloud hosting.

Google only supports transcribing files already in a Google Cloud Bucket, so the free credits won’t get you very far. Google also requires you to sign up for a GCP account and project — whether you're using the free tier or paid.

With good accuracy and 125+ languages supported, Google is a decent choice if you’re willing to put in some initial work.
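
Because the audio is expected to live in a Cloud Storage bucket, a request typically points at a `gs://` URI rather than uploading bytes. The sketch below builds a JSON body in the shape used by the v1 REST `recognize` call (`config.languageCode` and `audio.uri`), as far as I understand that API. The helper name, bucket, and file are hypothetical, and authentication and sending the request are left out.

```python
def build_recognize_body(gcs_uri: str, language_code: str = "en-US") -> dict:
    """Build a JSON body for a recognize call on a Cloud Storage file."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError("expected a Cloud Storage URI such as gs://bucket/file.flac")
    return {
        "config": {"languageCode": language_code},
        "audio": {"uri": gcs_uri},
    }


# Hypothetical bucket and object names, for illustration only.
body = build_recognize_body("gs://my-example-bucket/interview.flac")
```

Validating the URI up front surfaces the "file must already be in a bucket" requirement before any network call is made.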

Pricing

  • 60 minutes of free transcription
  • $300 in free credits for Google Cloud hosting

Pros

  • Decent accuracy
  • Multi-language support

Cons

  • Only supports transcription of files in a Google Cloud Bucket
  • Difficult to get started
  • Lower accuracy than other similarly-priced APIs

AWS Transcribe

AWS Transcribe offers one hour free per month for the first 12 months of use.

Like Google, you must create an AWS account first if you don’t already have one. AWS also has lower accuracy compared to alternative APIs and only supports transcribing files already in an Amazon S3 bucket.
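
In practice, the S3 requirement shows up as a media URI in the job parameters. The sketch below assembles keyword arguments in the shape that boto3's `start_transcription_job` accepts (`TranscriptionJobName`, `LanguageCode`, `Media.MediaFileUri`), as far as I know. The helper name, job name, and bucket are hypothetical, and this sketch accepts only `s3://` URIs for simplicity.

```python
def build_transcription_job(job_name: str, s3_uri: str, language_code: str = "en-US") -> dict:
    """Assemble keyword arguments for starting a transcription job."""
    if not s3_uri.startswith("s3://"):
        raise ValueError("expected an S3 URI such as s3://bucket/file.mp3")
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": s3_uri},
    }


# Hypothetical job and bucket names, for illustration only.
params = build_transcription_job("demo-job", "s3://my-example-bucket/interview.mp3")
# With boto3 installed and AWS credentials configured, the job could be started with:
#   boto3.client("transcribe").start_transcription_job(**params)
```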

However, if you’re looking for a specific feature, like medical transcription, AWS has some options. Its Transcribe Medical API is a medical-focused ASR option that is available today.

Pricing

  • One hour free per month for the first 12 months of use
  • Tiered pricing, based on usage, ranges from $0.02400 down to $0.00780 per minute

Pros

  • Integrates into existing AWS ecosystem
  • Medical language transcription

Cons

  • Difficult to get started from scratch
  • Only supports transcribing files already in an Amazon S3 bucket
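
Tiered pricing means each block of usage is billed at its own rate, so the effective price falls as volume grows. The sketch below shows the arithmetic, assuming the quoted rates are per minute. Only the first and last rates come from the range above; the tier sizes and the middle rate are illustrative assumptions, so check the AWS pricing page for the real schedule.

```python
from decimal import Decimal

# (minutes in tier, USD per minute); None means "all remaining minutes".
# Tier sizes and the middle rate are illustrative assumptions.
TIERS = [
    (250_000, Decimal("0.02400")),
    (750_000, Decimal("0.01500")),
    (None, Decimal("0.00780")),
]
FREE_MINUTES_PER_MONTH = 60  # one free hour per month during the first 12 months


def monthly_cost(minutes: int, in_free_period: bool = True) -> Decimal:
    """Bill each block of minutes at its tier's rate and sum the result."""
    free = FREE_MINUTES_PER_MONTH if in_free_period else 0
    remaining = max(minutes - free, 0)
    total = Decimal("0")
    for size, rate in TIERS:
        if remaining <= 0:
            break
        used = remaining if size is None else min(remaining, size)
        total += used * rate
        remaining -= used
    return total.quantize(Decimal("0.01"))
```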

Open-Source Speech Transcription engines

As an alternative to APIs and AI models, open-source Speech-to-Text libraries are completely free, with no limits on use. Some developers also see data security as a plus, since your data doesn’t have to be sent to a third party or the cloud.

There is work involved with open-source engines, so you must be comfortable putting in a lot of time and effort to get the results you want, especially if you are trying to use these libraries at scale. Open-source Speech-to-Text engines are typically less accurate than the APIs discussed above.

If you want to go the open-source route, here are some options worth exploring:

DeepSpeech is an open-source embedded Speech-to-Text engine designed to run in real-time on a range of devices, from high-powered GPUs to a Raspberry Pi 4. The DeepSpeech library uses an end-to-end model architecture pioneered by Baidu.

DeepSpeech also has decent out-of-the-box accuracy for an open-source option and is easy to fine-tune and train on your own data.
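
Much of the integration work with an engine like this is preparing audio in the format the model expects. As I understand it, the released English DeepSpeech models expect 16 kHz, mono, 16-bit PCM audio; treat that as an assumption and confirm it for the model you use. The helper below (a hypothetical name) checks a WAV file against those expectations using only the standard library and returns the raw samples.

```python
import array
import sys
import wave


def load_pcm16_mono(path: str, expected_rate: int = 16000) -> array.array:
    """Read a WAV file and return its samples as signed 16-bit integers."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != expected_rate:
            raise ValueError(f"expected {expected_rate} Hz audio")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples
```

With the `deepspeech` package installed, the samples could then be converted to a NumPy `int16` array and passed to a loaded model's `stt` method.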

Pros

  • Easy to customize
  • Can use it to train your own model
  • Can be used on a wide range of devices

Cons

  • Lack of support
  • No model improvement outside of individual custom training
  • Heavy lift to integrate into production-ready applications

Kaldi is a speech recognition toolkit that has been widely popular in the research community for many years.

Like DeepSpeech, Kaldi has good out-of-the-box accuracy and supports the ability to train your own models. It’s also been thoroughly tested—a lot of companies currently use Kaldi in production and have used it for a while—making more developers confident in its application.

Pros

  • Can use it to train your own models
  • Active user base

Cons

  • Can be complex and expensive to use
  • Uses a command-line interface

Flashlight ASR (formerly Wav2Letter)

Flashlight ASR, formerly Wav2Letter, is Facebook AI Research’s Automatic Speech Recognition (ASR) Toolkit. It is written in C++ and uses the ArrayFire tensor library.

Like DeepSpeech, Flashlight ASR is decently accurate for an open-source library and is easy to work with on a small project.

Pros

  • Customizable
  • Easier to modify than other open-source options
  • Processing speed

Cons

  • Very complex to use
  • No pre-trained libraries available
  • Need to continuously source datasets for training and model updates, which can be difficult and costly

SpeechBrain

SpeechBrain is a PyTorch-based transcription toolkit. The platform releases open implementations of popular research works and offers a tight integration with Hugging Face for easy access.

Overall, the platform is well-defined and constantly updated, making it a straightforward tool for training and fine-tuning.

Pros

  • Integration with PyTorch and Hugging Face
  • Pre-trained models are available
  • Supports a variety of tasks

Cons

  • Even its pre-trained models take a lot of customization to make them usable
  • Lack of extensive docs makes it not as user-friendly, except for those with extensive experience

Coqui is another deep learning toolkit for Speech-to-Text transcription. Coqui has been used for projects in over twenty languages and also offers a variety of essential inference and productionization features.

The platform also releases custom-trained models and has bindings for various programming languages for easier deployment.

Pros

  • Generates confidence scores for transcripts
  • Large support community

Cons

  • No longer updated and maintained by Coqui

Whisper by OpenAI, released in September 2022, is comparable to other current state-of-the-art open-source options.

Whisper can be used either in Python or from the command line and can also be used for multilingual translation.

Whisper has five different models of varying sizes and capabilities, depending on the use case, including v3 released in November 2023.
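
Choosing among the model sizes is mostly a trade-off between accuracy and available memory. The sketch below picks the largest size that fits a memory budget, using approximate VRAM figures as I recall them from the Whisper README; treat the numbers as rough guidance. `pick_model` and `transcribe` are hypothetical helper names, and the `whisper` import is deferred so the block loads even where the `openai-whisper` package is not installed.

```python
# Approximate VRAM needs in GB per model size (rough guidance only).
MODEL_VRAM_GB = [("tiny", 1), ("base", 1), ("small", 2), ("medium", 5), ("large", 10)]


def pick_model(available_vram_gb: float) -> str:
    """Return the largest model whose approximate VRAM need fits the budget."""
    chosen = None
    for name, needed in MODEL_VRAM_GB:
        if needed <= available_vram_gb:
            chosen = name
    if chosen is None:
        raise ValueError("not enough memory for even the smallest model")
    return chosen


def transcribe(path: str, available_vram_gb: float) -> str:
    """Transcribe a local audio file with the largest model that fits."""
    import whisper  # requires the openai-whisper package; imported lazily

    model = whisper.load_model(pick_model(available_vram_gb))
    return model.transcribe(path)["text"]
```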

However, to run Whisper at a large scale you’ll need substantial computing power and an in-house team to maintain, scale, update, and monitor the model, which makes the total cost of ownership higher than other options.

As of March 2023, Whisper is also now available via API. On-demand pricing starts at $0.006/minute.
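
Since other prices in this post are quoted per hour, it can help to convert the per-minute rate before comparing. A minimal sketch using only the on-demand figure quoted above; the function name is hypothetical and volume discounts or other charges are ignored.

```python
from decimal import Decimal

WHISPER_API_PER_MINUTE = Decimal("0.006")  # USD, on-demand price quoted above


def whisper_api_cost(audio_minutes) -> Decimal:
    """Cost in USD for a given number of audio minutes at the on-demand rate."""
    cost = Decimal(str(audio_minutes)) * WHISPER_API_PER_MINUTE
    return cost.quantize(Decimal("0.01"))


hourly_rate = whisper_api_cost(60)  # per-hour equivalent of the per-minute price
```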

Pros

  • Multilingual transcription
  • Can be used in Python
  • Five models are available, each with different sizes and capabilities

Cons

  • Need an in-house research team to maintain and update
  • Costly to run

Which free Speech-to-Text API, AI model, or Open Source engine is right for your project?

The best free Speech-to-Text API, AI model, or open-source engine will depend on your project. Do you want something that is easy to use, has high accuracy, and has additional out-of-the-box features? If so, one of these APIs might be right for you:

  • AssemblyAI
  • Google Speech-to-Text
  • AWS Transcribe

Alternatively, you might want a completely free option with no data limits—if you don’t mind the extra work it will take to tailor a toolkit to your needs. If so, you might choose one of these open-source libraries:

  • DeepSpeech
  • Kaldi
  • Flashlight ASR
  • SpeechBrain
  • Coqui
  • Whisper

Whichever you choose, make sure the product can meet your project’s needs now and as it grows.

