Most dictation tools can hear you just fine. They transcribe words. What they don't do is understand where you're writing.
You dictate the same sentence into an email, a Slack message, and a piece of code, and you get the same flat, punctuation-free block of text in all three. You end up cleaning it up by hand, which defeats the point.
Aqua Voice takes a different approach: it looks at what's on your screen before it writes.
If you're in an email to a client, it adds punctuation and a professional tone. If you're in Slack, it keeps things short and casual. If your cursor is in a code editor and you dictate a function name, it formats it as code — camelCase and all. Same voice, different output, because the app reads context instead of just sound.
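To make the idea concrete, here is a toy sketch in Python of sending the same transcript through different formatting rules depending on the app you're in. It is only an illustration. The function names (`format_dictation`, `to_camel_case`) and the context labels (`email`, `slack`, `code_editor`) are made up for this example, and nothing here reflects how Aqua Voice actually works, which presumably relies on a model rather than hand-written rules.

```python
import re


def to_camel_case(phrase: str) -> str:
    """Turn spoken words like 'get user name' into 'getUserName'."""
    words = re.findall(r"[A-Za-z0-9]+", phrase.lower())
    if not words:
        return ""
    # First word stays lowercase, every later word gets a capital letter.
    return words[0] + "".join(w.capitalize() for w in words[1:])


def format_dictation(transcript: str, app_context: str) -> str:
    """Format one transcript differently per (hypothetical) app context."""
    text = transcript.strip()
    if not text:
        return ""
    if app_context == "code_editor":
        # Treat the phrase as an identifier.
        return to_camel_case(text)
    if app_context == "email":
        # Capitalise the first letter and make sure the sentence is closed.
        sentence = text[0].upper() + text[1:]
        if sentence.endswith((".", "?", "!")):
            return sentence
        return sentence + "."
    if app_context == "slack":
        # Keep it casual: lowercase, no forced punctuation.
        return text.lower()
    # Unknown context: return the raw transcript unchanged.
    return text


print(format_dictation("get user name", "code_editor"))    # getUserName
print(format_dictation("thanks for the update", "email"))  # Thanks for the update.
print(format_dictation("Thanks for the update", "slack"))  # thanks for the update
```

The point of the sketch is the shape of the idea, not the rules themselves: the spoken words stay the same, and only the context argument changes what comes out.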
It runs on Mac, Windows, and iPhone, and works in any app without setup. It supports 49 languages, learns your custom vocabulary (brand names, acronyms, client jargon), and keeps transcripts private: they aren't stored on Aqua Voice's servers.
The underlying model is Avalon, which Aqua Voice describes as its own in-house transcription model. In practice, that means fewer odd transcription errors on names, numbers, and technical terms than typical dictation tools produce.
Aqua Voice claims it is 5x faster than typing and twice as accurate. Even if the real speedup is half that, around 2.5x, it's still a big shift in how you get ideas out of your head.
If a chunk of your week is writing — emails, briefs, Slack replies, content — speaking instead of typing is one of the few genuine time unlocks left. The reason most people drop dictation after a day is that they spend as much time fixing the output as they saved. Context-aware dictation is what closes that gap.
The free tier gives you 1,000 words a month, which is enough to feel whether it fits how you work. Pro is $8/month for unlimited words. Using this referral link (code DP-FH8V) gets you a bonus on signup.
Dictation — Speaking instead of typing, with software converting your voice into written text. Every phone has it; most are clumsy.
Context-aware — Software that adjusts its behaviour based on what else is happening. Here: what app you're in, what's on screen, who you seem to be writing to.
Transcription model — The AI doing the listening. Better models mishear fewer names, numbers, and technical words.
A question worth sitting with: how much of your writing week is thinking, and how much is just the mechanical act of getting words onto a screen? If the second part is bigger than you'd like, that's the bit a tool like this quietly removes.