
An hour of voice, handled in one breath

March 31, 2026 · via github · @microsoft
AI · open-source · voice · transcription · tools

What it is

Microsoft just open-sourced something called VibeVoice — a set of voice AI models that do two things: turn speech into text (transcription), and turn text into speech (voice synthesis). Both are free to use, and the code is sitting publicly on GitHub.

What caught my attention is the scale. Most transcription tools get nervous around long recordings. They chop your audio into small pieces, process each chunk separately, and hope the joins don't show. VibeVoice handles up to 60 minutes in a single pass, as one continuous read. It tracks who is speaking and when, and it can be taught to recognise specific words you care about.
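To make the chunking point concrete, here is a minimal Python sketch of how a chunk-based tool might split an hour of audio. The 30-second chunk length and the `chunk_boundaries` helper are assumptions made for illustration. They are not VibeVoice settings or part of its API.

```python
def chunk_boundaries(total_seconds: int, chunk_seconds: int) -> list[tuple[int, int]]:
    """Return (start, end) pairs, in seconds, that cover the whole recording."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    return [
        (start, min(start + chunk_seconds, total_seconds))
        for start in range(0, total_seconds, chunk_seconds)
    ]


# Hypothetical numbers: a 60-minute recording cut into 30-second chunks.
chunks = chunk_boundaries(total_seconds=60 * 60, chunk_seconds=30)

# Every boundary between two neighbouring chunks is a "join" where
# a sentence or a speaker turn can be cut in half.
joins = len(chunks) - 1

print(len(chunks), "chunks,", joins, "joins")
```

Under those assumed numbers, one hour becomes 120 pieces with 119 joins to hide. A single continuous pass has none.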

The voice synthesis side is equally generous: up to 90 minutes of audio with four distinct voices in one go. That's a full podcast episode, generated.

Why it matters for your business

If you've ever paid for a transcription service, auto-subtitles, or a voiceover tool, this is the kind of thing that sits underneath those products. Except now it's free and open.

For anyone building a meeting assistant, an audio archive, a training library with narration, or even just trying to make their content accessible — this is worth keeping an eye on. It's already being used by real products in the community.

Words worth knowing

Transcription (ASR): Automatic speech recognition — software that listens to audio and writes down what it hears.

TTS (Text-to-Speech): The reverse — software that reads written text aloud in a human-sounding voice.

Speaker diarization: A fancy word for 'who said what.' The model labels each speaker separately, so you know it was Maria talking from minute 3 to minute 7, then Carlos.

Open-source: The recipe is public. Anyone can use it, inspect it, or build a product on top of it — no licence fee.
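For whoever builds things for you, the 'who said what' idea from the diarization entry can be pictured as a list of labelled time segments. This is a minimal Python sketch: the `Segment` class, the `talk_time` helper, and every timestamp after Maria's minute 3 to minute 7 are made up for illustration. It does not show VibeVoice's actual output format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str      # label assigned by the diarization step
    start_min: float  # where the turn starts, in minutes
    end_min: float    # where the turn ends, in minutes
    text: str         # transcript of the turn (placeholder here)


def talk_time(segments: list[Segment]) -> dict[str, float]:
    """Add up how many minutes each speaker talked."""
    totals: dict[str, float] = {}
    for seg in segments:
        totals[seg.speaker] = totals.get(seg.speaker, 0.0) + (seg.end_min - seg.start_min)
    return totals


# Illustrative data only: Maria from minute 3 to 7, then Carlos.
segments = [
    Segment("Maria", 3.0, 7.0, "..."),
    Segment("Carlos", 7.0, 9.5, "..."),
    Segment("Maria", 9.5, 10.0, "..."),
]

print(talk_time(segments))
```

Once speech is labelled this way, questions like "how long did each person talk?" or "show me everything Carlos said" become simple lookups.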

If your business touches audio in any way — recordings, customer calls, content — ask whoever builds things for you whether this could replace something you're currently paying for.


Written by David at AC0.AI. Follow on @ac0hero
