Cohere launches an open source voice model specifically for transcription

Read the full article, "Cohere launches an open source voice model specifically for transcription," on TechCrunch

What Happened

Relatively light at just 2 billion parameters, the model is meant for use with consumer-grade GPUs for those who want to self-host it. It currently supports 14 languages.
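As a rough sanity check on the consumer-GPU claim, the memory needed just to hold the weights is the parameter count times the bytes per parameter. The sketch below is a back-of-the-envelope estimate only: the article does not say which numeric precision the model ships in, so the precisions listed are assumptions, and the figure excludes activations, audio buffers, and framework overhead. The function name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


PARAMS = 2e9  # 2 billion parameters, per the article

# Assumed storage precisions; the article does not specify one.
PRECISIONS = {"fp32": 4, "fp16/bf16": 2, "int8": 1}

for name, nbytes in PRECISIONS.items():
    print(f"{name}: {weight_memory_gib(PARAMS, nbytes):.2f} GiB")
```

At 16-bit precision the weights alone come to a little under 4 GiB, which would leave headroom on many consumer cards, though actual usage at inference time will be higher.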

Our Take

Cohere's doing the smart thing: building models small and lightweight enough to self-host on consumer GPUs. 2B parameters is the new sweet spot: no API dependency, no per-request costs, just inference on your own hardware.

This is how you compete when you're not OpenAI with unlimited compute. Transcription doesn't need GPT-level intelligence; it needs reliability and ownership. An open-source model you can deploy yourself beats a proprietary API every time, provided the model is good enough.

Support for 14 languages out of the gate shows they actually thought about the use case.

What To Do

If your workload is transcription-heavy, self-hosting now beats the API option on cost and control.

Builder's Brief

Who

teams running self-hosted transcription pipelines, especially in regulated industries

What changes

adds a credible enterprise-backed open alternative to Whisper for self-hosted deployments

When

weeks

Watch for

independent WER benchmark comparisons against Whisper large-v3 on standard test sets
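For readers who want to run their own comparison, word error rate (WER) is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The following is a minimal sketch, assuming simple lowercasing and whitespace tokenization; published benchmarks usually apply more careful text normalization (punctuation, numerals), so numbers from this function are not directly comparable to reported results. The function name is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sat on mat")` counts one deleted word against six reference words. Running the same audio through both models and averaging this score over a held-out set gives a like-for-like comparison on your own data.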

What Skeptics Say

A 2B-parameter open-source transcription model entering a market where Whisper is already free, battle-tested, and deeply integrated needs a concrete accuracy or latency advantage to displace it. Language count alone will not move adoption.

Cited By

TechCrunch: Cohere launches an open source voice model specifically for transcription