ElevenLabs launches Scribe with record 96.7% accuracy for English speech-to-text

ElevenLabs has released Scribe v1, a new speech-to-text model achieving record accuracy rates across 99 languages. According to Carl Franzen of VentureBeat, the model outperforms competitors from Google, OpenAI, and Deepgram with a 96.7% accuracy rate for English. Scribe can distinguish up to 32 different speakers in a single audio file and detect non-verbal elements like laughter and background noise. The company claims particular strength in previously underserved languages including Serbian, Cantonese, and Malayalam. Priced at $0.40 per hour of input audio (with a temporary 50% discount), Scribe targets enterprises needing high-accuracy transcription rather than real-time applications, though a low-latency version is in development. The launch coincides with rival Hume AI’s release of Octave, an emotion-adjustable text-to-speech model positioned as a lower-cost alternative to ElevenLabs’ voice services.
