Dia debuts as open-source text-to-speech model with natural dialogue

A startup called Nari Labs has released Dia, a new open-source text-to-speech model designed to produce naturalistic dialogue. According to VentureBeat reporter Carl Franzen, the 1.6-billion-parameter model rivals offerings from ElevenLabs, OpenAI, and Google’s NotebookLM. Co-creator Toby Kim says Dia was developed “with zero funding,” with support from Google in the form of access to TPU chips. Working from plain text, the model supports advanced features such as emotional tone, speaker tagging, and nonverbal audio cues: users can mark different speakers and include nonverbal behaviors such as laughs and coughs. Dia is available under the Apache 2.0 license, which allows commercial use. In side-by-side comparisons, Dia outperforms competitors in natural timing, emotional range, and handling of nonverbal cues. The two-person team behind Nari Labs invites community contributions through Discord and GitHub.
