Play.ht has released its new speech model, “Play 3.0 mini.” According to Play.ht, the AI-powered text-to-speech model can speak in over 30 languages, imitate any voice or accent, and offers industry-leading speed and accuracy.
Play 3.0 mini was designed specifically for conversational AI applications and is intended to be particularly reliable and cost-efficient. The model achieves an average latency of 143 milliseconds, making it faster than Play 2.0. It accepts streaming text input from large language models and returns audio as a stream. The API can be reached over HTTP and WebSockets as well as through SDKs. Play 3.0 mini has also been trained to read alphanumeric sequences such as phone numbers and product IDs naturally.
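To illustrate what streaming input from a language model can look like on the client side, here is a minimal sketch in Python. It does not use Play.ht's API or SDKs; the function name `chunk_token_stream` and the sentence-boundary rule are assumptions made for illustration only. The idea is to group the text fragments an LLM emits into sentence-sized chunks, so that each chunk could be handed to a text-to-speech request as soon as it is complete instead of waiting for the full reply.

```python
import re
from typing import Iterable, Iterator

# Assumed boundary rule: ".", "!" or "?" followed by whitespace ends a sentence.
# A dot inside a token such as "3.0" is not followed by whitespace, so it does not split.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def chunk_token_stream(tokens: Iterable[str]) -> Iterator[str]:
    """Group streamed text fragments into sentence-sized chunks (hypothetical helper)."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _BOUNDARY.split(buffer)
        # Every part except the last one is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        # The last part may still be incomplete, so keep it in the buffer.
        buffer = parts[-1]
    # Flush whatever remains once the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    # Simulated LLM output arriving in arbitrary fragments.
    fragments = ["Your order", " ID is A1", "B2. It ships", " today! Thanks"]
    for chunk in chunk_token_stream(fragments):
        # In a real application, each chunk would be sent to a TTS service here.
        print(chunk)
```

Chunking on sentence boundaries is only one possible trade-off: smaller chunks reduce the wait before the first audio arrives, while larger chunks give a speech model more context for intonation.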
The model is available in various languages, including English, Japanese, Hindi, Arabic, Spanish, Italian, German, French, and Portuguese. Additional languages are in the testing phase.