OpenAI expands Realtime API with new voices and reduces costs for developers

OpenAI has updated its Realtime API, currently in beta, adding five new expressive voices for speech-to-speech applications and cutting costs for developers through prompt caching. According to OpenAI’s API documentation, cited in a VentureBeat article, the API’s native speech-to-speech capability enables low-latency, nuanced output. The company showcased three of the new voices, Ash, Verse, and Ballad, in a post on X (formerly Twitter). OpenAI warns that client-side authentication is not yet available in the beta and that real-time audio processing may be affected by network conditions. The article also notes OpenAI’s controversial history with AI-generated speech and voices, including its decision to limit access to its Voice Engine platform and to pause the use of a voice that resembled that of actress Scarlett Johansson. With prompt caching, OpenAI plans to lower Realtime API prices by offering a 50% discount on cached text inputs and an 80% discount on cached audio inputs.
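To make the discount arithmetic concrete, here is a minimal sketch of how the two cached-input discounts could affect an input bill. Only the 50% and 80% figures come from the article. The function name, the per-million-token base prices, and the token counts are hypothetical placeholders for illustration, not OpenAI's published rates.

```python
# Discounts reported in the article for cached inputs.
TEXT_CACHE_DISCOUNT = 0.50
AUDIO_CACHE_DISCOUNT = 0.80


def discounted_input_cost(tokens, cached_tokens, price_per_million, cached_discount):
    """Return the input cost when some of the tokens are billed at the cached rate.

    price_per_million is the undiscounted price per one million input tokens.
    cached_discount is the fractional discount applied to cached tokens.
    """
    if not 0 <= cached_tokens <= tokens:
        raise ValueError("cached_tokens must be between 0 and tokens")
    if not 0 <= cached_discount <= 1:
        raise ValueError("cached_discount must be between 0 and 1")
    uncached_tokens = tokens - cached_tokens
    cached_price = price_per_million * (1 - cached_discount)
    total = uncached_tokens * price_per_million + cached_tokens * cached_price
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical base prices in dollars per million input tokens (assumptions).
    example_text_price = 10.0
    example_audio_price = 100.0

    # Assume half of the input tokens in each modality are served from the cache.
    text_cost = discounted_input_cost(
        1_000_000, 500_000, example_text_price, TEXT_CACHE_DISCOUNT
    )
    audio_cost = discounted_input_cost(
        1_000_000, 500_000, example_audio_price, AUDIO_CACHE_DISCOUNT
    )
    print(f"text input cost:  ${text_cost:.2f}")
    print(f"audio input cost: ${audio_cost:.2f}")
```

Under these assumed numbers, the saving grows with the share of input that is cached, and the larger audio discount matters most in long sessions where the same audio context is resent repeatedly.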
