Audio | Smart Content Report

Voice cloning startup PlayAI raises $21M amid safety concerns

February 5, 2025 · November 26, 2024

PlayAI, a company developing AI-powered voice cloning and text-to-speech technology, has secured $21 million in seed funding led by 500 Startups and Kindred Ventures. As reported by Kyle Wiggers for TechCrunch, the Y Combinator-backed startup offers tools for creating synthetic voices, including a voice cloning feature and automated customer service agents. While the technology allows …

New AI model combines speech recognition with privacy protection

February 5, 2025 · November 26, 2024

Israeli startup aiOla has released Whisper-NER, an open-source AI model that transcribes audio while automatically masking sensitive information. As reported by Carl Franzen for VentureBeat, the model builds upon OpenAI’s Whisper framework and combines automatic speech recognition with named entity recognition to protect private data during transcription. The tool can identify and obscure sensitive details …

YouTube tests AI feature to restyle songs for Shorts

February 5, 2025 · November 13, 2024

YouTube is launching a limited test of an AI-powered feature that allows creators to modify licensed songs for their Shorts videos, The Verge reports. The new capability, an extension of YouTube’s Dream Track feature, enables selected creators to generate 30-second soundtracks by altering elements like mood and genre of existing songs through text prompts. The …

OpenAI expands Realtime API with new voices and reduces costs for developers

February 5, 2025 · October 31, 2024

OpenAI has updated its Realtime API, currently in beta, with five new expressive voices for speech-to-speech applications and reduced costs for developers by introducing prompt caching. According to OpenAI’s API documentation cited in an article by VentureBeat, the native speech-to-speech feature enables low latency and nuanced output. The company showcased three of the new voices …

Amphion: open-source toolkit for audio, music and speech generation

February 5, 2025 · October 30, 2024

Amphion is an open-source toolkit designed to support research and development in audio, music and speech generation. According to the project’s GitHub site, it offers unique visualizations of classic models and architectures to help junior researchers and engineers better understand them. The toolkit supports various individual generation tasks such as text-to-speech (TTS), singing voice synthesis …

Speech to text: Moonshine is fast and as accurate as OpenAI’s Whisper

February 5, 2025 · October 30, 2024

Useful, an AI company focused on improving human-machine communication, has open-sourced Moonshine, a new speech-to-text model that aims to significantly reduce the latency of voice interfaces. According to Useful founder Pete Warden, Moonshine returns results 1.7 times faster than OpenAI’s Whisper model while matching or exceeding its accuracy. The model’s variable-length input window allows it …

Amazon's AI tool can now create audio ads

February 5, 2025 · October 16, 2024

Amazon has introduced a generative AI tool that lets brands create audio ads in addition to images and videos. The company announced the expansion of its advertising offerings at the Amazon unBoxed conference. As AdWeek reports, the new feature allows advertisers to generate ads from minimal product information and is part of a broader suite of tools aimed at …

Transcription AI Gladia secures funding

February 5, 2025 · October 16, 2024

Gladia, an AI-powered transcription and audio intelligence provider, has secured $16 million in funding. The Paris-based company plans to develop a new real-time transcription and analytics engine with this investment. CEO Jean-Louis Quéguiner told VentureBeat he founded the company out of frustration with existing services’ poor accent recognition. Gladia’s new engine can transcribe over 100 …

Play 3.0 mini is made for conversational AI

February 5, 2025 · October 16, 2024

Play.ht has released its new speech model, “Play 3.0 mini.” This AI-powered text-to-speech model can speak in over 30 languages and imitate any voice or accent. It offers industry-leading speed and accuracy, according to Play.ht. Play 3.0 mini was designed specifically for conversational AI applications and is intended to be particularly reliable and cost-efficient. The …

Rep.ai creates “digital twins” of sales representatives

February 5, 2025 · September 23, 2024

AI startup Rep.ai has raised $7.5 million in funding to launch its “digital twin” technology for sales representatives. This was reported by Michael Nuñez for VentureBeat. The company, formerly known as ServiceBell, is developing AI-powered avatars to assist website visitors in real-time video and audio conversations. Rep.ai combines visual and vocal replication with natural language …