Add sound effects to your videos with this tool

ElevenLabs has released a new tool that allows video creators to quickly and easily add sound effects to their clips. The app analyzes uploaded videos and suggests different sound effects that can be integrated directly into the videos via an interface.

Camb AI Mars5 enables voice cloning in over 140 languages

Camb AI’s Mars5 AI model enables realistic voice cloning in over 140 languages, combining voice cloning and text-to-speech in a single platform. The company claims that Mars5 is particularly good at capturing emotional nuances in speech, making it ideal for applications such as sports commentary and movies.

Stability AI releases Stable Audio Open

Stability AI has released “Stable Audio Open,” a new AI model for the free creation of sounds and pieces of music up to 47 seconds in length. Because of its training material, however, it is limited to English descriptions and Western music styles.

ElevenLabs Sound Effects creates audio samples

ElevenLabs, a speech synthesis AI startup, unveiled “Sound Effects,” a new product that allows users to create audio samples simply by entering text. Developed in partnership with Shutterstock, the tool is designed to help creative professionals in fields as diverse as film, television, video games, and social media enhance their content with interesting and appropriate …


Truecaller lets an AI with your voice answer the phone

Calling app Truecaller is introducing a new feature that lets users create an AI version of their own voice, which can answer calls and, for example, ask for the reason for the call. But is it a good idea to use your own voice for this? I think it would be confusing …

OpenAI releases GPT-4o and more

One day before Google’s I/O, OpenAI tried to steal the show from its big competitor, and its demos definitely caused quite a stir. The focus was on its latest AI model, GPT-4o, where the “o” stands for “omni”. This indicates that this version processes not only text but also, for example, image and …


Google’s fireworks of new tools and features

As expected, Google used the keynote at its I/O developer conference to demonstrate its strength in AI. Among other things, the company presented new AI models for a wide range of tasks. Some will run directly on Android devices or can be found in the Chrome browser. Others use Google’s specialized servers. They create text, …


OpenVoice is an AI for voice cloning

OpenVoice allows users to realistically clone voices in different languages and accents, and even control emotions and speaking styles. The latest version, OpenVoice V2, offers improved audio quality and native support for multiple languages, and is free for commercial use. Source: Hacker News

AdaKWS claims better speech recognition than OpenAI’s Whisper

AdaKWS, a new AI model from speech recognition specialist aiOla, is said to convert speech into text correctly even when it contains technical jargon. According to the company, the model achieves an accuracy of 94.6%, better than OpenAI’s Whisper.