Gemini for content professionals: Google’s AI ecosystem explained

Gemini is the label for Google’s AI offerings. The company has developed a dazzling array of tools and services under it. In combination, they are arguably the best AI has to offer today. Even considered on their individual merits, they are often best in class. But Google’s Gemini universe is so vast that it is …

Google’s new voice AI lets you direct speech like a film director

Google has released Gemini 3.1 Flash TTS, a new text-to-speech model that the company describes as its most natural and expressive to date. The model is available in preview through the Gemini API, Google AI Studio, Vertex AI for enterprise users, and Google Vids for Workspace users. The model supports more than 70 languages and …

Google’s Lyria 3 Pro brings structure to AI-generated music

Google has expanded its AI music generation capabilities with the launch of Lyria 3 Pro, a model that creates tracks up to three minutes long. Myriam Hamed Torres writes for Google DeepMind that the model understands musical structure, allowing users to prompt for specific elements such as intros, verses, choruses, and bridges. Lyria 3 Pro …

Google launches Gemini 3.1 Flash Live voice model

Google has released Gemini 3.1 Flash Live, its latest real-time voice and audio model. Valeria Wu and Yifan Ding write in the Google Blog that the model offers faster responses and improved natural conversation compared to its predecessor. The model is available in several Google products. Developers can access it via the Gemini Live API …

Mistral releases open-weight text-to-speech model Voxtral TTS

French AI company Mistral has released Voxtral TTS, an open-weight text-to-speech model aimed at enterprise use cases such as customer support, sales, and real-time translation. Unlike competitors such as ElevenLabs, Deepgram, and OpenAI, Mistral is releasing the full model weights, allowing companies to run the system on their own infrastructure without sending data to a …

Google brings music generation to the Gemini app

Google has added music generation to its Gemini app, allowing users to create 30-second tracks from text prompts or even images. Joël Yawili and Myriam Hamed Torres write in the Google Blog that the feature is powered by Lyria 3, Google DeepMind’s latest generative music model, and is currently available in beta. Users can describe …

Mistral releases Voxtral Transcribe 2: transcribe on your phone for pennies

Mistral AI has released Voxtral Transcribe 2, a family of speech-to-text models designed for both batch processing and real-time transcription. The company positions the technology as more accurate and significantly cheaper than competing services while enabling on-device processing for sensitive data. The release includes two models. Voxtral Mini Transcribe V2 handles pre-recorded audio files at …

ElevenLabs releases album with Liza Minnelli and Art Garfunkel created using AI

ElevenLabs has launched “The Eleven Album,” featuring original music from established artists including Liza Minnelli and Art Garfunkel, created in collaboration with the company’s AI music generation system. The project marks one of the first large-scale partnerships between artificial intelligence technology and professional musicians, Todd Spangler reports for Variety. The album features tracks across …

Adobe turns Acrobat into an AI productivity studio

Adobe has expanded Acrobat with new AI features that allow users to generate presentations and podcasts from documents, edit PDFs through conversational prompts, and collaborate more effectively. These capabilities are bundled in Acrobat Studio, which combines PDF tools from Acrobat with content creation features from Adobe Express. The company now enables users to create presentations …

The magic wand for sound arrives with Meta’s latest AI model

Meta Platforms is introducing a new artificial intelligence model called SAM Audio that simplifies sound editing through prompts. The tool lets users isolate or remove specific sounds from complex recordings. Mike Wheatley reports for Silicon Angle that the model is now available through the Segment Anything Playground. The technology functions similarly …
