Google has announced significant updates to its generative AI portfolio with the introduction of Veo 3, Imagen 4, and a new filmmaking tool called Flow. Veo 3, the company’s latest video generation model, adds native audio generation, including dialogue between characters and ambient sounds such as traffic noise or birdsong. This feature distinguishes it from competitors like OpenAI’s Sora.
According to Eli Collins, VP at Google DeepMind, Veo 3 “excels from text and image prompting to real-world physics and accurate lip syncing.” The model is available immediately to Ultra subscribers in the United States through the Gemini app and to enterprise users on Vertex AI.
Alongside Veo 3, Google released Imagen 4, an enhanced image generation model that offers finer detail and better typography. The company claims it produces sharper images with intricate details and supports a range of aspect ratios at resolutions up to 2K.
Google also unveiled Flow, described as an “AI filmmaking tool” that combines the company’s advanced models to help users create cinematic content through natural language prompts.
All content generated by these new tools will include SynthID watermarks, part of Google’s responsible AI approach. The company has also launched SynthID Detector, a verification portal to help identify AI-generated content.