As expected, Google used the keynote at its I/O developer conference to demonstrate its strength in AI. Among other things, the company presented new AI models for a wide range of tasks. Some will run directly on Android devices or inside the Chrome browser; others run on Google’s specialized servers. They generate text, images, music, and even video clips.
In addition, the company’s own AI, Gemini, will appear even more widely across Google’s products and services in the future.
Here is an overview of some of the key innovations of interest to creative professionals.
New and improved
Video: Veo is an upcoming video-generation AI and a competitor to OpenAI’s Sora. Like Sora, Veo is not yet generally available, and there is no release date. Instead, Google seems to be planning to integrate it as a feature in YouTube Shorts and other services.
Images: Imagen 3 is the latest version of Google’s image AI, comparable to DALL-E, Midjourney, or Stable Diffusion. Imagen 3 is also not yet generally available.
Music: The Music AI Sandbox can generate loops from a prompt, which can then be used in your own tracks. Again, there is no official date for general availability.
Google Search: AI-powered search results are no longer called “Search Generative Experience” but “AI Overviews”. They are now generally available in the US, but not for all searches. AI Overviews provide summaries that ideally match the search intent. The impact on Google traffic for website owners is not yet known.
Updates to Google Gemini and more
Google’s AI Gemini will soon offer paying users the ability to create personalized variants called “Gems”. However, it appears that the personalization will only cover how the chatbot behaves; there is no word on whether a Gem can also be given its own data set as a knowledge base.
Gemini 1.5 Flash is, as the name suggests, optimized for fast responses, similar to Claude 3 Haiku. A notable feature: the Flash version supports a context window of up to 1 million tokens per chat.
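For developers who want to try this, the model is also reachable through the Gemini API. Here is a minimal sketch, assuming the google-generativeai Python SDK and an API key from Google AI Studio; the key and prompt are placeholders:

```python
# Minimal sketch: calling Gemini 1.5 Flash via the Gemini API
# (assumes the google-generativeai package and an API key from Google AI Studio).
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash")

# A long transcript or document can go straight into the prompt,
# since the Flash model accepts a context of up to 1 million tokens.
response = model.generate_content("Summarize the following interview transcript: ...")
print(response.text)
```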
Google’s Chrome browser will soon ship with the Gemini Nano AI model. Developers will then be able to build applications on top of it, as Google does with its “Help me write” feature.
One example of the many AI capabilities Google demonstrated: Gmail will learn all kinds of new tricks. For example, email threads can be summarized, even across the entire archive.
Google also announced a larger version of its freely available AI models: Gemma 2, with 27 billion parameters. It will be available in June.
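Because Gemma weights are released openly, the model can be run locally once it is out. A rough sketch using Hugging Face transformers; the model id below is an assumption about where the 27B weights will be published, and a model this size needs substantial GPU memory or quantization:

```python
# Sketch: running a Gemma model locally with Hugging Face transformers.
# The model id is an assumption (hypothetical) about where the instruction-tuned
# 27B weights will appear once Google releases them.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-27b-it"  # assumed id for the instruction-tuned 27B model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Write three tagline ideas for a short documentary about urban beekeeping."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```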
Another member of the Gemma family is PaliGemma, which specializes in visual input.
And then there was …
Project Astra is one of those impressive demos where you don’t know exactly what the final product will look like. Google presented it as an AI assistant that can analyze live camera footage and answer all kinds of questions. In addition to a smartphone app, the demo also featured a pair of glasses – Google Glass: The Next Generation? It is not known when and in what form Project Astra will be released.
My personal conclusion
Google’s tools may not be on the same level as some of its competitors (yet). But it is obvious that the company has invested a lot. In some respects, it is already a leader, for example in the context length of some of its AI models. At the same time, it is clear that Google has something that OpenAI lacks: an established ecosystem of products that hundreds of millions of people already use every day.