Google’s Gemini gets new eyes and a bigger brain

Google’s latest and most capable AI model, Gemini 3 Pro, has advanced capabilities for tasks that require visual understanding. In a post on the Google Blog, the company outlined how the model processes and reasons about visual information from various sources. According to Google, the model demonstrates strong performance in several key …

Mistral 3 brings frontier AI to your pocket with unprecedented openness

Mistral AI has launched Mistral 3, a collection of 10 open-source AI models designed to run on devices ranging from smartphones to enterprise cloud systems. The French startup released all models under the Apache 2.0 license, allowing unrestricted commercial use. The release includes Mistral Large 3, the company’s flagship model, and the Ministral 3 series …

Qwen3-Omni is an open-source model for text, image, audio, and video

The Chinese technology company Alibaba has released Qwen3-Omni, a new generative AI model that can process a combination of text, images, audio, and video. The model is notable for its “omni-modal” capabilities and its open-source license, positioning it as a direct competitor to proprietary models from U.S. tech companies like OpenAI and Google. According to …

Meta releases Llama 4 models with mixed reception from AI community

Meta has released its newest generation of artificial intelligence models, Llama 4, introducing three variants with improved capabilities. The weekend release included two immediate offerings – Llama 4 Scout and Llama 4 Maverick – with a third model, Llama 4 Behemoth, still in development. According to Meta, Llama 4 models mark “the beginning of a …

OpenAI brings image generation to a new level

OpenAI has launched native image generation capabilities directly within ChatGPT, powered by its multimodal model GPT-4o. This new feature, called “Images in ChatGPT,” is now available to users across Plus, Pro, Team, and Free subscription tiers, with Enterprise, Edu, and API access coming soon. Unlike the previous DALL-E 3 image generator, which was a separate …

Google introduces Gemini 2.5 Pro with built-in reasoning capabilities

Google has launched Gemini 2.5 Pro, describing it as its “most intelligent AI model” to date. The new model represents a significant advancement in Google’s AI capabilities, with a particular focus on reasoning abilities that are now built directly into the system. According to Google’s announcement, Gemini 2.5 models are “thinking models” that can reason …

Baidu launches ERNIE 4.5 and X1 models at lower costs than competitors

Baidu has released two new AI models, ERNIE 4.5 and ERNIE X1, claiming they outperform competitors like DeepSeek and OpenAI on various benchmarks while offering significantly lower pricing. Carl Franzen, writing for VentureBeat, reports that ERNIE 4.5 is a multimodal language model while X1 focuses on reasoning capabilities. The models are notably cheaper than competitors, …

Cohere releases Aya Vision, a multilingual vision model with open weights

Cohere’s research division has launched Aya Vision, an open-weight vision model supporting 23 languages. According to Carl Franzen’s report in VentureBeat, the model comes in 8-billion and 32-billion parameter versions and can analyze images, generate text, and translate visual content. Aya Vision outperforms larger models like Llama 90B while requiring fewer computational resources. The model …

Microsoft introduces efficient Phi-4 for text, image, speech processing

Microsoft has unveiled two new AI models in its Phi series: Phi-4-multimodal with 5.6 billion parameters and Phi-4-mini with 3.8 billion parameters. These small language models (SLMs) deliver exceptional performance while requiring significantly less computing power than larger systems, challenging the notion that bigger AI models are always better. The Phi-4-multimodal model stands out for …
