Amazon debuts Nova Premier AI model for complex reasoning tasks

Amazon has launched Nova Premier, its most advanced AI model to date, capable of processing text, images, and videos. The model is now available through Amazon Bedrock, the company’s AI development platform. As reported by Kyle Wiggers for TechCrunch, Nova Premier excels at complex tasks requiring contextual understanding and multi-step planning. With a context length …

Microsoft expands Phi language model family with new reasoning capabilities

Microsoft has introduced three new small language models (SLMs) focused on complex reasoning tasks: Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning. The models extend what small AI models can accomplish, particularly in mathematical reasoning and multi-step problem solving. The flagship Phi-4-reasoning-plus, a 14-billion-parameter model, rivals the performance of much larger AI systems. According …

Alibaba launches Qwen3 models with competitive AI reasoning capabilities

Alibaba has released Qwen3, a new family of large language models that compete with leading AI systems from OpenAI and Google. The lineup includes two mixture-of-experts (MoE) models and six dense models, with parameters ranging from 0.6 billion to 235 billion. According to benchmarks shared by Alibaba, the flagship Qwen3-235B-A22B model outperforms DeepSeek R1 and …

Pleias launches small reasoning models optimized for RAG with built-in citations

French AI startup Pleias has released two open-source small reasoning models specifically designed for retrieval-augmented generation (RAG) with native citation support. As reported by Carl Franzen for VentureBeat, the new models—Pleias-RAG-350M and Pleias-RAG-1B—are available under the Apache 2.0 license, allowing commercial use. Despite their small size, the models outperform many larger alternatives on multi-hop reasoning …

OpenAI’s reasoning models show increased hallucination rates

OpenAI’s new reasoning AI models, o3 and o4-mini, hallucinate more frequently than their predecessors, according to internal testing. Maxwell Zeff of TechCrunch reports that o3 hallucinated on 33% of questions in OpenAI’s PersonQA benchmark, approximately double the rate of previous models. The o4-mini model performed even worse, with a 48% hallucination rate. OpenAI acknowledged in its …

Google introduces Gemini 2.5 Flash with adjustable “thinking” capabilities

Google has released Gemini 2.5 Flash in preview, giving developers direct control over how much reasoning the model performs. Developers can toggle “thinking” on or off and set specific “thinking budgets” to balance quality, cost, and response time. The pricing structure reveals the cost impact of reasoning: input costs $0.15 per million …

OpenAI launches o3 and o4-mini with enhanced reasoning and visual capabilities

OpenAI has released two new AI models, o3 and o4-mini, designed to advance reasoning capabilities and introduce novel features like “thinking with images.” These models represent the company’s latest development in its o-series, coming just days after the release of GPT-4.1. The models’ most distinctive feature is their ability to not just recognize images but …

Google introduces efficient Gemini 2.5 Flash AI model for developers

Google has launched Gemini 2.5 Flash, a new AI model designed for efficiency and strong performance. According to Kyle Wiggers of TechCrunch, the model will soon be available on Google’s Vertex AI development platform. The new model offers “dynamic and controllable” computing that allows developers to adjust processing time based on query complexity. As a …

Deep Cogito releases new open source AI models with hybrid reasoning capabilities

Deep Cogito, a San Francisco-based AI startup, has emerged from stealth with the release of Cogito v1, a new line of open source large language models featuring hybrid reasoning capabilities. Carl Franzen from VentureBeat reports that the models, fine-tuned from Meta’s Llama 3.2, can either answer immediately or engage in “self-reflection” similar to OpenAI’s “o” …

Nvidia releases powerful Llama-3.1 Nemotron Ultra language model

Nvidia has launched Llama-3.1-Nemotron-Ultra-253B, a fully open-source language model that outperforms the larger DeepSeek R1 on several benchmarks despite having fewer than half the parameters. Carl Franzen of VentureBeat reports the model is now available on Hugging Face with open weights and training data. The 253-billion-parameter model features a unique toggle for “reasoning on” …
