OpenAI’s reasoning models show increased hallucination rates

OpenAI’s new reasoning AI models, o3 and o4-mini, hallucinate more frequently than their predecessors, according to internal testing. Maxwell Zeff from TechCrunch reports that o3 hallucinated on 33% of questions in OpenAI’s PersonQA benchmark, roughly double the rate of previous models. The o4-mini model performed even worse, with a 48% hallucination rate. OpenAI acknowledged in its …

Dia debuts as open-source text-to-speech model with natural dialogue capabilities

A startup called Nari Labs has released Dia, a new open-source text-to-speech model designed to produce naturalistic dialogue. According to VentureBeat reporter Carl Franzen, the 1.6 billion parameter model rivals offerings from ElevenLabs, OpenAI, and Google’s NotebookLM. Co-creator Toby Kim developed Dia “with zero funding,” supported by Google through access to TPU chips. The model …

Google’s Gemma 3 models now run on consumer GPUs through quantization

Google has released new versions of its Gemma 3 AI models that can run on consumer-grade graphics cards through a technique called Quantization-Aware Training (QAT). This development makes powerful AI models accessible to users without high-end hardware. The company announced that QAT dramatically reduces memory requirements while maintaining high-quality performance. Gemma 3’s largest 27B …

Kagi Assistant now available to all users with no price increase

Paid search engine Kagi has announced that its AI assistant feature is now available to all users across all subscription plans at no additional cost. According to Kagi’s announcement, the Assistant combines access to leading large language models (LLMs) with optional integration of Kagi Search results. The tool was previously exclusive to Ultimate subscribers but …

Google introduces Gemini 2.5 Flash with adjustable “thinking” capabilities

Google has released Gemini 2.5 Flash in preview, offering developers unprecedented control over the AI model’s reasoning capabilities. This new version allows users to toggle “thinking” on or off and set specific “thinking budgets” to balance quality, cost, and response time. The pricing structure reveals the cost impact of reasoning: input costs $0.15 per million …

Guide: GPT-4.1 prompts require more precise instructions

OpenAI has released a comprehensive prompting guide for its new GPT-4.1 family of models, highlighting significant improvements in coding capabilities, instruction following, and long context handling compared to GPT-4o. According to the guide published by OpenAI, developers may need to migrate their prompts because GPT-4.1 follows instructions more literally than previous versions, which tended to …

OpenAI launches o3 and o4-mini with enhanced reasoning and visual capabilities

OpenAI has released two new AI models, o3 and o4-mini, designed to advance reasoning capabilities and introduce novel features like “thinking with images.” These models represent the company’s latest development in its o-series, coming just days after the release of GPT-4.1. The models’ most distinctive feature is their ability to not just recognize images but …

OpenAI to discontinue GPT-4.5 API access by mid-July

OpenAI announced plans to phase out GPT-4.5, its largest AI model to date, from its API by July 14. According to reporter Kyle Wiggers from TechCrunch, developers will need to transition to alternative models, with GPT-4.1 being the recommended replacement. An OpenAI spokesperson explained that GPT-4.1 “offers similar or improved performance than GPT-4.5 in key …

Cohere releases Embed 4 model that handles 200-page documents

Cohere has launched Embed 4, an upgraded multimodal embedding model with a 128,000-token context window that can process documents up to 200 pages long. According to Emilia David at VentureBeat, the new model improves enterprise retrieval-augmented generation (RAG) capabilities. Embed 4 handles unstructured data in over 100 languages and is designed to work …

Claude now integrates with Google Workspace

Anthropic has launched a new integration that allows its AI chatbot Claude to access Gmail, Google Calendar, and Google Docs. Maxwell Zeff reports that the feature is rolling out in beta to subscribers of Anthropic’s premium plans, including Max, Team, Enterprise, and Pro. The integration aims to provide more personalized responses without requiring users to …
