Alibaba launches Qwen3 models with competitive AI reasoning capabilities

Alibaba has released Qwen3, a new family of large language models that compete with leading AI systems from OpenAI and Google. The lineup includes two mixture-of-experts (MoE) models and six dense models, with parameter counts ranging from 0.6 billion to 235 billion. According to benchmarks shared by Alibaba, the flagship Qwen3-235B-A22B model outperforms DeepSeek R1 and …

Read more

Writer’s Palmyra X5 is a cost-efficient AI model with a large context window

Writer has released Palmyra X5, a new large language model featuring a 1-million-token context window that aims to accelerate AI agent adoption in enterprises. As reported by Michael Nuñez for VentureBeat, the model offers performance comparable to GPT-4.1 at 75% lower cost. Palmyra X5 is priced at $0.60 per million input tokens and $6 per …

Read more

Pleias launches small reasoning models optimized for RAG with built-in citations

French AI startup Pleias has released two open-source small reasoning models specifically designed for retrieval-augmented generation (RAG) with native citation support. As reported by Carl Franzen for VentureBeat, the new models—Pleias-RAG-350M and Pleias-RAG-1B—are available under the Apache 2.0 license, allowing commercial use. Despite their small size, the models outperform many larger alternatives on multi-hop reasoning …

Read more

Switching between AI models proves more complex than expected

Enterprise teams switching between large language models (LLMs) face numerous hidden challenges beyond simply changing API keys. According to an article by Lavanya Gupta, treating model migration as “plug-and-play” often leads to unexpected problems with output quality, costs, and performance. The report explores the complexities of moving between models like GPT-4o, Claude, and Gemini. Key …

Read more

Cohere releases Embed 4 model that handles 200-page documents

Cohere has launched Embed 4, an upgraded multimodal embedding model with a 128,000-token context window that can process documents up to 200 pages long. According to Emilia David at VentureBeat, the new model improves enterprise retrieval-augmented generation (RAG) capabilities. Embed 4 handles unstructured data in over 100 languages and is designed to work …
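
For a sense of how a team might wire a model like this into a RAG indexing step, here is a minimal sketch using Cohere's Python SDK. The "embed-v4.0" model identifier and the input_type values are assumptions for illustration, not details confirmed in the article; check Cohere's API reference before relying on them.

```python
# Hedged sketch: indexing document chunks for RAG with a Cohere embedding model.
# The model name "embed-v4.0" and the input_type values are assumptions; verify
# them against Cohere's current API documentation.
import cohere

co = cohere.Client("YOUR_API_KEY")

chunks = [
    "Q3 revenue grew 12% year over year, driven by the enterprise segment.",
    "The 2024 employee handbook covers remote-work and expense policies.",
]

# Embed document chunks once at indexing time...
doc_vectors = co.embed(
    texts=chunks,
    model="embed-v4.0",          # assumed identifier for Embed 4
    input_type="search_document",
).embeddings

# ...then embed the user query at search time and rank chunks by similarity
# (e.g., cosine similarity) to pick the passages fed to the generator.
query_vector = co.embed(
    texts=["What drove revenue growth last quarter?"],
    model="embed-v4.0",
    input_type="search_query",
).embeddings[0]
```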

Read more

Claude now integrates with Google Workspace

Anthropic has launched a new integration that allows its AI chatbot Claude to access Gmail, Google Calendar, and Google Docs. Maxwell Zeff reports that the feature is rolling out in beta to subscribers of Anthropic’s premium plans including Max, Team, Enterprise, and Pro. The integration aims to provide more personalized responses without requiring users to …

Read more

Open Deep Search brings open-source reasoning to AI search technology

Researchers from Sentient, the University of Washington, Princeton University, and UC Berkeley have introduced Open Deep Search (ODS), a new open-source framework designed to match the capabilities of proprietary AI search solutions. The system combines reasoning agents with web search tools to enhance the performance of large language models. According to the research team led …

Read more

ChatGPT now connects to companies’ internal knowledge databases

OpenAI has introduced a new feature allowing ChatGPT Team users to connect their internal knowledge databases directly to the platform. According to reporting by Emilia David at VentureBeat, this long-requested capability is currently in beta. The feature enables semantic searches of company data, links to internal sources in responses, and helps ChatGPT understand company-specific terminology. …

Read more

OpenAI unveils new developer tools for building AI agents

OpenAI has released a new suite of tools designed to help developers build AI agents similar to the company’s own Deep Research and Operator. The new offerings include the Responses API and the open-source Agents SDK, which provide developers with the building blocks to create AI applications that can search the web, analyze files, and …
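
As a rough illustration of what building on these tools looks like, here is a minimal sketch of a Responses API call using the hosted web-search tool. The model name and tool type are assumptions drawn from OpenAI's launch materials rather than from this summary, so treat the snippet as illustrative, not canonical.

```python
# Hedged sketch: a single Responses API call that lets the model issue web
# searches while answering. Model and tool identifiers may change over time.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-4o",                            # assumed model choice
    tools=[{"type": "web_search_preview"}],    # hosted web-search tool
    input="Summarize this week's notable open-source LLM releases.",
)

print(response.output_text)  # convenience accessor for the final text output
```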

Read more

New AI techniques promise huge cost savings and improved performance for enterprises

Recent research has unveiled two promising approaches that could dramatically reduce the costs of running large language models (LLMs) while simultaneously improving their performance on complex reasoning tasks. These innovations come at a critical time as enterprises increasingly deploy AI solutions but struggle with computational expenses.

Chain of draft: Less is more

Researchers at Zoom …
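
To make the token-saving idea concrete, below is a rough sketch of a chain-of-draft style prompt: the model still reasons step by step, but each intermediate step is kept to a few words, which is where the reported cost savings come from. The wording and the five-word cap are illustrative assumptions, not the researchers' exact prompt.

```python
# Hedged sketch of a chain-of-draft style prompt. Compared with verbose
# chain-of-thought, the model emits only terse "drafts" of each step,
# cutting output tokens while preserving the reasoning structure.
CHAIN_OF_DRAFT_PROMPT = (
    "Think step by step, but keep only a minimal draft of each step, "
    "at most five words per step. "
    "Return the final answer after a separator line: ####"
)

question = "A store sells pens at 3 for $2. How much do 12 pens cost?"

# With any chat-style LLM API, this would be sent as the system message,
# with the question as the user message.
messages = [
    {"role": "system", "content": CHAIN_OF_DRAFT_PROMPT},
    {"role": "user", "content": question},
]

# Expected style of output (versus a long chain of thought):
#   12 / 3 = 4 groups; 4 * $2 = $8
#   ####
#   $8
```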

Read more