Alibaba releases new AI reasoning model to compete with OpenAI o1

Alibaba has released Qwen with Questions (QwQ), a new artificial intelligence reasoning model designed to compete with OpenAI’s o1 system. The model features 32 billion parameters and can process contexts of up to 32,000 tokens. According to Alibaba’s testing, QwQ outperforms OpenAI’s o1-preview on the mathematical reasoning benchmarks AIME and MATH. The company states … Read more

LLaVA-o1 brings structured reasoning to visual language processing

Chinese researchers have developed LLaVA-o1, an open-source vision language model that introduces a four-stage reasoning process for analyzing images and text. As reported by Ben Dickson for VentureBeat, the model breaks down complex tasks into summary, caption, reasoning, and conclusion phases. The system, built on Llama-3.2-11B-Vision-Instruct and trained on 100,000 image-question-answer pairs, employs a novel … Read more
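The four-stage structure is easy to picture as a tagged output format. Below is a minimal, hypothetical sketch of how such a staged response could be requested and parsed; the tag names, prompt wording, and parsing helper are illustrative assumptions, not LLaVA-o1’s actual training format.

```python
# Hypothetical sketch of asking a vision-language model for the four
# LLaVA-o1-style stages as tagged sections and parsing them back out.
# Tag names and prompt wording are assumptions for illustration only.
import re

STAGES = ["SUMMARY", "CAPTION", "REASONING", "CONCLUSION"]

SYSTEM_PROMPT = (
    "Answer the question about the image in four tagged stages: "
    "<SUMMARY> outline your approach </SUMMARY>, "
    "<CAPTION> describe the relevant image content </CAPTION>, "
    "<REASONING> work through the problem step by step </REASONING>, "
    "<CONCLUSION> give the final answer </CONCLUSION>."
)

def parse_stages(response: str) -> dict:
    """Extract each tagged stage from the model's raw text output."""
    stages = {}
    for stage in STAGES:
        match = re.search(rf"<{stage}>(.*?)</{stage}>", response, re.DOTALL)
        stages[stage.lower()] = match.group(1).strip() if match else ""
    return stages

# Example with a canned response standing in for a real model call:
fake_response = (
    "<SUMMARY>Count the apples.</SUMMARY>"
    "<CAPTION>A bowl holding three red apples.</CAPTION>"
    "<REASONING>One, two, three apples are visible.</REASONING>"
    "<CONCLUSION>3</CONCLUSION>"
)
print(parse_stages(fake_response)["conclusion"])  # -> 3
```

Separating the stages this way makes each part of the model’s answer individually inspectable, which is the point of the structured approach.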

Chinese AI company DeepSeek launches new reasoning model to compete with OpenAI

DeepSeek, a Chinese AI research company backed by hedge fund High-Flyer Capital Management, has released DeepSeek-R1-Lite-Preview, a new AI model designed to rival OpenAI’s o1. The model specializes in reasoning capabilities, allowing it to spend extended time considering questions before providing answers. According to DeepSeek, their model performs comparably to OpenAI’s o1 on established AI … Read more

New AI math benchmark exposes limitations in advanced reasoning

The FrontierMath benchmark, developed by Epoch AI, presents hundreds of challenging math problems that require deep reasoning and creativity to solve. Despite the growing power of AI models like GPT-4o and Gemini 1.5 Pro, they solve fewer than 2% of these problems, even with extensive support, according to Epoch AI. The benchmark was created … Read more

OpenAI and others exploring new strategies to overcome AI improvement slowdown

OpenAI is reportedly developing new strategies to deal with a slowdown in AI model improvements. According to The Information, OpenAI employees testing the company’s next flagship model, code-named Orion, found that it improves on GPT-4 by a smaller margin than GPT-4 improved on GPT-3, suggesting the rate of progress is diminishing. In response, OpenAI has formed a foundations team … Read more

Chain-of-Thought reasoning no panacea for AI shortfalls

The research paper “Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse” investigates the effectiveness of chain-of-thought (CoT) prompting in large language and multimodal models. While CoT has generally improved model performance on various tasks, the authors explore scenarios where it may actually hinder performance, drawing parallels from … Read more
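For readers unfamiliar with the setup, the comparison in such studies boils down to prompting the same question with and without an instruction to reason out loud. The sketch below shows the two conditions in their simplest form; the wording is illustrative and not taken from the paper.

```python
# Minimal sketch of the two prompting conditions such studies compare:
# a direct answer request vs. a chain-of-thought request. The wording is
# illustrative and not taken from the paper.

QUESTION = "A train travels 60 km in 40 minutes. What is its speed in km/h?"

# Condition 1: direct prompting - ask only for the final answer.
direct_prompt = f"{QUESTION}\nAnswer with only the final number."

# Condition 2: chain-of-thought prompting - ask the model to reason out loud.
cot_prompt = (
    f"{QUESTION}\n"
    "Let's think step by step, then give the final answer on its own line."
)

# Both prompts go to the same model and the extracted final answers are
# scored; the paper's point is that on some task families the CoT condition
# scores lower than the direct one.
print(direct_prompt)
print(cot_prompt)
```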

Entropix: New AI technique improves reasoning by detecting uncertainty

Researchers at XJDR have developed a new technique called Entropix that aims to improve reasoning in language models by making smarter decisions when the model is uncertain, according to a recent blog post by Thariq Shihipar. The method uses adaptive sampling based on two metrics, entropy and varentropy, which measure the uncertainty in the model’s … Read more
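Both signals are straightforward to compute from a model’s next-token logits. The following sketch shows one way to derive entropy and varentropy and switch decoding behavior on them; the thresholds and the specific actions are assumptions for illustration, not Entropix’s actual decision rules.

```python
# Illustrative sketch of the entropy / varentropy signals behind Entropix.
# Threshold values and the decoding policy below are assumptions for
# demonstration; the real project derives its rules differently.
import numpy as np

def entropy_varentropy(logits: np.ndarray) -> tuple:
    """Entropy of the next-token distribution and the variance of the
    per-token surprisal (varentropy), both in nats."""
    logits = logits - logits.max()                 # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    surprisal = -np.log(probs + 1e-12)             # -log p(token)
    ent = float((probs * surprisal).sum())         # E[-log p]
    varent = float((probs * (surprisal - ent) ** 2).sum())  # Var[-log p]
    return ent, varent

def choose_action(logits: np.ndarray) -> str:
    """Pick a decoding behaviour based on how (un)certain the model looks."""
    ent, varent = entropy_varentropy(logits)
    if ent < 0.5 and varent < 0.5:
        return "argmax"            # confident: take the top token
    if ent > 3.0 and varent > 3.0:
        return "branch_or_think"   # very uncertain: resample or insert a pause
    return "sample_with_temperature"

rng = np.random.default_rng(0)
print(choose_action(rng.normal(size=32_000)))  # e.g. 'sample_with_temperature'
```

Intuitively, low entropy means the model is sure of its next token, while high entropy with high varentropy means it is torn between genuinely different continuations, which is when an adaptive sampler intervenes.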

LLMs don’t reason logically

A new study from Apple reveals that large language models (LLMs) don’t reason logically but rely on pattern recognition. This finding, published by six AI researchers at Apple, challenges the common understanding of LLMs. The researchers discovered that even small changes, such as swapping names, can alter the models’ results by about 10%. Gary Marcus, … Read more

DeepMind’s Michelangelo tests reasoning in long context windows

DeepMind has introduced the Michelangelo benchmark to evaluate the long-context reasoning capabilities of large language models (LLMs), Ben Dickson reports for VentureBeat. While LLMs can manage extensive context windows, research indicates they struggle with reasoning over complex data structures. Current benchmarks often focus on retrieval tasks, which do not adequately assess a model’s reasoning abilities. … Read more

Google working on AI with advanced reasoning capabilities

Google is developing AI with reasoning abilities inspired by the human brain, similar to OpenAI’s o1 model. Several teams at the company are making progress on AI systems capable of solving complex problems in fields such as mathematics and programming. This was reported by Julia Love and Rachel Metz for Bloomberg. Researchers are using a … Read more