OpenAI releases new o1 model and launches premium ChatGPT Pro subscription

OpenAI has officially launched the full version of its o1 reasoning model, moving it out of preview status. The model, previously codenamed Strawberry, offers improved capabilities in coding, mathematics, and image analysis, with the company claiming a 34% reduction in major errors on complex problems compared to its preview version. The technology company is introducing …

Read more

AI development faces scaling challenges but shows alternative paths forward

The artificial intelligence industry is grappling with potential limitations in scaling larger language models (LLMs), according to an analysis by Gary Grossman, EVP of technology practice at Edelman. While recent reports suggest that developing more extensive AI models like GPT-5 may face diminishing returns, industry leaders including OpenAI’s Sam Altman and former Google CEO Eric …

Read more

Alibaba releases new AI reasoning model to compete with OpenAI o1

Alibaba has released Qwen with Questions (QwQ), a new artificial intelligence reasoning model designed to compete with OpenAI’s o1 system. The model features 32 billion parameters and can process contexts of up to 32,000 tokens. According to Alibaba’s testing, QwQ outperforms OpenAI’s o1-preview on mathematical and scientific reasoning benchmarks AIME and MATH. The company states …

Read more

LLaVA-o1 brings structured reasoning to visual language processing

Chinese researchers have developed LLaVA-o1, an open-source vision language model that introduces a four-stage reasoning process for analyzing images and text. As reported by Ben Dickson for VentureBeat, the model breaks down complex tasks into summary, caption, reasoning, and conclusion phases. The system, built on Llama-3.2-11B-Vision-Instruct and trained on 100,000 image-question-answer pairs, employs a novel …

Read more

Chinese AI company DeepSeek launches new reasoning model to compete with OpenAI

DeepSeek, a Chinese AI research company backed by hedge fund High-Flyer Capital Management, has released DeepSeek-R1-Lite-Preview, a new AI model designed to rival OpenAI’s o1. The model specializes in reasoning capabilities, allowing it to spend extended time considering questions before providing answers. According to DeepSeek, their model performs comparably to OpenAI’s o1 on established AI …

Read more

New AI math benchmark exposes limitations in advanced reasoning

The FrontierMath benchmark, developed by Epoch AI, presents hundreds of challenging math problems that require deep reasoning and creativity to solve. Despite the growing power of models like GPT-4o and Gemini 1.5 Pro, they solve fewer than 2% of these problems, even with extensive support, according to Epoch AI. The benchmark was created …

Read more

OpenAI and others exploring new strategies to overcome AI improvement slowdown

OpenAI is reportedly developing new strategies to deal with a slowdown in AI model improvements. According to The Information, OpenAI employees testing the company’s next flagship model, code-named Orion, found less improvement compared to the jump from GPT-3 to GPT-4, suggesting the rate of progress is diminishing. In response, OpenAI has formed a foundations team …

Read more

Chain-of-Thought reasoning no panacea for AI shortfalls

The research paper “Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse” investigates the effectiveness of chain-of-thought (CoT) prompting in large language and multimodal models. While CoT has generally improved model performance on various tasks, the authors explore scenarios where it may actually hinder performance, drawing parallels from …

Read more
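To make the technique under study concrete, the contrast between direct prompting and chain-of-thought prompting can be sketched as below. The prompt wording is illustrative only; the paper's actual tasks and prompts differ.

```python
def direct_prompt(question: str) -> str:
    """Plain question-answer prompt with no reasoning scaffold."""
    return f"Q: {question}\nA:"

def cot_prompt(question: str) -> str:
    """Zero-shot chain-of-thought prompt: the trailing cue invites the
    model to produce intermediate reasoning steps before its answer."""
    return f"Q: {question}\nA: Let's think step by step."
```

The paper's point is that the second form, despite usually helping, can hurt on tasks where deliberate step-by-step thinking also degrades human performance.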

Entropix: New AI technique improves reasoning by detecting uncertainty

Researchers at XJDR have developed a new technique called Entropix that aims to improve reasoning in language models by making smarter decisions when the model is uncertain, according to a recent blog post by Thariq Shihipar. The method uses adaptive sampling based on two metrics, entropy and varentropy, which measure the uncertainty in the model’s …

Read more
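The core idea can be sketched in a few lines: compute the entropy and varentropy of the next-token distribution, then pick a decoding action depending on which uncertainty regime the model is in. The thresholds and branch names below are illustrative assumptions, not Entropix's actual heuristics.

```python
import math

def entropy_varentropy(probs):
    """Entropy H = -sum(p * log p) and varentropy = sum(p * (log p + H)^2)
    of a next-token probability distribution."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    v = sum(p * (math.log(p) + h) ** 2 for p in probs if p > 0)
    return h, v

def choose_strategy(probs, h_thresh=1.0, v_thresh=1.0):
    """Map the (entropy, varentropy) regime to a decoding action.
    Thresholds and action names are hypothetical placeholders."""
    h, v = entropy_varentropy(probs)
    if h < h_thresh and v < v_thresh:
        return "argmax"        # low, uniform certainty: take the top token
    if v < v_thresh:
        return "inject_pause"  # uniformly uncertain: e.g. emit a "thinking" token
    return "sample"            # mixed certainty: sample, perhaps at higher temperature
```

For a sharply peaked distribution both metrics are low and greedy decoding suffices; a flat distribution has high entropy but near-zero varentropy, which is the regime where Entropix-style methods intervene rather than guess.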

LLMs don’t reason logically

A new study by six AI researchers at Apple argues that large language models (LLMs) rely on pattern recognition rather than logical reasoning, challenging the common understanding of how these models work. The researchers found that even small changes, such as swapping the names in a problem, can shift a model's results by about 10%. Gary Marcus, …

Read more
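The kind of robustness probe the study describes can be sketched as follows: generate variants of a word problem that differ only in surface names, then measure how consistently a solver answers them. The template and helper names here are hypothetical stand-ins, not the paper's actual materials.

```python
from collections import Counter

# Hypothetical GSM-style template: the arithmetic is identical across
# variants; only the surface name changes.
TEMPLATE = ("{name} picked {a} apples, then picked {b} more. "
            "How many apples did {name} pick in total?")

def make_variants(names, a, b):
    """Render one problem under several different names."""
    return [TEMPLATE.format(name=n, a=a, b=b) for n in names]

def consistency(answers):
    """Fraction of answers that agree with the most common answer."""
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers)
```

A model that reasoned purely over the problem's logic would score a consistency of 1.0 on such variants; the roughly 10% swings the study reports suggest the surface form is leaking into the prediction.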
