OpenAI releases new o1 model and launches premium ChatGPT Pro subscription

OpenAI has officially launched the full version of its o1 reasoning model, moving it out of preview status. The model, previously codenamed Strawberry, offers improved capabilities in coding, mathematics, and image analysis, with the company claiming a 34% reduction in major errors on complex problems compared to its preview version. The technology company is introducing …

Read more

AI development faces scaling challenges but shows alternative paths forward

The artificial intelligence industry is grappling with potential limitations in scaling larger language models (LLMs), according to an analysis by Gary Grossman, EVP of technology practice at Edelman. While recent reports suggest that developing more extensive AI models like GPT-5 may face diminishing returns, industry leaders including OpenAI’s Sam Altman and former Google CEO Eric …

Read more

Alibaba releases new AI reasoning model to compete with OpenAI o1

Alibaba has released Qwen with Questions (QwQ), a new artificial intelligence reasoning model designed to compete with OpenAI’s o1 system. The model features 32 billion parameters and can process contexts of up to 32,000 tokens. According to Alibaba’s testing, QwQ outperforms OpenAI’s o1-preview on mathematical and scientific reasoning benchmarks AIME and MATH. The company states …

Read more

LLaVA-o1 brings structured reasoning to visual language processing

Chinese researchers have developed LLaVA-o1, an open-source vision language model that introduces a four-stage reasoning process for analyzing images and text. As reported by Ben Dickson for VentureBeat, the model breaks down complex tasks into summary, caption, reasoning, and conclusion phases. The system, built on Llama-3.2-11B-Vision-Instruct and trained on 100,000 image-question-answer pairs, employs a novel …

Read more

Chinese AI company DeepSeek launches new reasoning model to compete with OpenAI

DeepSeek, a Chinese AI research company backed by hedge fund High-Flyer Capital Management, has released DeepSeek-R1-Lite-Preview, a new AI model designed to rival OpenAI’s o1. The model specializes in reasoning capabilities, allowing it to spend extended time considering questions before providing answers. According to DeepSeek, their model performs comparably to OpenAI’s o1 on established AI …

Read more

New AI math benchmark exposes limitations in advanced reasoning

The FrontierMath benchmark, developed by Epoch AI, presents hundreds of challenging math problems that require deep reasoning and creativity to solve. Despite the growing power of models like GPT-4o and Gemini 1.5 Pro, they solve fewer than 2% of these problems, even with extensive support, according to Epoch AI. The benchmark was created …

Read more

OpenAI and others exploring new strategies to overcome AI improvement slowdown

OpenAI is reportedly developing new strategies to deal with a slowdown in AI model improvements. According to The Information, OpenAI employees testing the company’s next flagship model, code-named Orion, found less improvement compared to the jump from GPT-3 to GPT-4, suggesting the rate of progress is diminishing. In response, OpenAI has formed a foundations team …

Read more

Chain-of-Thought reasoning no panacea for AI shortfalls

The research paper “Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse” investigates the effectiveness of chain-of-thought (CoT) prompting in large language and multimodal models. While CoT has generally improved model performance on various tasks, the authors explore scenarios where it may actually hinder performance, drawing parallels from …

Read more
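To make the technique under study concrete, the contrast between direct prompting and chain-of-thought prompting can be sketched as below. The prompt wording is illustrative only; the paper's actual tasks and prompts differ.

```python
def direct_prompt(question: str) -> str:
    """Plain question-answer prompt with no reasoning scaffold."""
    return f"Q: {question}\nA:"

def cot_prompt(question: str) -> str:
    """Zero-shot chain-of-thought prompt: the trailing cue invites the
    model to produce intermediate reasoning steps before its answer."""
    return f"Q: {question}\nA: Let's think step by step."
```

The paper's point is that the second form, despite usually helping, can hurt on tasks where deliberate step-by-step thinking also degrades human performance.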

Entropix: New AI technique improves reasoning by detecting uncertainty

Researchers at XJDR have developed a new technique called Entropix that aims to improve reasoning in language models by making smarter decisions when the model is uncertain, according to a recent blog post by Thariq Shihipar. The method uses adaptive sampling based on two metrics, entropy and varentropy, which measure the uncertainty in the model’s …

Read more
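The core idea can be sketched in a few lines: compute the entropy and varentropy of the next-token distribution, then pick a decoding action depending on which uncertainty regime the model is in. The thresholds and branch names below are illustrative assumptions, not Entropix's actual heuristics.

```python
import math

def entropy_varentropy(probs):
    """Entropy H = -sum(p * log p) and varentropy = sum(p * (log p + H)^2)
    of a next-token probability distribution."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    v = sum(p * (math.log(p) + h) ** 2 for p in probs if p > 0)
    return h, v

def choose_strategy(probs, h_thresh=1.0, v_thresh=1.0):
    """Map the (entropy, varentropy) regime to a decoding action.
    Thresholds and action names are hypothetical placeholders."""
    h, v = entropy_varentropy(probs)
    if h < h_thresh and v < v_thresh:
        return "argmax"        # low, uniform certainty: take the top token
    if v < v_thresh:
        return "inject_pause"  # uniformly uncertain: e.g. emit a "thinking" token
    return "sample"            # mixed certainty: sample, perhaps at higher temperature
```

For a sharply peaked distribution both metrics are low and greedy decoding suffices; a flat distribution has high entropy but near-zero varentropy, which is the regime where Entropix-style methods intervene rather than guess.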

LLMs don’t reason logically

A new study by six AI researchers at Apple argues that large language models (LLMs) rely on pattern recognition rather than logical reasoning, challenging the common understanding of how these models work. The researchers found that even small changes, such as swapping the names in a problem, can shift a model's results by about 10%. Gary Marcus, …

Read more
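The kind of robustness probe the study describes can be sketched as follows: generate variants of a word problem that differ only in surface names, then measure how consistently a solver answers them. The template and helper names here are hypothetical stand-ins, not the paper's actual materials.

```python
from collections import Counter

# Hypothetical GSM-style template: the arithmetic is identical across
# variants; only the surface name changes.
TEMPLATE = ("{name} picked {a} apples, then picked {b} more. "
            "How many apples did {name} pick in total?")

def make_variants(names, a, b):
    """Render one problem under several different names."""
    return [TEMPLATE.format(name=n, a=a, b=b) for n in names]

def consistency(answers):
    """Fraction of answers that agree with the most common answer."""
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers)
```

A model that reasoned purely over the problem's logic would score a consistency of 1.0 on such variants; the roughly 10% swings the study reports suggest the surface form is leaking into the prediction.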
