OpenAI launches GPT-4.5 model

OpenAI has officially launched GPT-4.5, its newest and largest AI language model to date. The model, previously known internally as “Orion,” is being released as a research preview with claims of enhanced conversational abilities, reduced hallucination rates, and improved emotional intelligence compared to previous models. While OpenAI positions GPT-4.5 as its “largest and best model for chat yet,” the company acknowledges that it is not a frontier model and falls short of its reasoning models on certain benchmarks.

Availability and rollout plan

GPT-4.5 is first available to ChatGPT Pro subscribers, who pay $200 per month for OpenAI’s premium tier. According to OpenAI’s announcement, Plus and Team users will gain access next week, followed by Enterprise and Education users the week after. Developers on all paid API tiers can also access GPT-4.5 through the Chat Completions API, Assistants API, and Batch API.

The model comes with significant cost implications for developers. API pricing is set at $75 per million input tokens (approximately 750,000 words) and $150 per million output tokens. That is 30 times GPT-4o's input price of $2.50 per million tokens and 15 times its output price of $10 per million tokens. OpenAI has acknowledged that the model is “very large and compute-intensive” and stated they are “evaluating whether to continue serving it in the API long-term.”
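To make the price gap concrete, here is a minimal Python sketch that estimates the cost of a single request from the per-million-token prices quoted above. The `PRICES` table, the `estimate_cost` helper, and the example workload of 2,000 input and 500 output tokens are illustrative assumptions, not part of any OpenAI SDK, and the figures reflect launch pricing as reported here.

```python
# Prices in USD per one million tokens, as quoted in this article.
# The dictionary keys are labels for this sketch, not official API model IDs.
PRICES = {
    "gpt-4.5": {"input": 75.00, "output": 150.00},
    "gpt-4o": {"input": 2.50, "output": 10.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 2,000-token prompt and a 500-token reply.
    for model in PRICES:
        cost = estimate_cost(model, input_tokens=2_000, output_tokens=500)
        print(f"{model}: ${cost:.4f} per request")
```

Because the input and output multipliers differ (30x and 15x), the overall ratio depends on how a workload splits between prompt and completion tokens; prompt-heavy workloads see the larger increase.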

Technical approach and capabilities

GPT-4.5 represents OpenAI’s continued investment in scaling unsupervised learning rather than focusing solely on reasoning capabilities. While reasoning models like o1 and o3-mini are designed to think step-by-step before responding, GPT-4.5 focuses on increasing “world model accuracy and intuition” through significantly scaled compute and data, along with architectural improvements.

OpenAI claims this approach has resulted in several improvements:

  • Reduced hallucinations: On OpenAI’s SimpleQA benchmark, GPT-4.5 achieved a hallucination rate of 37.1%, lower than GPT-4o and even the reasoning model o1.
  • Enhanced conversational abilities: Human testers reportedly preferred GPT-4.5 over GPT-4o in comparative evaluations, finding it more natural and attuned to human collaboration.
  • Improved emotional intelligence: OpenAI highlights GPT-4.5’s ability to better understand human intent and respond with greater “EQ,” showing appropriate warmth and intuition in conversations.

The model supports file and image uploads and can use canvas for writing and code work. However, it does not currently support multimodal features such as Voice Mode, video, or screen sharing in ChatGPT.

Performance benchmarks and limitations

According to benchmarks published by OpenAI, GPT-4.5 shows mixed performance compared to other models:

  • It achieves 71.4% on GPQA (science), compared to 53.6% for GPT-4o but below o3-mini’s 79.7%
  • On AIME ’24 (math), it scores 36.7%, better than GPT-4o’s 9.3% but significantly below o3-mini’s 87.3%
  • For MMMLU (multilingual), it reaches 85.1%, slightly above both GPT-4o (81.5%) and o3-mini (81.1%)
  • On coding benchmarks, it shows modest improvements over GPT-4o but falls short of OpenAI’s reasoning models
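The "mixed" picture is easier to see as point differences. The short Python sketch below takes the three sets of scores listed above and computes how far GPT-4.5 sits above or below each comparison model; the `SCORES` table and the `gaps` helper are names made up for this illustration, and the coding benchmarks are left out because no figures are given for them here.

```python
# Benchmark scores in percent, copied from the list above.
SCORES = {
    "GPQA (science)": {"gpt-4.5": 71.4, "gpt-4o": 53.6, "o3-mini": 79.7},
    "AIME '24 (math)": {"gpt-4.5": 36.7, "gpt-4o": 9.3, "o3-mini": 87.3},
    "Multilingual": {"gpt-4.5": 85.1, "gpt-4o": 81.5, "o3-mini": 81.1},
}


def gaps(scores: dict, model: str = "gpt-4.5") -> dict:
    """Return percentage-point differences of `model` versus every other model."""
    return {
        bench: {
            other: round(vals[model] - value, 1)
            for other, value in vals.items()
            if other != model
        }
        for bench, vals in scores.items()
    }


if __name__ == "__main__":
    for bench, diffs in gaps(SCORES).items():
        summary = ", ".join(f"{d:+.1f} vs {m}" for m, d in diffs.items())
        print(f"{bench}: {summary}")
```

A positive number means GPT-4.5 scored higher. On these figures it leads GPT-4o everywhere, while o3-mini leads on science and, by a much wider margin, on math.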

OpenAI acknowledges these limitations, noting in a leaked document that “GPT-4.5 is not a frontier model” and “its performance is below that of o1, o3-mini, and deep research on most preparedness evaluations.” The company emphasizes that “academic benchmarks don’t always reflect real-world usefulness” and suggests GPT-4.5 may excel in areas like writing help, communication, learning, coaching, and brainstorming.

Industry context and future direction

GPT-4.5’s release comes amid intense competition in the AI space, with recent model launches from Anthropic (Claude 3.7 Sonnet) and Chinese company DeepSeek (R1). According to reports, OpenAI is positioning GPT-4.5 as its “last non-chain-of-thought model,” with CEO Sam Altman indicating that future models, including the anticipated GPT-5 expected later this year, will integrate reasoning capabilities.

OpenAI researcher Nick Ryder clarified that the company’s goal is to eventually provide users with a more blended experience where they don’t have to explicitly choose which model to use. “Saying this is the last non-reasoning model really means we’re really striving to be in a future where all users are getting routed to the right model,” Ryder stated.

The release of GPT-4.5 also raises questions about the continued viability of OpenAI’s scaling approach. While the company continues to invest in larger models requiring more compute and data, some experts, including former OpenAI chief scientist Ilya Sutskever, have suggested that “pre-training as we know it will unquestionably end” and that the industry has “achieved peak data.”

Despite these challenges, OpenAI sees GPT-4.5 as an important step toward future models that will combine the strengths of both approaches. “We believe reasoning will be a core capability of future models,” the company states, “and that the two approaches to scaling—pre-training and reasoning—will complement each other.”

Sources: OpenAI, TechCrunch, Wired, The Verge, Engadget