GPT-4.5: A different kind of intelligence with high costs and mixed reception

OpenAI’s latest language model, GPT-4.5, has generated significant discussion in the AI community since its release. While it represents OpenAI’s largest and most knowledgeable model to date, its practical value remains contested among experts and users.

A costly advancement

GPT-4.5 comes with a steep price tag: it is approximately 10 to 20 times more expensive than Claude 3.7 Sonnet and 15 to 30 times costlier than GPT-4o. At launch, pricing was set at $75 per million input tokens and $150 per million output tokens. Given this cost, OpenAI has suggested it may not keep the model in its API long-term unless users identify truly novel use cases.
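
To make the launch prices concrete, here is a minimal sketch of what a single request would cost at the quoted rates. The function name and the example workload are hypothetical, chosen only for illustration; the two per-million-token prices are the only figures taken from the reporting above.

```python
from decimal import Decimal

# Launch prices quoted in the article, in US dollars per one million tokens.
INPUT_USD_PER_MILLION = Decimal("75")
OUTPUT_USD_PER_MILLION = Decimal("150")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in USD of one request at GPT-4.5 launch pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_USD_PER_MILLION * input_tokens / TOKENS_PER_MILLION
    output_cost = OUTPUT_USD_PER_MILLION * output_tokens / TOKENS_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: a 10,000-token document summarized in 1,000 tokens.
cost = estimate_cost_usd(10_000, 1_000)
print(f"${cost:.2f}")  # 0.75 (input) + 0.15 (output) = $0.90
```

At these rates the input side dominates for document-heavy workloads: a long document fed in costs more than a short answer coming out, even though output tokens are priced at twice the input rate.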

Distinctive strengths

Despite pricing concerns, GPT-4.5 demonstrates several notable improvements:

  • Enhanced world knowledge and reduced hallucinations
  • Better performance on factual benchmarks like SimpleQA and PersonQA
  • Improved document processing capabilities
  • More natural writing style with better context awareness
  • Superior performance in extracting information from unstructured data

Box, which integrated GPT-4.5 into its Box AI Studio, found it “particularly potent for enterprise use-cases, where accuracy and integrity are mission critical.” Their testing showed GPT-4.5 outperforming the original GPT-4 by about 4 percentage points on enterprise document question-answering tasks.

Mixed reception

Reactions to GPT-4.5 have been polarized. Some experts, like Tyler Cowen, are enthusiastic about the model’s improved aesthetics and writing capability. Others question whether its advantages justify its cost when compared to alternatives like Claude 3.7 Sonnet or GPT-4o.

Andrej Karpathy, AI scientist and OpenAI co-founder, noted that GPT-4.5 represents an improvement in tasks that are “more EQ-related and bottlenecked by world knowledge, creativity, analogy making, general understanding, humor, etc.” However, he also acknowledged that evaluation remains challenging since conventional benchmarks don’t fully capture these improvements.

Not a reasoning model

OpenAI has emphasized that GPT-4.5 is not designed as a reasoning model. Sam Altman, OpenAI CEO, described it as “a different kind of intelligence” that “isn’t a reasoning model and won’t crush benchmarks.” The model was trained through pretraining, supervised fine-tuning, and reinforcement learning from human feedback, but it lacks the specialized training needed for advanced reasoning in domains like mathematics or coding.

This positions GPT-4.5 as complementary to, rather than competitive with, models like o1 and o3 that excel at reasoning tasks.

Sources: VentureBeat, Don’t Worry About the Vase, Interconnects
