Analysis: OpenAI’s o3 shows better performance at higher computing costs

OpenAI’s latest AI model, o3, has demonstrated significant performance improvements while requiring unprecedented levels of computing power. According to Maxwell Zeff’s report in TechCrunch, the model achieved remarkable benchmark results, including an 88% score on the ARC-AGI test, far surpassing previous AI models.

The o3 model uses test-time scaling, a new approach that applies additional computational resources during inference rather than during training, with per-task costs reportedly exceeding $1,000 in some cases. While the model shows promising capabilities, its high operational costs may restrict its use to institutional users with substantial budgets.
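To illustrate the general idea, and why per-task cost grows with it, here is a minimal, hypothetical Python sketch of test-time scaling via repeated sampling and majority voting. OpenAI has not disclosed o3's actual mechanism; the `toy_model` stand-in, the sample counts, and the `cost_per_call` figure below are assumptions for illustration only.

```python
# Minimal sketch of test-time scaling: spend more inference compute per task
# by sampling many candidate answers and aggregating them.
# NOTE: this is not OpenAI's o3 mechanism (which is undisclosed); toy_model
# and cost_per_call are hypothetical stand-ins used to show why cost scales
# with the number of samples.
import random
from collections import Counter

def toy_model(prompt: str) -> str:
    """Stand-in for one stochastic model inference; a real API call would go here."""
    return random.choice(["42", "42", "42", "41"])  # fake noisy answers

def solve_with_test_time_scaling(prompt: str, n_samples: int, cost_per_call: float):
    """Sample the model n_samples times, return the majority answer,
    its agreement rate, and the total (linear-in-samples) inference cost."""
    answers = [toy_model(prompt) for _ in range(n_samples)]
    best, votes = Counter(answers).most_common(1)[0]
    return best, votes / n_samples, n_samples * cost_per_call

if __name__ == "__main__":
    for n in (1, 8, 64):
        answer, agreement, cost = solve_with_test_time_scaling(
            "hard puzzle", n, cost_per_call=0.50
        )
        print(f"samples={n:3d}  answer={answer}  agreement={agreement:.2f}  cost=${cost:.2f}")
```

The point of the sketch is the cost curve: accuracy and answer agreement can improve with more samples, but inference spend rises linearly (or worse) with the compute applied per task.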

Noam Brown, co-creator of the o-series models, expressed confidence in the continued trajectory of improvement, noting that o3’s announcement came just three months after o1’s. Industry experts, including Anthropic co-founder Jack Clark, suggest this development points to accelerated AI progress in 2025. However, the model’s substantial computing requirements may confine its practical applications to high-stakes decision-making scenarios rather than everyday use.

Despite its impressive performance, experts like François Chollet emphasize that o3 is not AGI and still struggles with simple tasks that humans can easily complete. The development raises important questions about the future of AI scaling and the role of specialized inference chips in making such advanced models more cost-effective.
