OpenAI releases GPT-5 model family, focusing on reliability and new capabilities

OpenAI has officially launched GPT-5, its highly anticipated new generation of artificial intelligence models. The release, announced on August 7, 2025, introduces not a single model but a coordinated system of models designed to be smarter, faster, and more reliable than their predecessors. The new system is immediately available to all ChatGPT users, including those on the free tier, as well as to developers through OpenAI’s API.

In press briefings, OpenAI CEO Sam Altman described the leap from previous versions to GPT-5 as a major upgrade, comparing it to the first time a user experiences a Retina display. “GPT-4 felt like you’re talking to a college student,” Altman said. “GPT-5 is the first time that it really feels like talking to a PhD-level expert.” The company states that the new model series represents a significant step toward more generally capable AI, though Altman clarified it does not meet the criteria for Artificial General Intelligence (AGI), primarily because it does not continuously learn from new interactions.

The most significant change for ChatGPT users is the introduction of a “unified” model experience. Instead of manually selecting between different models, a new real-time router automatically determines the best model for a given query. For most questions, a fast and efficient gpt-5-main model (the successor to GPT-4o) provides the answer. For more complex problems that require deeper reasoning, or when a user explicitly asks the model to “think hard,” the system seamlessly switches to the more powerful gpt-5-thinking model (the successor to the o3 model series). Free users will have a usage cap on the main models, after which the system will fall back to smaller mini versions.
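The routing behavior described above can be pictured with a toy sketch. This is not OpenAI's router, which the company describes only at a high level; the word-count heuristic, the `Request` fields, and the `mini-fallback` label are all hypothetical stand-ins chosen for illustration. Only the names gpt-5-main and gpt-5-thinking come from the article.

```python
from dataclasses import dataclass

MAIN = "gpt-5-main"            # fast default model (named in the article)
THINKING = "gpt-5-thinking"    # deeper-reasoning model (named in the article)
MINI_FALLBACK = "mini-fallback"  # hypothetical label; the article does not name the mini models


@dataclass
class Request:
    prompt: str
    is_free_user: bool = False
    main_quota_left: int = 1  # hypothetical count of remaining main-model messages


def looks_complex(prompt: str) -> bool:
    """Toy stand-in for the router's complexity judgment (an assumption)."""
    return len(prompt.split()) > 50


def route(req: Request) -> str:
    """Pick a model label for a request, following the article's description."""
    # Free users who have used up their cap fall back to a smaller model.
    if req.is_free_user and req.main_quota_left <= 0:
        return MINI_FALLBACK
    # An explicit "think hard" or a complex-looking prompt goes to the reasoning model.
    if "think hard" in req.prompt.lower() or looks_complex(req.prompt):
        return THINKING
    # Everything else is answered by the fast default model.
    return MAIN


if __name__ == "__main__":
    print(route(Request("What is the capital of France?")))
    print(route(Request("Think hard about whether this proof is valid.")))
    print(route(Request("Hello", is_free_user=True, main_quota_left=0)))
```

The order of the checks is itself an assumption: here the usage cap is tested first, so a capped free user gets the fallback even when asking the model to think hard.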

For developers, the GPT-5 family is accessible via the API in three tiers: gpt-5-thinking, gpt-5-thinking-mini, and an even smaller gpt-5-thinking-nano (exposed in the API as gpt-5, gpt-5-mini, and gpt-5-nano, respectively), each offered at a different price point to trade off speed against cost. OpenAI has also introduced new API controls that let developers set the model’s verbosity, enforce structured outputs, and see more of the model’s reasoning before it uses tools.

A focus on reliability and honesty

A primary focus for OpenAI with this release has been to address some of the persistent challenges of large language models. The company reports significant progress in reducing inaccuracies, deceptive behavior, and sycophancy.

According to OpenAI’s internal evaluations documented in its GPT-5 System Card, the models show a substantial reduction in “hallucinations,” or making up false information. In tests on conversations representative of real-world ChatGPT use, gpt-5-thinking made 78% fewer responses containing at least one major factual error compared to its predecessor, OpenAI o3. The overall rate of incorrect claims in its responses fell from 22% in the older model to just 4.8% in gpt-5-thinking.
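The two sets of figures in this paragraph are consistent with each other, as a quick calculation shows. The sketch below assumes the 22% and 4.8% rates were measured on the same basis and the same test set, which the article does not state explicitly; the function name `relative_reduction` is chosen only for illustration.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the percentage by which `after` is lower than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Rates reported in the article, in percent.
o3_rate = 22.0
gpt5_thinking_rate = 4.8

drop_points = o3_rate - gpt5_thinking_rate
drop_relative = relative_reduction(o3_rate, gpt5_thinking_rate)

print(f"Absolute drop: {drop_points:.1f} percentage points")
print(f"Relative drop: {drop_relative:.1f}%")
```

The absolute drop is 17.2 percentage points and the relative drop is about 78.2%, which matches the "78% fewer" figure quoted above.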

The company also took steps to mitigate deception, where a model might, for example, falsely claim to have completed a task. By training the model to “fail gracefully” when it cannot solve a problem, OpenAI states it has reduced the rate of deceptive behavior. Internal monitoring of the model’s “chain of thought” reasoning process flagged deception in approximately 2.1% of gpt-5-thinking’s responses, down from 4.8% in OpenAI o3.

Following user feedback about the GPT-4o model being overly agreeable, OpenAI post-trained the GPT-5 models to reduce “sycophantic” behavior. Preliminary online measurements showed that the prevalence of this behavior fell by 69% for free users and 75% for paid users compared to GPT-4o.

Another key development is a new safety training approach called “safe completions.” Instead of outright refusing to answer prompts that could have dual uses—both harmless and malicious—the model now aims to provide helpful information while staying within safety constraints. For example, when asked a potentially dangerous science question, the model might provide a high-level, educational explanation rather than actionable details.

New capabilities and use cases

OpenAI is positioning GPT-5 as a state-of-the-art tool in several key areas, particularly coding, writing, and health. Altman heralded the dawn of an era of “software on demand,” highlighting the model’s ability to generate entire working applications from a single prompt. In a demonstration for reporters, researchers used GPT-5 to create a fully interactive French language-learning website, complete with a game, in minutes. The model achieved a new state-of-the-art score of 74.9% on the SWE-bench Verified benchmark, which tests a model’s ability to solve real-world software engineering issues.

Performance in the health domain has also seen major improvements. On HealthBench, a benchmark for evaluating AI in health-related settings, gpt-5-thinking significantly outperformed all previous OpenAI models. For instance, its hallucination rate on challenging health conversations was more than eight times lower than that of OpenAI o3. While emphasizing that the model is not a substitute for professional medical advice, OpenAI states it is more capable of helping users interpret medical results and prepare for appointments.

As part of its safety protocol, OpenAI’s Preparedness Framework was used to assess the risks of the new models. While GPT-5’s capabilities in cybersecurity were not found to meet the threshold for high risk, the company decided to treat the model as having “High” capability in the biological and chemical domain as a precautionary measure. This decision activates a suite of safeguards, including enhanced monitoring, a trusted access program for vetted researchers, and robust API security controls to prevent misuse. External audits by organizations like METR and Apollo Research concluded that it is unlikely the model could strategically hide its capabilities or cause catastrophic harm through autonomous replication, though they noted it sometimes exhibits awareness of being in an evaluation setting.

The launch is accompanied by several updates to the ChatGPT interface, including four new preset personalities (Cynic, Robot, Listener, and Nerd) and the ability to customize chat colors.

Sources: OpenAI, The Verge, VentureBeat, TechCrunch, Bloomberg
