OpenAI has announced the release of two new open-weight language models, gpt-oss-120b and gpt-oss-20b. This marks the company’s first open-weight model release in over five years, since GPT-2 in 2019, and signals a significant strategic shift for an organization that has recently focused on proprietary systems like GPT-4o and ChatGPT. The models, their weights, and a new tokenizer are available for download on Hugging Face under the permissive Apache 2.0 license, which allows free commercial use, modification, and redistribution.
In a statement, OpenAI CEO Sam Altman positioned the release as a way to “get AI into the hands of the most people possible” and to build “on an open AI stack created in the United States, based on democratic values.” Company president Greg Brockman described the new models as “complementary” to OpenAI’s paid API services, noting that open-weight models offer a different set of strengths, such as the ability to run locally without an internet connection and behind a company’s firewall for enhanced privacy and security.
Model capabilities and performance
The two new models are designed for different use cases and hardware capabilities.
- gpt-oss-120b is a 117-billion-parameter model that, according to OpenAI, can run efficiently on a single 80 GB GPU. The company claims its performance on core reasoning benchmarks is close to that of its proprietary o4-mini model, and that it even outperforms o4-mini in specific areas like health-related queries (HealthBench) and competition mathematics (AIME).
- gpt-oss-20b is a smaller 21-billion-parameter model designed for consumer hardware and edge devices, requiring only 16 GB of memory. OpenAI states its performance is comparable to its o3-mini model, making it suitable for on-device applications and rapid local development.
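A back-of-the-envelope calculation shows why these memory figures are plausible. The sketch below assumes roughly 4-bit quantization of the weights (in the spirit of the MXFP4 format commonly used for low-precision checkpoints); the exact bits-per-parameter figure is an illustrative assumption, not a number from the announcement.

```python
# Rough weight-memory estimate for the two gpt-oss models.
# BITS_PER_PARAM is an illustrative assumption (~4-bit quantization),
# not an official figure from OpenAI.
BITS_PER_PARAM = 4.25

def weight_footprint_gb(n_params: float, bits_per_param: float = BITS_PER_PARAM) -> float:
    """Approximate weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

big = weight_footprint_gb(117e9)   # ~62 GB -> fits on a single 80 GB GPU
small = weight_footprint_gb(21e9)  # ~11 GB -> fits in 16 GB of memory

print(f"gpt-oss-120b: ~{big:.0f} GB, gpt-oss-20b: ~{small:.0f} GB")
```

At full 16-bit precision the larger model would need well over 200 GB, so aggressive weight quantization is what makes the single-GPU claim credible.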
Both models are text-only and built on a Mixture-of-Experts (MoE) architecture, which reduces computational cost by activating only a fraction of the model’s total parameters for any given task. They support a context length of up to 128,000 tokens and are optimized for reasoning, instruction following, and tool use, such as web browsing or executing Python code. The models use a “Chain-of-Thought” (CoT) reasoning process, outlining their thinking steps before providing a final answer. Developers can also adjust the model’s “reasoning effort” between low, medium, and high settings to balance performance against latency.
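The MoE idea can be illustrated with a toy top-k router: each token is scored against a pool of expert networks, and only the highest-scoring few actually run, so most of the model’s parameters sit idle for any given token. The expert count, top-k value, and dimensions below are arbitrary illustrative choices, not the gpt-oss configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # illustrative pool size, not the gpt-oss expert count
TOP_K = 2         # only this many experts run per token
HIDDEN = 16       # toy hidden dimension

# One tiny linear "expert" per slot, plus a router that scores experts.
experts = [rng.standard_normal((HIDDEN, HIDDEN)) for _ in range(NUM_EXPERTS)]
router_w = rng.standard_normal((HIDDEN, NUM_EXPERTS))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router_w                # score every expert for this token
    top = np.argsort(logits)[-TOP_K:]    # keep only the k best-scoring experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                 # softmax over the chosen experts only
    # Only TOP_K of the NUM_EXPERTS weight matrices are touched per token.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

token = rng.standard_normal(HIDDEN)
out = moe_forward(token)
print(f"active experts per token: {TOP_K}/{NUM_EXPERTS} "
      f"({TOP_K / NUM_EXPERTS:.0%} of expert parameters)")
```

Here only 2 of 8 experts execute per token, so per-token compute scales with the active parameters rather than the total parameter count, which is how a 117-billion-parameter model can stay affordable to run.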
A focus on safety and transparency
OpenAI emphasized the extensive safety measures undertaken before the release, which was reportedly delayed for additional testing. In addition to standard safety training, the company conducted a novel evaluation by intentionally fine-tuning a version of gpt-oss-120b for malicious purposes, simulating how a bad actor might try to misuse it for creating biological or cybersecurity threats. According to a research paper released by the company, these “maliciously fine-tuned” models did not reach what OpenAI defines as high-risk capability levels under its Preparedness Framework. This methodology was reviewed by external experts.
A key aspect of the models’ design is the deliberate lack of direct supervision on their Chain-of-Thought (CoT) process. OpenAI states this approach makes it easier to monitor the model for misbehavior or deception, as the model’s internal reasoning is not artificially polished. However, the company explicitly warns developers not to show these CoT outputs to end-users, as they may contain hallucinations, harmful content, or information the model was instructed to exclude from the final answer.
To further encourage community involvement in safety research, OpenAI has launched a $500,000 Red Teaming Challenge, inviting the public to identify and report potential vulnerabilities in the new models.
Strategic motivations and market context
The release positions OpenAI to compete directly in the rapidly growing open-weight AI market, which has seen strong contenders emerge from companies like Meta (Llama series), Mistral in Europe, and several Chinese firms such as DeepSeek and Alibaba. The choice of the highly permissive Apache 2.0 license is notable, as it contrasts with the more restrictive licenses of some competitors, such as Meta’s Llama license, which requires a separate commercial agreement for companies with over 700 million monthly users. This makes the gpt-oss models particularly attractive for enterprises and developers in highly regulated or privacy-sensitive industries like finance and healthcare, as they can be run entirely on-premise.
Analysts suggest the move is a response to enterprise customers who were already using a mix of OpenAI’s proprietary API and open-source models from other providers. By offering powerful open-weight models of its own, OpenAI can keep more developers within its ecosystem. The release also serves to counter criticism from figures like Elon Musk, who have accused the company of abandoning its original open-source mission.
To ensure broad accessibility, OpenAI has partnered with a wide range of deployment platforms, including Microsoft Azure, AWS, Hugging Face, and Vercel, as well as hardware manufacturers like NVIDIA, AMD, and Groq. Microsoft is also releasing GPU-optimized versions of gpt-oss-20b for local inference on Windows devices.
While the new models are free to use, OpenAI’s business continues to rely on its paid API services and ChatGPT subscriptions. The gpt-oss models offer developers a choice: fully customizable, self-hosted models for specific needs, or OpenAI’s API models for multimodal capabilities, built-in tools, and seamless platform integration.
Sources: OpenAI, Wired, VentureBeat