OpenAI launches o3 and o4-mini with enhanced reasoning and visual capabilities

OpenAI has released two new AI models, o3 and o4-mini, designed to advance reasoning and introduce novel features such as “thinking with images.” The models are the latest additions to the company’s o-series and arrive just days after the release of GPT-4.1.

The models’ most distinctive feature is their ability not just to recognize images but to integrate them directly into their reasoning process. According to OpenAI, these models “don’t just see an image — they think with it,” enabling them to analyze diagrams, sketches, and whiteboard photos, even blurry or low-quality ones.
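On the developer side, image inputs of this kind can be supplied through the API. The following is a minimal sketch, assuming the openai Python SDK and the Responses API; the image URL is a placeholder, and exact input formats may differ from what is shown here.

    # Minimal sketch: asking o4-mini to reason over an image via the
    # Responses API. Assumes the openai Python SDK is installed and an
    # OPENAI_API_KEY environment variable is set; the URL is a placeholder.
    from openai import OpenAI

    client = OpenAI()

    response = client.responses.create(
        model="o4-mini",
        input=[{
            "role": "user",
            "content": [
                {"type": "input_text",
                 "text": "What process does this whiteboard sketch describe?"},
                {"type": "input_image",
                 "image_url": "https://example.com/whiteboard.jpg"},
            ],
        }],
    )

    print(response.output_text)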

Both models can independently use all the tools available in ChatGPT, including web browsing, Python code execution, file analysis, and image generation. This marks a departure from previous models that required more direct human guidance for complex multi-step problems.

Performance and applications

OpenAI claims o3 shows particularly strong performance in coding, math, science, and visual tasks, and reports new state-of-the-art results on benchmarks such as Codeforces, SWE-bench, and MMMU. External evaluators found that o3 makes 20 percent fewer major errors than its predecessor, OpenAI o1, on difficult real-world tasks.

The smaller o4-mini model is designed for faster, more cost-efficient reasoning while maintaining strong capabilities across domains. On the AIME 2025 mathematics competition, o4-mini reportedly achieved 92.7 percent accuracy.

Dan Shipper from Every, who tested o3 before its public release, highlighted its versatility: “In just the last week, it flagged every single time I sidestepped conflict in my meeting transcripts, spun up a bite-size ML course it pings me about every morning, found a stroller brand from one blurry photo, coded a new custom AI benchmark, and X-rayed an Annie Dillard classic for writing tricks I’d never noticed before.”

Tool integration and coding capabilities

A key advancement is the models’ ability to chain together multiple tools when solving problems without constant human direction. Greg Brockman, OpenAI’s president, noted that “They actually use these tools in their chain of thought as they’re trying to solve a hard problem. For example, we’ve seen o3 use like 600 tool calls in a row trying to solve a really hard task.”
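From the developer side, this behavior can be approximated through the Responses API, where built-in tools are passed alongside the prompt and the model decides when to invoke them. A minimal sketch, assuming the openai Python SDK and the documented web search tool type (tool availability for specific models may vary):

    # Minimal sketch: letting the model invoke a built-in tool on its own
    # while reasoning. Assumes the openai Python SDK and an OPENAI_API_KEY
    # environment variable; the tool type name may change over time.
    from openai import OpenAI

    client = OpenAI()

    response = client.responses.create(
        model="o3",
        tools=[{"type": "web_search_preview"}],  # model calls this as needed
        input="Find and summarize this week's most cited AI benchmark results.",
    )

    print(response.output_text)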

Alongside these models, OpenAI has introduced Codex CLI, a lightweight coding agent that runs directly in a user’s terminal. The company is supporting this tool with a $1 million initiative for projects that use Codex CLI with OpenAI models.

Safety and availability

OpenAI reports conducting extensive safety testing on the new models, with particular focus on their ability to refuse harmful requests. The company rebuilt its safety training data and developed system-level mitigations to flag dangerous prompts in frontier risk areas.

The models are immediately available to ChatGPT Plus, Pro, and Team users, with Enterprise and Education customers gaining access next week. Free users can try o4-mini by selecting “Think” in the composer before submitting a query. Developers can use both models via OpenAI’s Chat Completions API and Responses API, though some organizations will need to complete verification to access them.
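As a rough illustration, a basic Chat Completions call to o4-mini might look like the sketch below; the reasoning_effort parameter is the one OpenAI documents for its o-series reasoning models, though supported values can vary by model.

    # Minimal sketch: a plain Chat Completions request to o4-mini. Assumes
    # the openai Python SDK and an OPENAI_API_KEY environment variable.
    from openai import OpenAI

    client = OpenAI()

    completion = client.chat.completions.create(
        model="o4-mini",
        reasoning_effort="high",  # o-series models accept low/medium/high
        messages=[
            {"role": "user",
             "content": "Prove that the square root of 2 is irrational."},
        ],
    )

    print(completion.choices[0].message.content)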

According to OpenAI, these releases reflect the direction their models are heading: converging specialized reasoning capabilities with natural conversational abilities and tool use. The company suggests that this approach will lead to future models that support “seamless, natural conversations alongside proactive tool use and advanced problem-solving.”

Sources: OpenAI, Engadget, CNBC, Every, VentureBeat
