Chinese AI startup DeepSeek released two new language models on December 1, 2025, claiming they match the performance of OpenAI’s GPT-5 and Google’s Gemini-3.0-Pro on multiple benchmarks. The company made both models freely available under the MIT license.
DeepSeek-V3.2 serves as an everyday reasoning assistant, while DeepSeek-V3.2-Speciale focuses on advanced mathematical and coding tasks. The Hangzhou-based company states that the Speciale variant achieved gold-medal performance in four international competitions: the 2025 International Mathematical Olympiad, the International Olympiad in Informatics, the ICPC World Finals, and the China Mathematical Olympiad.
Technical architecture reduces costs
The new models build on DeepSeek Sparse Attention, an efficiency mechanism that reduces the computational cost of processing long documents. According to the company’s technical report, inference over a full 128,000-token context costs approximately $0.70 per million tokens, compared with $2.40 for the previous model version, a reduction of roughly 70 percent.
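As a quick check, the quoted prices do imply the figure DeepSeek cites; a minimal back-of-the-envelope calculation using only the numbers above:

```python
# Back-of-the-envelope check of the reported cost reduction, using the prices
# quoted in DeepSeek's technical report for a 128,000-token context.
old_price = 2.40  # USD per million tokens, previous model version
new_price = 0.70  # USD per million tokens, DeepSeek-V3.2

reduction = (old_price - new_price) / old_price
print(f"reduction: {reduction:.1%}")  # prints "reduction: 70.8%", i.e. roughly 70 percent
```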
Traditional attention mechanisms scale quadratically with input length, which makes long contexts increasingly expensive to process. DeepSeek’s approach uses what the company calls a “lightning indexer” to identify the most relevant portions of the context and ignore the rest. The 685-billion-parameter models support context windows of 128,000 tokens, roughly the length of a 300-page book.
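DeepSeek’s exact indexer design is detailed in its technical report rather than here, but the general pattern of top-k sparse attention can be sketched in a few lines. The toy code below illustrates the idea only and is not the company’s implementation; the `index_scores` array is a stand-in for the learned lightning indexer.

```python
import numpy as np

def sparse_attention(q, K, V, index_scores, k=64):
    """Attend only over the k keys the indexer scores as most relevant.

    q: (d,) query vector; K, V: (n, d) cached keys/values;
    index_scores: (n,) cheap relevance scores for each cached token.
    """
    top = np.argsort(index_scores)[-k:]        # keep the k highest-scoring tokens
    logits = K[top] @ q / np.sqrt(q.shape[0])  # full attention only over that subset
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return weights @ V[top]                    # weighted sum of the selected values

# Toy usage: 10,000 cached tokens, but each query attends to just 64 of them.
rng = np.random.default_rng(0)
n, d = 10_000, 128
K, V = rng.normal(size=(n, d)), rng.normal(size=(n, d))
q = rng.normal(size=d)
index_scores = K @ q  # stand-in for the lightning indexer's learned scores
print(sparse_attention(q, K, V, index_scores).shape)  # (128,)
```

Because full attention is computed over only k tokens, the expensive part of each step scales with k rather than with the entire context, which is where the reported savings come from.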
Benchmark results and competition performance
On the AIME 2025 mathematics competition, DeepSeek-V3.2-Speciale achieved a 96.0 percent pass rate. The company reports this compares to 94.6 percent for GPT-5-High and 95.0 percent for Gemini-3.0-Pro. On the Harvard-MIT Mathematics Tournament, the Speciale variant scored 99.2 percent.
The model scored 35 out of 42 points on the 2025 International Mathematical Olympiad and 492 out of 600 points at the International Olympiad in Informatics. At the ICPC World Finals, it solved 10 of 12 problems. DeepSeek states these results came without internet access or tools during testing.
On coding tasks, DeepSeek-V3.2 resolved 73.1 percent of issues on SWE-bench Verified, compared with 74.9 percent for GPT-5-High, according to the company. The technical report acknowledges that “token efficiency remains a challenge” and that the model “typically requires longer generation trajectories” than some competitors.
Tool integration and training approach
DeepSeek-V3.2 introduces what the company calls “thinking in tool-use,” allowing the model to maintain its reasoning process while executing code, searching the web, and manipulating files. Previous models typically lost their reasoning context when calling external tools.
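The report does not spell out the message format here, but the pattern can be sketched as conversation state in which the reasoning trace persists across tool calls. The field names below (“reasoning”, “tool_call”, and so on) are hypothetical placeholders, not DeepSeek’s API schema.

```python
# Hypothetical sketch of "thinking in tool-use": the model's reasoning trace is
# kept in the conversation state across tool calls instead of being discarded.
# Role and field names here are illustrative, not DeepSeek's actual schema.
messages = [
    {"role": "user", "content": "Find the bug in utils.py and fix it."},
    {"role": "assistant",
     "reasoning": "The traceback points at parse_date(); read the file first.",
     "tool_call": {"name": "read_file", "arguments": {"path": "utils.py"}}},
    {"role": "tool", "name": "read_file", "content": "...file contents..."},
    # The earlier reasoning stays in context, so the next step builds on the
    # existing plan instead of re-deriving it from scratch after the tool call.
    {"role": "assistant",
     "reasoning": "parse_date() uses the wrong format string; patch it.",
     "tool_call": {"name": "apply_patch",
                   "arguments": {"path": "utils.py", "diff": "..."}}},
]
```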
The company built a synthetic data pipeline generating over 1,800 task environments and 85,000 complex instructions to train this capability. Tasks included multi-day trip planning with budget constraints and software bug fixes across eight programming languages.
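DeepSeek has not published the schema of these environments in the material summarized here; the snippet below is a hypothetical illustration of what a single budget-constrained trip-planning task might look like, with every field name invented for the example.

```python
# Hypothetical example of one synthetic task environment; the schema, tool
# names, and values are invented for illustration, not DeepSeek's format.
trip_planning_task = {
    "environment": "travel_booking_sandbox",
    "instruction": "Plan a 5-day trip to Kyoto for two people on a $2,500 budget.",
    "tools": ["search_flights", "search_hotels", "convert_currency", "calendar"],
    "constraints": {"budget_usd": 2500, "days": 5, "travelers": 2},
    "success_criteria": [
        "itinerary covers all five days",
        "total estimated cost stays under budget",
        "every booking references a tool result",
    ],
}
```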
Regulatory challenges and market implications
DeepSeek faces regulatory obstacles in multiple jurisdictions. Berlin’s data protection commissioner declared in June 2025 that DeepSeek’s transfer of German user data to China violates EU rules. Italy ordered the company to block its app in February 2025. U.S. lawmakers have moved to ban the service from government devices.
The company has not disclosed what hardware powered the training of V3.2. Its original V3 model reportedly trained on approximately 2,000 Nvidia H800 chips, hardware now restricted for export to China.
DeepSeek provides the full model weights, training code, and documentation on Hugging Face, along with Python scripts demonstrating how to encode messages in an OpenAI-compatible format. DeepSeek-V3.2-Speciale remains available through an API until December 15, 2025, after which its capabilities will be merged into the standard release.
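The scripts themselves live in the Hugging Face repository; as a general illustration of the OpenAI-compatible pattern, a request might look like the sketch below, where the base URL, model identifier, and API key are placeholder assumptions rather than confirmed values.

```python
# Minimal sketch of an OpenAI-compatible chat request. The base_url, model
# name, and API key are placeholders, not confirmed DeepSeek values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek-v3.2-speciale",  # placeholder model identifier
    messages=[
        {"role": "system", "content": "You are a careful math assistant."},
        {"role": "user", "content": "Is 2^31 - 1 prime? Explain briefly."},
    ],
)
print(response.choices[0].message.content)
```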
Sources: DeepSeek, VentureBeat, Bloomberg