Hugging Face tries to replicate DeepSeek’s R1 as open source

Researchers at Hugging Face have launched a project to create an open-source version of DeepSeek’s R1 AI reasoning model. As reported by Kyle Wiggers for TechCrunch, the initiative, called Open-R1, aims to duplicate all components of the original model, including training data and methods. Led by Hugging Face’s head of research Leandro von Werra, the … Read more

DeepSeek Janus Pro image generator challenges established competitors

Chinese AI company DeepSeek has released a new family of AI models called Janus-Pro, with capabilities in both image analysis and creation. The models, ranging from 1 billion to 7 billion parameters, are available for download on the Hugging Face platform under an MIT license, allowing unrestricted commercial use. According to DeepSeek, the largest model … Read more

Analysis: DeepSeek R1’s breakthrough in cost and performance

DeepSeek, a Chinese AI company, has disrupted the artificial intelligence landscape with its newly released R1 model, which matches the performance of OpenAI’s o1 at approximately 3-5% of the cost. The model, launched on January 20, 2025, has quickly become the most downloaded AI model on Hugging Face with over 109,000 downloads, demonstrating significant developer interest. … Read more

Chinese AI startup DeepSeek challenges industry giants with open-source model

DeepSeek, a Chinese artificial intelligence company, has gained significant attention in the tech industry with the release of its DeepSeek-R1 language model. The model, developed by hedge fund manager Liang Wenfeng’s team, reportedly matches the performance of OpenAI’s leading model while being trained at a fraction of the cost – approximately $5.6 million using 2,048 … Read more

DeepSeek releases new reasoning models and introduces distilled versions

Chinese AI company DeepSeek has announced the release of its new reasoning-focused language models DeepSeek-R1-Zero and DeepSeek-R1, along with six smaller distilled versions. The main models, built on DeepSeek’s V3 architecture, feature 671 billion total parameters with 37 billion activated parameters and a context length of 128,000 tokens. According to company statements, DeepSeek-R1 achieves performance … Read more

Tested: DeepSeek-V3 matches top AI models at lower cost

A detailed analysis published by Sunil Kumar Dash reveals that DeepSeek’s latest AI model achieves performance comparable to leading closed-source models while offering significant cost advantages. The model outperforms existing open-source alternatives in mathematics and reasoning tasks, according to extensive benchmark testing. The analysis demonstrates that DeepSeek-V3 surpasses GPT-4 and Claude 3.5 Sonnet in mathematical … Read more

Open model DeepSeek-V3 performs similarly to closed competition

Chinese AI startup DeepSeek has launched DeepSeek-V3, a powerful new AI model that outperforms existing open-source alternatives. According to reporting by Shubham Sharma at VentureBeat, the model features 671 billion parameters but activates only 37 billion for each task through its mixture-of-experts architecture. The model was trained on 14.8 trillion diverse tokens and demonstrates superior … Read more

Chinese AI company DeepSeek launches new reasoning model to compete with OpenAI

DeepSeek, a Chinese AI research company backed by hedge fund High-Flyer Capital Management, has released DeepSeek-R1-Lite-Preview, a new AI model designed to rival OpenAI’s o1. The model specializes in reasoning capabilities, allowing it to spend extended time considering questions before providing answers. According to DeepSeek, its model performs comparably to OpenAI’s o1 on established AI … Read more