Ai2 Tulu 3 is an open-source language model rivaling leading systems

The Allen Institute for Artificial Intelligence (Ai2) has released Tulu 3 405B, a new AI language model that, according to the institute’s internal testing, outperforms several leading AI systems, including DeepSeek V3, and matches the capabilities of OpenAI’s GPT-4o on certain benchmarks. The model contains 405 billion parameters and required 256 GPUs running in parallel for …

Mistral Small 3 rivals larger competitors

French startup Mistral AI has announced the release of Mistral Small 3, a 24-billion-parameter language model that the company claims matches the performance of models three times its size. According to Mistral AI, the new model achieves 81% accuracy on standard benchmarks while processing 150 tokens per second, making it comparable to Meta’s Llama 3.3 …

DeepSeek-R1 brings significant cost reduction for enterprise AI

DeepSeek’s new AI reasoning model R1 could substantially reduce the costs of developing AI applications. According to an analysis by Ben Dickson in VentureBeat, DeepSeek-R1 offers similar capabilities to leading models at a fraction of the price. The model costs $2.19 per million output tokens, compared to $60 per million output tokens for OpenAI’s o1. When …

Hugging Face tries to replicate DeepSeek’s R1 as open source

Researchers at Hugging Face have launched a project to create an open-source version of DeepSeek’s R1 AI reasoning model. As reported by Kyle Wiggers for TechCrunch, the initiative called Open-R1 aims to duplicate all components of the original model, including training data and methods. Led by Hugging Face’s head of research Leandro von Werra, the …

DeepSeek Janus-Pro image generator challenges established competitors

Chinese AI company DeepSeek has released a new family of AI models called Janus-Pro, with capabilities in both image analysis and creation. The models, ranging from 1 billion to 7 billion parameters, are available for download on the Hugging Face platform under an MIT license, allowing unrestricted commercial use. According to DeepSeek, the largest model …

Analysis: DeepSeek R1’s breakthrough in cost and performance

DeepSeek, a Chinese AI company, has disrupted the artificial intelligence landscape with its newly released R1 model, which matches the performance of OpenAI’s o1 at approximately 3-5% of the cost. The model, launched on January 20, 2025, has quickly become the most downloaded AI model on Hugging Face with over 109,000 downloads, demonstrating significant developer interest. …

Chinese AI startup DeepSeek challenges industry giants with open-source model

DeepSeek, a Chinese artificial intelligence company, has gained significant attention in the tech industry with the release of its DeepSeek-R1 language model. The model, developed by hedge fund manager Liang Wenfeng’s team, reportedly matches the performance of OpenAI’s leading model while being trained at a fraction of the cost – approximately $5.6 million using 2,048 …

Tencent releases AI tool that creates 3D models in seconds

Tencent has launched Hunyuan3D 2.0, an artificial intelligence system that generates detailed 3D models from single images or text descriptions. As Michael Nuñez reports, the system can complete tasks in seconds that typically take artists days or weeks. The technology combines two main components: Hunyuan3D-DiT for basic shapes and Hunyuan3D-Paint for surface details. According to …

DeepSeek releases new reasoning models and introduces distilled versions

Chinese AI company DeepSeek has announced the release of its new reasoning-focused language models DeepSeek-R1-Zero and DeepSeek-R1, along with six smaller distilled versions. The main models, built on DeepSeek’s V3 architecture, feature 671 billion total parameters with 37 billion activated parameters and a context length of 128,000 tokens. According to company statements, DeepSeek-R1 achieves performance …

MiniMax AI model has record-breaking 4 million token context

Singapore-based AI company MiniMax has launched a new open-source language model that can process up to 4 million tokens at once, doubling the previous record. According to Carl Franzen’s report in VentureBeat, the MiniMax-01 series includes both text and visual capabilities. The model uses an innovative “Lightning Attention” architecture and a mixture-of-experts framework with …
