DeepSeek releases new reasoning models and introduces distilled versions

Chinese AI company DeepSeek has announced the release of its new reasoning-focused language models DeepSeek-R1-Zero and DeepSeek-R1, along with six smaller distilled versions. The main models, built on DeepSeek’s V3 architecture, feature 671 billion total parameters with 37 billion activated parameters and a context length of 128,000 tokens. According to company statements, DeepSeek-R1 achieves performance …

Alibaba cuts prices on Qwen language model by up to 85 percent

Alibaba Cloud has announced a major price reduction of up to 85 percent on its Qwen-VL large language model, which processes both text and images. According to Ryan Browne from Reuters, this move reflects the intensifying competition in China’s AI market. The price cut follows earlier reductions of up to 97 percent in May. Alibaba’s …

Alibaba releases new visual AI model QVQ for enhanced reasoning capabilities

Alibaba’s Qwen team has released QVQ-72B-Preview, a new experimental visual AI model designed to enhance visual reasoning capabilities. Built upon their Qwen2-VL-72B architecture, the model aims to combine language and vision processing to tackle complex analytical tasks. According to company statements, QVQ achieved a score of 70.3 on the MMMU benchmark, marking an improvement over …

Alibaba releases new AI reasoning model to compete with OpenAI o1

Alibaba has released Qwen with Questions (QwQ), a new artificial intelligence reasoning model designed to compete with OpenAI’s o1 system. The model features 32 billion parameters and can process contexts of up to 32,000 tokens. According to Alibaba’s testing, QwQ outperforms OpenAI’s o1-preview on mathematical and scientific reasoning benchmarks AIME and MATH. The company states …

Alibaba extends Qwen AI model to process one million tokens

Alibaba Cloud has launched an upgraded version of its Qwen2.5-Turbo AI model that can now process contexts of up to one million tokens, equivalent to approximately 1.5 million Chinese characters or 10 full-length novels. The improved model achieves 93.1 points on the RULER long text evaluation benchmark, surpassing GPT-4’s score of 91.6. According to Alibaba, …

Arch-Function accelerates AI agents

Katanemo has introduced Arch-Function, a collection of open-source large language models (LLMs) designed for ultra-fast function-calling tasks essential for agentic applications in enterprises. According to reporting from VentureBeat, these models operate nearly 12 times faster than OpenAI’s GPT-4 and significantly outperform offerings from other competitors, while also providing substantial cost savings. Arch-Function builds on Katanemo’s …

Alibaba unveils several new AI models

Alibaba is stepping up its AI activities with the release of new open-source models and text-to-video technology. The Chinese technology company has now unveiled over 100 new open-source AI models from its Qwen 2.5 family, according to Reuters. The models cover a range of sizes and capabilities, including mathematics and programming. They support over …

Chinese models lead Hugging Face ranking

Hugging Face’s new ranking of the best freely available language models shows that Chinese models currently lead the way. Alibaba’s Qwen models dominate the top spots in the ranking, which is based on more challenging tests than its predecessor. Skills such as knowledge recall, inferring from long texts, complex mathematics, and following instructions are assessed.

This new AI model is particularly obedient

While commercial AI offerings have many guardrails and barriers to protect them from misuse, the open model Liberated-Qwen1.5-72B advertises that it has no such restrictions. Instead, it is specially trained to strictly follow the system prompt. This makes it harder to “jailbreak”. At the same time, you have to decide for yourself which answers and …