Alibaba releases new Qwen3.5 models that run on laptops and phones

Alibaba’s Qwen research team has released two new series of open source AI models that run on consumer hardware, from desktop PCs to smartphones. The releases cover a range of sizes, from 0.8 billion to 122 billion parameters, and are available for free download under the Apache 2.0 license on Hugging Face and ModelScope.

The larger Qwen3.5 Medium series includes four models. The flagship, Qwen3.5-35B-A3B, uses a Mixture-of-Experts architecture, which activates only about 3 billion of its 35 billion parameters for each token it processes. According to Alibaba, this allows the model to run on a consumer GPU with 32GB of video memory while handling documents up to one million tokens long. Alibaba also says the model outperforms Anthropic’s Claude Sonnet 4.5 and OpenAI’s GPT-5-mini on third-party benchmarks.
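To put those parameter counts in perspective, here is a rough back-of-the-envelope sketch in Python. It computes the share of parameters active per token and the memory needed just to hold the weights at a few precisions. The bit widths (16, 8, 4) and the helper names are illustrative assumptions, not figures from Alibaba, and the estimate ignores the KV cache, activations, and runtime overhead. Note that in a Mixture-of-Experts model all 35 billion parameters generally still have to be loaded; sparse activation reduces compute per token, not the size of the weights.

```python
# Rough sizing sketch. Parameter counts come from the article; the
# precisions below are assumed for illustration only.
TOTAL_PARAMS = 35e9   # total parameters in Qwen3.5-35B-A3B
ACTIVE_PARAMS = 3e9   # parameters activated per token
GPU_MEMORY_GB = 32    # consumer GPU size cited by Alibaba


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters used for a single token."""
    return active / total


def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"Active per token: {share:.1%} of parameters")
    for bits in (16, 8, 4):  # hypothetical precisions
        gb = weight_memory_gb(TOTAL_PARAMS, bits)
        verdict = "fits" if gb <= GPU_MEMORY_GB else "does not fit"
        print(f"{bits}-bit weights: {gb:.1f} GB ({verdict} in {GPU_MEMORY_GB} GB)")
```

Under these assumptions, only the 4-bit case leaves the weights under 32GB, which suggests the consumer-GPU claim depends on quantization; the article does not say which precision Alibaba had in mind.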

The smaller Qwen3.5 Small series pushes efficiency further. The 9B model reportedly outperforms OpenAI’s open source gpt-oss-120B on several benchmarks, including graduate-level reasoning and multilingual knowledge, despite being more than thirteen times smaller. Developers have confirmed that the models run locally on standard laptops, including Apple M1 MacBook Airs, and that the smallest versions can run inside a web browser.

All of the open source models support both text and image understanding. Alibaba also offers a hosted API version, Qwen3.5-Flash, priced at $0.10 per million input tokens and $0.40 per million output tokens.
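For a sense of what that pricing means in practice, the following minimal Python sketch estimates the cost of a request from its token counts. The two rates are taken from the article; the function name and the example workload are hypothetical, and actual billing may include factors not covered here.

```python
# Cost estimate for Qwen3.5-Flash using the per-million-token rates
# quoted in the article. The workload below is a made-up example.
INPUT_PRICE_PER_MILLION = 0.10   # USD per million input tokens
OUTPUT_PRICE_PER_MILLION = 0.40  # USD per million output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated charge in USD for one request or batch."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical job: a one-million-token document, 250,000 tokens of output.
    cost = estimate_cost_usd(1_000_000, 250_000)
    print(f"Estimated cost: ${cost:.2f}")
```

In this example the input and output each contribute about ten cents, since output tokens cost four times as much as input tokens.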

Sources: VentureBeat, VentureBeat
