Zamba2-7B is especially efficient

October 16, 2024 (updated February 5, 2025) by SCR

Zyphra has released Zamba2-7B, a new small language model that, according to the company, outperforms competitors such as Mistral, Google’s Gemma, and Meta’s Llama3 in quality and performance. The Zyphra team says Zamba2-7B is well suited to consumer devices, GPUs, and enterprise applications.

According to Zyphra, the model achieves a 25% faster time to first token, 20% more tokens per second, and lower memory usage than models such as Llama3-8B. Architectural changes from its predecessor, Zamba1-7B, include two shared attention blocks instead of one and LoRA projectors on each shared MLP block. Trained on a dataset of 3 trillion tokens and refined in an “annealing” phase, Zamba2-7B is available as open source under the Apache 2.0 license.
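To make the idea of a shared block with LoRA projectors more concrete, here is a minimal sketch in plain Python. It is not Zyphra's implementation: the class name `SharedLinearWithLoRA`, the helper functions, and the tiny dimensions are all hypothetical, and a single linear layer stands in for a full MLP block. The sketch only illustrates the general technique, in which one weight matrix is stored once and reused at several depths while each use gets its own small low-rank correction.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix; the values are illustrative only."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class SharedLinearWithLoRA:
    """Hypothetical layer: one shared weight matrix plus a LoRA pair per use."""

    def __init__(self, dim, rank, num_uses, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.rank = rank
        self.num_uses = num_uses
        # Shared weights are stored once, no matter how often the layer is reused.
        self.shared = random_matrix(dim, dim, rng)
        # Per-use low-rank factors: A maps dim -> rank, B maps rank -> dim.
        self.lora_a = [random_matrix(rank, dim, rng) for _ in range(num_uses)]
        # B starts at zero (a common LoRA initialization), so every use
        # initially behaves exactly like the plain shared layer.
        self.lora_b = [
            [[0.0] * rank for _ in range(dim)] for _ in range(num_uses)
        ]

    def forward(self, x, use_index):
        """Compute shared(x) + B_i(A_i(x)) for the given use of the layer."""
        base = matvec(self.shared, x)
        low_rank = matvec(self.lora_b[use_index], matvec(self.lora_a[use_index], x))
        return [b + d for b, d in zip(base, low_rank)]

    def parameter_count(self):
        """Shared weights counted once, plus two low-rank factors per use."""
        return self.dim * self.dim + self.num_uses * 2 * self.dim * self.rank


def unshared_parameter_count(dim, num_uses):
    """Parameters needed if every use had its own full weight matrix."""
    return num_uses * dim * dim


layer = SharedLinearWithLoRA(dim=16, rank=2, num_uses=4)
print(layer.parameter_count(), unshared_parameter_count(16, 4))
```

With these toy numbers, the shared layer plus its four LoRA pairs needs 512 parameters, compared with 1,024 for four independent layers. The real model's dimensions, ranks, and block layout are not given in this article, so the ratio here says nothing about Zamba2-7B's actual savings.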

Tags: Open Source, Text, Zyphra

Note: The author name SCR marks content created with the help of AI. Each article is checked and edited before publication. Editorial responsibility: Jan Tissler. Read more about how this website is made and which prompts are used.
