Zamba2-7B is especially efficient
Zyphra has released Zamba2-7B, a new small language model that the company claims outperforms competitors such as Mistral, Google’s Gemma, and Meta’s Llama3 in both quality and performance. According to the Zyphra team, Zamba2-7B is well suited to consumer devices, GPUs, and enterprise applications. Zyphra reports a 25% faster time to first token, 20% more tokens per second, and reduced memory usage …