IBM has released a new version of its open-source large language models, Granite 3.1, featuring significant improvements in performance and capabilities. According to reporting by Sean Michael Kerner for VentureBeat, the new models offer extended context length and integrated hallucination detection.
The Granite 8B Instruct model reportedly outperforms similar-sized competitors, including Meta's Llama 3.1 and Google's Gemma 2, on academic benchmarks. The models now support a context length of 128,000 tokens, up from the previous 4,000, enabling them to process much longer documents and conversations. IBM has also built hallucination detection directly into the release through the Granite Guardian 3.1 variants, rather than relying on external guardrails.
David Cox, the company's VP for AI models, emphasizes that the improvements come from better training pipelines and higher-quality data rather than simply increasing data quantity. The release also includes embedding models, with Granite-Embedding-30M-English reportedly processing queries in 0.16 seconds. IBM plans to add multimodal capabilities in early 2025 with Granite 3.2.

The models are available as open source and through IBM's Watsonx enterprise AI service. This release follows Granite 3.0, introduced in October 2024, underscoring IBM's accelerated development pace in enterprise AI.