Nvidia releases powerful and open AI model

Nvidia has introduced a new AI model, Llama-3.1-Nemotron-70B-Instruct, which outperforms existing models from OpenAI and others, continuing a significant shift in its AI strategy. The model, available on Hugging Face, achieved impressive benchmark scores, positioning Nvidia as a competitive player in AI language understanding and generation. This development showcases Nvidia’s transition from a GPU manufacturer …

Read more

Is Meta diluting the term Open Source?

Meta is facing criticism for labeling its AI models as “open source.” Stefano Maffulli, head of the Open Source Initiative, accuses the company of diluting the term and confusing users. According to Maffulli, Meta’s Llama models do not meet the criteria for true open-source software, Richard Waters reports for the Financial Times. Meta defends its …

Read more

Endor Labs scores open source AI models

Endor Labs has launched a new platform to score over 900,000 open-source AI models available on Hugging Face, focusing on security, activity, quality, and popularity. This initiative aims to address concerns regarding the trustworthiness and security of AI models, which often have complex dependencies and vulnerabilities, reports VentureBeat. Developers can query the platform about model …

Read more

Zamba2-7B is especially efficient

Zyphra has released Zamba2-7B, a new small language model that it claims outperforms competitors such as Mistral, Google’s Gemma, and Meta’s Llama 3 in quality and performance. According to the Zyphra team, Zamba2-7B is ideal for consumer devices, GPUs, and enterprise applications. It boasts 25% faster time to first token, 20% more tokens per second, and reduced memory usage …

Read more

ARIA is open and natively multimodal

ARIA is an open, natively multimodal mixture-of-experts model designed to integrate diverse forms of information for comprehensive understanding, outperforming existing proprietary models in various tasks. Of its 24.9 billion total parameters, it activates 3.9 billion per visual token and 3.5 billion per text token. The model is pre-trained on a substantial dataset comprising 6.4 trillion …
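As a rough sanity check on those figures (a toy calculation, not ARIA's actual code), the sparse-activation ratio of a mixture-of-experts model is simply active parameters divided by total parameters, which here works out to roughly 14–16%:

```python
# Toy illustration of MoE sparse activation, using the parameter
# counts reported for ARIA (assumed from the blurb above).
total_params = 24.9e9    # total parameters
active_visual = 3.9e9    # parameters active per visual token
active_text = 3.5e9      # parameters active per text token

# Fraction of the model that does work for each token type
print(f"visual tokens: {active_visual / total_params:.0%} of parameters active")
print(f"text tokens:   {active_text / total_params:.0%} of parameters active")
```

This is the usual appeal of MoE designs: inference cost scales with the active parameters (here under 4 billion), while model capacity scales with the total (nearly 25 billion).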

Read more

Meta shows hardware for AI training

Meta showcased its latest open AI hardware designs at the OCP Global Summit 2024. These include a new AI platform called “Catalina,” cutting-edge rack designs, and advanced network fabrics. According to Dan Rabinovitsj and Omar Baldonado in Meta’s “Engineering at Meta” blog, the company aims to foster collaboration and innovation. Meta is significantly scaling its …

Read more

INTELLECT-1 undergoes decentralized training

Decentralized training of a 10-billion-parameter model called INTELLECT-1 has begun. Anyone can contribute computing power and participate. INTELLECT-1 is based on the Llama-3 architecture and is trained on a high-quality open-source data mix built around Hugging Face’s Fineweb-Edu. The mix contains over six trillion tokens and consists of Fineweb-Edu (55%), DCLM (20%), Stack v2 …

Read more

Pyramid Flow is a freely available video AI

A new open-source AI model called Pyramid Flow generates high-quality video clips of up to ten seconds in length. It was developed by researchers from Peking University, Beijing University of Posts and Telecommunications, and Kuaishou Technology, as reported by Carl Franzen. Pyramid Flow uses a novel technique where an AI model creates videos in multiple …

Read more

Controversy surrounding Reflection 70B continues

The controversy surrounding the AI language model Reflection 70B continues to spark debate. Sahil Chaudhary, co-developer of the model, has now published a post-mortem report. In it, he admits mistakes in the rushed release and explains discrepancies between the originally claimed and actual performance data. According to Chaudhary, a programming error led to inflated results …

Read more

Nvidia surprises with powerful, open AI models

Nvidia has released a powerful open-source AI model that rivals proprietary systems from industry leaders like OpenAI and Google. The model, called NVLM 1.0, demonstrates exceptional performance in vision and language tasks while also enhancing text-only capabilities. Michael Nuñez reports on this development for VentureBeat. The main model, NVLM-D-72B, with 72 billion parameters, can process …

Read more