ARIA is open and natively multimodal

ARIA is an open, natively multimodal mixture-of-experts model designed to integrate diverse forms of information for comprehensive understanding, outperforming existing proprietary models on various tasks. Of its 24.9 billion total parameters, it activates 3.9 billion per visual token and 3.5 billion per text token. The model is pre-trained on a substantial dataset comprising 6.4 trillion …

Meta shows hardware for AI training

Meta showcased its latest open AI hardware designs at the OCP Global Summit 2024. These include a new AI platform called “Catalina,” cutting-edge rack designs, and advanced network fabrics. According to Dan Rabinovitsj and Omar Baldonado in Meta’s “Engineering at Meta” blog, the company aims to foster collaboration and innovation. Meta is significantly scaling its …

INTELLECT-1 undergoes decentralized training

Decentralized training of a 10-billion-parameter model called INTELLECT-1 has begun. Anyone can contribute computing power and participate. INTELLECT-1 is based on the Llama-3 architecture and is trained on a high-quality open-source dataset called Fineweb-Edu by Hugging Face. The dataset contains over six trillion tokens and consists of Fineweb-Edu (55%), DCLM (20%), Stack v2 …

Pyramid Flow is a freely available video AI

A new open-source AI model called Pyramid Flow generates high-quality video clips of up to ten seconds in length. It was developed by researchers from Peking University, Beijing University of Posts and Telecommunications, and Kuaishou Technology, as reported by Carl Franzen. Pyramid Flow uses a novel technique where an AI model creates videos in multiple …

Controversy surrounding Reflection 70B continues

The controversy surrounding the AI language model Reflection 70B continues to spark debate. Sahil Chaudhary, co-developer of the model, has now published a post-mortem report. In it, he admits mistakes in the rushed release and explains discrepancies between the originally claimed and actual performance data. According to Chaudhary, a programming error led to inflated results …

Nvidia surprises with powerful, open AI models

Nvidia has released a powerful open-source AI model that rivals proprietary systems from industry leaders like OpenAI and Google. The model, called NVLM 1.0, demonstrates exceptional performance in vision and language tasks while also enhancing text-only capabilities. Michael Nuñez reports on this development for VentureBeat. The main model, NVLM-D-72B, with 72 billion parameters, can process …

Open-source alternative to Google’s NotebookLM

A data scientist from Singapore has developed an open-source alternative to Google’s NotebookLM. Gabriel Chua of the GovTech agency created the tool called “Open NotebookLM” in just one afternoon. It converts PDF documents into personalized podcasts using publicly available AI models. The project demonstrates how quickly complex AI applications can be replicated today. However, the …

AI platform Hugging Face lists one million models

AI platform Hugging Face has reached a milestone of one million listed AI models, as reported by Benj Edwards for Ars Technica. Launched in 2016 as a chatbot app, the platform has evolved into an open-source hub for AI models. CEO Clément Delangue attributes the boom to the customization of models for specific …

Meta Llama 3.2 is here

Meta today released the new version of its AI model series, Llama 3.2, which for the first time includes vision models that can process both images and text. The larger versions, with 11 and 90 billion parameters, should be able to compete with closed systems like Claude 3 Haiku in image processing. Also new …

Molmo to improve AI agents

A new open-source AI model called Molmo could help advance the development of AI agents. Developed by the Allen Institute for AI (Ai2), the model can interpret images and communicate via a chat interface. According to Wired’s Will Knight, this enables AI agents to perform tasks such as web browsing or document creation. In some …
