LLaVA-o1 brings structured reasoning to visual language processing

Chinese researchers have developed LLaVA-o1, an open-source vision language model that introduces a four-stage reasoning process for analyzing images and text. As reported by Ben Dickson for VentureBeat, the model breaks down complex tasks into summary, caption, reasoning, and conclusion phases. The system, built on Llama-3.2-11B-Vision-Instruct and trained on 100,000 image-question-answer pairs, employs a novel …


Lightricks releases open-source AI video generation model

Israeli tech company Lightricks has launched LTX Video (LTXV), a new open-source AI model that generates five-second videos in just four seconds. As Michael Nuñez reports for VentureBeat, the company aims to challenge major tech firms by making its technology freely available. The model runs efficiently on consumer-grade hardware like Nvidia RTX 4090 GPUs while …


New AI system OpenScholar helps scientists process research papers

OpenScholar, a new open-source AI system developed by the Allen Institute for AI and the University of Washington, is transforming how researchers analyze scientific literature. As reported by Michael Nuñez for VentureBeat, the system processes over 45 million open-access academic papers to provide citation-backed answers to complex research questions. The AI combines advanced retrieval systems …


New AI model combines speech recognition with privacy protection

Israeli startup aiOla has released Whisper-NER, an open-source AI model that transcribes audio while automatically masking sensitive information. As reported by Carl Franzen for VentureBeat, the model builds upon OpenAI’s Whisper framework and combines automatic speech recognition with named entity recognition to protect private data during transcription. The tool can identify and obscure sensitive details …


Meta rebuilds company strategy around open-source AI model Llama

Meta has fundamentally transformed its business strategy by focusing on Llama, its open-source artificial intelligence model. According to Sharon Goldman’s detailed report in Fortune, CEO Mark Zuckerberg made the pivotal decision to release Llama 2 as open source in July 2023, despite internal concerns about monetization and security risks. The model has since been downloaded over …


AnyChat unifies access to multiple AI language models

AnyChat, a new development tool, enables seamless integration of multiple large language models (LLMs) through a single interface. Developer Ahsen Khaliq, machine learning growth lead at Gradio, created the platform to allow users to switch between models like ChatGPT, Google’s Gemini, Perplexity, Claude, and Meta’s LLaMA without being restricted to one provider, as reported by …


Microsoft unveils Magentic-One, an open-source framework for managing multi-agent AI systems

Microsoft has released Magentic-One, a new open-source framework that enables a single AI model to manage multiple helper agents working together to complete complex, multi-step tasks in various scenarios. According to a paper by Microsoft researchers, Magentic-One is a generalist agentic system that can “fully realize the long-held vision of agentic systems that can enhance …


OmniGen: First unified model for image generation

Researchers have introduced OmniGen, the first diffusion model capable of unifying various image generation tasks within a single framework. Unlike existing models such as Stable Diffusion, OmniGen does not require additional modules to handle different control conditions, according to authors Shitao Xiao, Yueze Wang, Junjie Zhou, Huaying Yuan, et al. The model can perform text-to-image …


Hugging Face releases compact language models for smartphones and edge devices

Hugging Face has released SmolLM2, a new family of compact language models designed to run on smartphones and edge devices with limited processing power and memory. The models, released under the Apache 2.0 license, come in three sizes up to 1.7B parameters and achieve impressive performance on key benchmarks, outperforming larger models like Meta’s Llama …


Meta makes Llama AI models available for US defense applications

Meta is making its Llama AI models available to U.S. government agencies and contractors working on defense and national security applications. According to a blog post by Meta cited by TechCrunch, the company is partnering with firms like Accenture, Amazon Web Services, and Lockheed Martin to bring Llama to these entities. The move comes after …
