Meta’s Llama 3.3 70B model brings GPT-4 level AI to high-end laptops

Meta has released Llama 3.3 70B, a new large language model that achieves GPT-4 level performance while running on high-end consumer laptops. The breakthrough was documented by developer Simon Willison, who tested the model on a 64 GB MacBook Pro M2 and found capabilities comparable to those of much larger models such as Meta’s own Llama 3.1 405B. The model …
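
The coverage doesn’t spell out why a 70B-parameter model now fits on a 64 GB machine, but the weight-size arithmetic is straightforward. A back-of-envelope sketch (the precision levels are illustrative, not figures from the article):

```python
# Approximate weight memory for a 70B-parameter model at common precisions.
# Illustrative only: real usage also needs KV cache and runtime overhead.
PARAMS = 70e9

def weights_gb(bits_per_param: float) -> float:
    """Total weight size in GB (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_param / 8 / 1e9

for label, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>5}: ~{weights_gb(bits):.0f} GB")
```

At 4 bits per weight the model comes to roughly 35 GB, which leaves a 64 GB machine room for the OS and inference overhead; at fp16 (~140 GB) it simply doesn’t fit.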

Read more

New AI model from Hugging Face promises efficient image processing

Hugging Face has introduced SmolVLM, a new vision-language AI model that processes both images and text while using significantly less computing power than comparable solutions. As reported by Michael Nuñez, the model requires only 5.02 GB of GPU RAM, compared to competitors that need up to 13.70 GB. The system uses advanced compression technology to …
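
Taking the reported figures at face value, the gap is easy to put in relative terms:

```python
# Reported GPU RAM usage (figures from the article).
smolvlm_gb = 5.02
competitor_gb = 13.70

ratio = competitor_gb / smolvlm_gb                 # how many times smaller
savings_pct = (1 - smolvlm_gb / competitor_gb) * 100

print(f"SmolVLM uses ~{ratio:.1f}x less GPU RAM ({savings_pct:.0f}% less)")
```

That works out to roughly 2.7× less memory, a reduction of about 63%.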

Read more

Lightricks releases open-source AI video generation model

Israeli tech company Lightricks has launched LTX Video (LTXV), a new open-source AI model that generates five-second videos in just four seconds. As Michael Nuñez reports for VentureBeat, the company aims to challenge major tech firms by making its technology freely available. The model runs efficiently on consumer-grade hardware like Nvidia RTX 4090 GPUs while …

Read more

Adobe develops AI system for offline document processing on smartphones

Adobe has created SlimLM, an AI system that can analyze and process documents directly on smartphones without requiring internet connectivity. According to Michael Nuñez, the system was successfully tested on Samsung’s Galaxy S24, where it demonstrated capabilities in document analysis, summarization, and complex question answering. The technology operates with a compact model of 125 million …

Read more

Fastino launches CPU-based AI models for enterprise tasks

San Francisco-based startup Fastino has unveiled new task-optimized AI models that run efficiently on standard CPUs without requiring expensive GPUs. As reported by Sean Michael Kerner, the company has secured $7 million in pre-seed funding from investors including Microsoft’s venture fund M12 and Insight Partners. Fastino’s models differ from traditional large language models by …

Read more

Companies advised to start small with multimodal RAG implementation

Multimodal retrieval-augmented generation (RAG) systems are gaining traction as tools to process multiple data types, including text, images, and videos. According to Emilia David’s article on VentureBeat, embedding service providers recommend a cautious approach to implementation. Cohere, which recently updated its Embed 3 model, emphasizes the importance of thorough data preparation and initial testing …
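
The advice above is about process, but the retrieval loop it applies to is small. A minimal, dependency-free sketch of RAG-style retrieval over mixed text and image-caption entries; the bag-of-words `embed` here is a toy stand-in for a real multimodal embedding service such as Cohere’s Embed 3:

```python
import math
from collections import Counter

# Toy corpus: text snippets and image captions share one vector space,
# which is the key property a multimodal embedding model provides.
docs = [
    ("text",  "quarterly revenue grew ten percent"),
    ("image", "bar chart of quarterly revenue by region"),
    ("text",  "the office cafeteria menu for friday"),
]

def embed(s: str) -> Counter:
    # Stand-in embedding: bag of lowercase words. A real system would
    # call a multimodal embedding API here instead.
    return Counter(s.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2):
    q = embed(query)
    scored = sorted(docs, key=lambda d: cosine(q, embed(d[1])), reverse=True)
    return scored[:k]

print(retrieve("revenue chart"))
```

Because text and image entries live in one vector space, a text query can surface the chart as the best match, which is the whole point of going multimodal.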

Read more

Hugging Face releases compact language models for smartphones and edge devices

Hugging Face has released SmolLM2, a new family of compact language models designed to run on smartphones and edge devices with limited processing power and memory. The models, released under the Apache 2.0 license, come in three sizes up to 1.7B parameters and achieve impressive performance on key benchmarks, outperforming larger models like Meta’s Llama …

Read more

SambaNova and Hugging Face simplify AI chatbot deployment

SambaNova and Hugging Face have launched a new integration that allows developers to deploy AI chatbots with a single click, reportedly reducing deployment time from hours to minutes. According to Ahsen Khaliq, ML Growth Lead at Gradio, the process involves obtaining an access token from SambaNova Cloud’s API website and entering three lines of Python …

Read more

Speech-to-text: Moonshine is faster than, and as accurate as, OpenAI’s Whisper

Useful Sensors, an AI company focused on improving human-machine communication, has open-sourced Moonshine, a new speech-to-text model that aims to significantly reduce the latency of voice interfaces. According to founder Pete Warden, Moonshine returns results 1.7 times faster than OpenAI’s Whisper model while matching or exceeding its accuracy. The model’s variable-length input window allows it …
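
Part of the latency win comes from that variable-length window: Whisper’s encoder always processes a fixed 30-second window, padding shorter audio with silence, so short utterances waste most of the compute. The 30-second figure is Whisper’s documented window; the rest below is back-of-envelope:

```python
WHISPER_WINDOW_S = 30.0  # Whisper pads every input up to a fixed 30 s window

def padding_fraction(clip_s: float) -> float:
    """Fraction of the encoder input that is silence padding."""
    return max(0.0, 1.0 - clip_s / WHISPER_WINDOW_S)

for clip in (5, 10, 30):
    print(f"{clip:>2}s clip: {padding_fraction(clip):.0%} padding")
```

For a 5-second utterance, over 80% of Whisper’s encoder input is silence, which is exactly the overhead a variable-length window avoids.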

Read more

Meta releases AI models for mobile devices

Meta Platforms has released quantized versions of its Llama 3.2 1B and 3B models, which the company says offer reduced memory requirements and faster on-device inference while preserving accuracy and portability. The models were developed in close collaboration with Qualcomm and MediaTek and are available on SoCs with Arm CPUs. According to Meta, the average model size has …
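
Meta’s post doesn’t include example code, and its production recipe is more sophisticated than this, but a minimal symmetric int8 quantize/dequantize round trip shows what quantization buys: one byte per weight instead of four, at a small cost in precision (purely illustrative, not Meta’s method):

```python
# Minimal symmetric int8 quantize/dequantize sketch. Illustrative only:
# production schemes (per-channel scales, quantization-aware training)
# are more involved than this.
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [v * scale for v in q]

w = [0.02, -0.51, 0.33, 1.27]
q, scale = quantize_int8(w)          # 1 byte per weight instead of 4
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(w, restored))
print(q, f"max round-trip error {max_err:.4f}")
```

Each weight drops from four bytes to one, and the worst-case round-trip error is bounded by half the scale (here about 0.005).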

Read more