Fastino launches CPU-based AI models for enterprise tasks

San Francisco-based startup Fastino has unveiled new task-optimized AI models that run efficiently on standard CPUs without requiring expensive GPUs. As reported by Sean Michael Kerner, the company has secured $7 million in pre-seed funding from investors including Microsoft’s venture fund M12 and Insight Partners. Fastino’s models differ from traditional large language models by … Read more

Companies advised to start small with multimodal RAG implementation

Multimodal retrieval-augmented generation (RAG) systems are gaining traction as tools to process multiple data types, including text, images, and videos. According to Emilia David’s article on VentureBeat, embedding service providers recommend a cautious approach to implementation. Cohere, which recently updated its Embed 3 model, emphasizes the importance of thorough data preparation and initial testing … Read more
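Whatever the modality, the core of a RAG system is the retrieval step: embed the query, then rank stored embeddings by similarity. The following is a minimal, hypothetical sketch in plain NumPy with toy vectors; a real deployment would use an embedding service such as Cohere’s Embed 3 and a vector database instead.

```python
import numpy as np

# Minimal, hypothetical sketch of the retrieval step in a RAG pipeline:
# rank stored document embeddings by cosine similarity to a query embedding.
# (Illustrative only; real systems embed text/images with a service like
# Cohere's Embed 3 and store the vectors in a vector database.)
def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_emb: np.ndarray, doc_embs: list, top_k: int = 2) -> list:
    scores = [cosine_similarity(query_emb, d) for d in doc_embs]
    return sorted(range(len(doc_embs)), key=lambda i: scores[i], reverse=True)[:top_k]

# Toy corpus: three "documents" embedded in a 3-dimensional space.
docs = [np.array([1.0, 0.0, 0.0]),
        np.array([0.9, 0.1, 0.0]),
        np.array([0.0, 0.0, 1.0])]
query = np.array([1.0, 0.05, 0.0])
print(retrieve(query, docs, top_k=2))  # indices of the two closest documents
```

Starting small, as the providers quoted in the article advise, means validating exactly this loop on a limited, well-prepared dataset before scaling up to mixed text, image, and video collections.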

Hugging Face releases compact language models for smartphones and edge devices

Hugging Face has released SmolLM2, a new family of compact language models designed to run on smartphones and edge devices with limited processing power and memory. The models, released under the Apache 2.0 license, come in three sizes up to 1.7B parameters and achieve impressive performance on key benchmarks, outperforming larger models like Meta’s Llama … Read more

SambaNova and Hugging Face simplify AI chatbot deployment

SambaNova and Hugging Face have launched a new integration that allows developers to deploy AI chatbots with a single click, reportedly reducing deployment time from hours to minutes. According to Ahsen Khaliq, ML Growth Lead at Gradio, the process involves obtaining an access token from SambaNova Cloud’s API website and entering three lines of Python … Read more

Speech to text: Moonshine is fast and as accurate as OpenAI’s Whisper

Useful, an AI company focused on improving human-machine communication, has open-sourced Moonshine, a new speech-to-text model that aims to significantly reduce the latency of voice interfaces. According to Useful founder Pete Warden, Moonshine returns results 1.7 times faster than OpenAI’s Whisper model while matching or exceeding its accuracy. The model’s variable-length input window allows it … Read more

Meta releases AI models for mobile devices

Meta Platforms has released quantized versions of its Llama 3.2 1B and 3B models, which the company says reduce memory requirements and speed up on-device inference while preserving accuracy and portability. The models were developed in close collaboration with Qualcomm and MediaTek and are available on SoCs with Arm CPUs. According to Meta, the average model size has … Read more
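The memory savings come from storing weights in fewer bits. As a rough, hypothetical illustration of the general idea (not Meta’s actual pipeline, which relies on more sophisticated quantization-aware methods), symmetric 8-bit weight quantization looks like this:

```python
import numpy as np

# Rough, hypothetical sketch of symmetric 8-bit weight quantization, the
# general technique behind smaller on-device checkpoints. (Meta describes
# using quantization-aware training for its Llama 3.2 models; this toy
# version is simple post-training rounding and is lossy.)
def quantize_int8(weights: np.ndarray):
    scale = float(np.max(np.abs(weights))) / 127.0  # map value range onto [-127, 127]
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
print(w.nbytes // q.nbytes)  # 4: int8 weights take a quarter of the float32 memory
```

Each weight is stored in one byte instead of four, at the cost of a small rounding error bounded by half a quantization step; recovering that accuracy is what the quantization-aware training Meta mentions is for.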

Qualcomm’s Snapdragon 8 Elite brings powerful AI to smartphones

Qualcomm has unveiled the Snapdragon 8 Elite, which the company claims is the world’s fastest mobile CPU. The chipset features Qualcomm’s second-generation Oryon CPU and is said to usher in a new era of on-device AI. As Dean Takahashi reports for VentureBeat, the processor enables complex multimodal AI applications directly on the smartphone, with a … Read more

ComfyUI V1: Create AI images directly on your own computer

The AI image generation software ComfyUI has been released in version 1.0 and now offers a desktop app for Windows, macOS, and Linux. Background: ComfyUI lets you build complex image generation pipelines directly on your own computer. Unlike cloud-based solutions, the software gives you full control over the process and your own data. The brand … Read more

Nvidia releases powerful and open AI model

Nvidia has introduced a new AI model, Llama-3.1-Nemotron-70B-Instruct, which outperforms existing models from OpenAI and others, continuing a significant shift in its AI strategy. The model, available on Hugging Face, achieved impressive benchmark scores, positioning Nvidia as a competitive player in AI language understanding and generation. This development showcases Nvidia’s transition from a GPU manufacturer … Read more

Mistral unveils small models for laptops and smartphones

French AI company Mistral has introduced new generative AI models for laptops and smartphones. Known as “Les Ministraux,” the models are optimized for various applications such as text generation or collaboration with more powerful models. Kyle Wiggers reports for TechCrunch that two variants are available: Ministral 3B and Ministral 8B, both with a context window … Read more