Nvidia and DataStax launch storage-efficient AI retrieval system

Nvidia and DataStax have introduced a new AI technology that reduces data storage requirements for generative AI systems by a factor of 35. As reported by Michael Nuñez for VentureBeat, the Nvidia NeMo Retriever microservices, integrated with DataStax’s AI platform, enable faster and more accurate information retrieval across multiple languages. The technology has already shown impressive results …
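
At its core, a retriever service like this embeds documents and queries into a shared vector space and returns the closest matches to ground a generative model. The minimal sketch below shows only that generic retrieval step; it is not the NeMo Retriever API, and the embedding library and model name are illustrative assumptions.

```python
# Generic embedding-based retrieval sketch. Illustrative only, not the
# Nvidia NeMo Retriever API; library and model name are assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Nvidia and DataStax announced a storage-efficient retrieval stack.",
    "Cohere released a compact model aimed at enterprise RAG workloads.",
    "Sakana AI published a technique for trimming transformer memory use.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # any multilingual embedder works here

# Embed the corpus once and normalize, so a dot product equals cosine similarity.
doc_vecs = model.encode(docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents whose embeddings are closest to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]

print(retrieve("Which companies are working on retrieval for generative AI?"))
```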

Read more

Cohere launches new compact AI language model Command R7B

AI company Cohere has introduced Command R7B, a new compact language model designed for enterprise applications. According to VentureBeat reporter Taryn Plumb, the model supports 23 languages and specializes in retrieval-augmented generation (RAG). Command R7B outperforms similar-sized models from competitors like Google, Meta, and Mistral in mathematics and coding tasks. The model features a 128K …
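
For a sense of how such a model is typically consumed, here is a minimal sketch using Cohere's Python SDK. The model identifier is an assumption; check Cohere's documentation for the exact name of the Command R7B release you have access to.

```python
# Minimal sketch of querying a compact Cohere model via the Python SDK.
# The model identifier below is an assumption, not a confirmed name.
import cohere

co = cohere.ClientV2(api_key="YOUR_COHERE_API_KEY")

response = co.chat(
    model="command-r7b-12-2024",  # assumed identifier for Command R7B
    messages=[
        {"role": "user", "content": "Summarize why compact models matter for enterprise RAG."},
    ],
)

print(response.message.content[0].text)
```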

Read more

Sakana AI develops new memory optimization for language models

Tokyo-based startup Sakana AI has created a breakthrough technique that reduces memory usage in large language models by up to 75%. As reported by Ben Dickson, the system called “universal transformer memory” uses neural attention memory modules (NAMMs) to efficiently manage information processing. These modules analyze the model’s attention layers to determine which information to …
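
NAMMs themselves are small learned networks, but the underlying idea (scoring cached tokens by how much attention they attract and evicting the least useful ones) can be illustrated with a toy heuristic. The sketch below is purely conceptual and is not Sakana's implementation.

```python
# Toy illustration of attention-based memory pruning. Sakana's NAMMs are
# learned neural modules; this sketch swaps them for a simple heuristic
# (mean attention received) purely to show the concept.
import numpy as np

rng = np.random.default_rng(0)

seq_len = 12
# Fake attention weights for one layer/head: rows = queries, cols = keys.
attn = rng.random((seq_len, seq_len))
attn /= attn.sum(axis=1, keepdims=True)  # normalize rows like softmax output

# Score each cached token by the average attention it receives across all queries.
scores = attn.mean(axis=0)

# Keep only the top 25% highest-scoring tokens (a 75% reduction in cache size).
keep = int(seq_len * 0.25)
kept_indices = np.sort(np.argsort(scores)[::-1][:keep])

print("kept token positions:", kept_indices)
```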

Read more

Lambda launches new AI inference service with competitive pricing

Lambda, a San Francisco-based technology company, has introduced a new AI inference API service that promises the lowest costs in the industry. According to VentureBeat reporter Carl Franzen, the service allows enterprises to deploy AI models without managing computing infrastructure. The API supports various advanced models including Meta’s Llama 3.3 and Alibaba’s Qwen 2.5, with …
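
The summary does not show the call pattern, but hosted inference services of this kind are commonly exposed through an OpenAI-compatible interface. The sketch below assumes that pattern; the base URL and model identifier are assumptions to be confirmed against Lambda's documentation.

```python
# Sketch of calling a hosted inference API through the OpenAI-compatible
# client pattern. The base URL and model name are assumptions; confirm the
# actual values in Lambda's API documentation.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LAMBDA_API_KEY",
    base_url="https://api.lambdalabs.com/v1",  # assumed endpoint
)

response = client.chat.completions.create(
    model="llama3.3-70b-instruct",  # assumed identifier for Llama 3.3
    messages=[{"role": "user", "content": "Give one use case for serverless inference."}],
)

print(response.choices[0].message.content)
```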

Read more

Microsoft’s Phi-4 AI model achieves high performance with fewer resources

Microsoft has introduced a new AI model that delivers superior mathematical reasoning capabilities while using significantly less computing power than larger competitors. According to Michael Nuñez’s report in VentureBeat, the 14-billion-parameter Phi-4 model outperforms larger systems like Google’s Gemini Pro 1.5. The model excels particularly in mathematical problem-solving, achieving top scores on standardized math competition …

Read more

ServiceNow releases open-source AI training accelerator

ServiceNow has launched Fast-LLM, an open-source framework that speeds up artificial intelligence model training by 20%. As reported by Sean Michael Kerner for VentureBeat, the technology has already proven successful in training ServiceNow’s StarCoder 2 language model. Fast-LLM introduces two key innovations: “Breadth-First Pipeline Parallelism” for optimized computation ordering and improved memory management that reduces …
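
The article does not detail the scheduler, but one way to picture "breadth-first" versus "depth-first" ordering is by the sequence in which (micro-batch, stage) work items are issued. The toy sketch below is a generic illustration of that contrast, not ServiceNow's Fast-LLM scheduler.

```python
# Toy illustration of two micro-batch orderings in a pipelined forward pass.
# Generic sketch only; this is not ServiceNow's Fast-LLM scheduler.
STAGES = 3         # pipeline stages (model partitions)
MICRO_BATCHES = 4  # micro-batches per training step

def depth_first_order():
    """Issue each micro-batch through all stages before starting the next."""
    return [(mb, stage) for mb in range(MICRO_BATCHES) for stage in range(STAGES)]

def breadth_first_order():
    """Let each stage handle every micro-batch before the pipeline advances."""
    return [(mb, stage) for stage in range(STAGES) for mb in range(MICRO_BATCHES)]

print("depth-first:  ", depth_first_order())
print("breadth-first:", breadth_first_order())
```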

Read more

Meta’s Llama 3.3 70B model runs GPT-4 level AI on high-end laptops

Meta has released Llama 3.3 70B, a new large language model that achieves GPT-4 level performance while running on high-end consumer laptops. Developer Simon Willison documented the breakthrough by testing the model on a 64 GB MacBook Pro M2, where it demonstrated capabilities comparable to much larger models such as Meta’s own Llama 3.1 405B. The model …
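
Fitting a 70B-parameter model into 64 GB of memory implies an aggressively quantized build served by a local runtime. As a hedged sketch, assuming Ollama is installed and ships a quantized Llama 3.3 build (the model tag below is an assumption), a local call might look like this:

```python
# Sketch of running a quantized Llama 3.3 70B locally through Ollama's
# Python client. Assumes Ollama is installed, the model tag exists, and the
# machine has enough unified memory for a quantized 70B model.
import ollama

response = ollama.chat(
    model="llama3.3",  # assumed tag for the quantized 70B build
    messages=[{"role": "user", "content": "Explain quantization in two sentences."}],
)

print(response["message"]["content"])
```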

Read more

AI coding tools show limitations despite productivity gains

A comprehensive analysis finds that artificial intelligence-assisted coding tools, while boosting developer productivity, do not necessarily lead to better software quality. Software engineer Addy Osmani writes that AI tools can help developers complete about 70% of a project quickly but struggle with the crucial final 30% that makes software production-ready. The report notes that experienced …

Read more

New AI architecture STAR reduces model cache size by 90 percent

MIT startup Liquid AI has developed a new AI framework called STAR (Synthesis of Tailored Architectures) that significantly improves upon traditional Transformer models. As reported by Carl Franzen for VentureBeat, the system uses evolutionary algorithms to automatically generate and optimize AI architectures. The STAR framework achieved a 90% reduction in cache size compared to traditional …
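
The summary describes evolutionary search over architectures without giving details. The toy loop below illustrates the general mutate-evaluate-select pattern such a system might use; the genome encoding and fitness function are invented for illustration and are not Liquid AI's.

```python
# Toy evolutionary-search loop over architecture "genomes". A generic
# illustration of the approach STAR is described as using, not Liquid AI's
# actual system; the genome encoding and fitness function are made up.
import random

random.seed(0)

def random_genome():
    """A fake architecture encoding: (num_layers, hidden_size, cache_budget)."""
    return (random.randint(4, 32), random.choice([256, 512, 1024]), random.uniform(0.1, 1.0))

def fitness(genome):
    """Made-up score rewarding a quality proxy while penalizing cache use."""
    layers, hidden, cache = genome
    quality_proxy = layers * 0.5 + hidden / 256
    return quality_proxy - 10 * cache  # smaller cache budgets score higher

def mutate(genome):
    """Randomly perturb one candidate architecture."""
    layers, hidden, cache = genome
    return (max(4, layers + random.choice([-2, 0, 2])),
            random.choice([256, 512, 1024]),
            min(1.0, max(0.05, cache * random.uniform(0.8, 1.2))))

population = [random_genome() for _ in range(20)]
for generation in range(10):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]                              # keep the fittest
    children = [mutate(random.choice(parents)) for _ in range(15)]
    population = parents + children

print("best architecture found:", max(population, key=fitness))
```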

Read more

Hume AI releases voice customization tool for developers

Hume AI has launched Voice Control, a new feature that lets developers create custom AI voices by adjusting vocal characteristics through a slider-based interface. As reported by Carl Franzen for VentureBeat, the tool allows users to modify voices along ten dimensions, including assertiveness, confidence, and enthusiasm, without requiring coding skills. The …

Read more