Developer | Page 6 of 13 | ✦ Smart Content Report

Tech companies develop new AI testing methods as models outgrow existing benchmarks

February 5, 2025November 14, 2024

Leading AI companies are creating new ways to evaluate increasingly sophisticated AI models as current testing methods prove inadequate. According to Cristina Criddle’s report in the Financial Times, companies like OpenAI, Microsoft, Meta, and Anthropic are developing internal benchmarks because their latest AI systems achieve over 90% accuracy on existing public tests. Meta’s generative AI …

OpenAI plans January launch of “Operator” AI agent

July 10, 2025November 14, 2024

OpenAI is preparing to launch a new AI agent called “Operator” that can perform automated tasks like coding and travel booking on behalf of users, according to reporting by Shirin Ghaffary and Rachel Metz for Bloomberg. The company plans to release the tool in January 2024 as both a research preview and through their developer …

Study reveals visual prompt injection vulnerabilities in GPT-4V

February 5, 2025November 14, 2024

A recent study by Lakera’s team demonstrates how GPT-4V can be manipulated through visual prompt injection attacks. As detailed by author Daniel Timbrell in his article, these attacks involve embedding text instructions within images to make AI models ignore their original programming or perform unintended actions. During Lakera’s internal hackathon, researchers successfully tested several methods, …

AI agents enhance traditional RAG systems for better data processing

February 5, 2025November 14, 2024

Retrieval-Augmented Generation (RAG) systems are being improved through the integration of AI agents, according to an analysis by Shubham Sharma for VentureBeat. While traditional RAG systems combine data retrieval with language models to provide contextual answers, they are limited to single knowledge sources. The new “agentic RAG” approach incorporates AI agents that can access multiple …

Box introduces new AI studio and enterprise application builder

February 5, 2025November 13, 2024

Box has unveiled two major AI-focused products: Box AI Studio for creating custom AI agents and Box Apps for building no-code enterprise applications. CEO Aaron Levie announced these tools at the BoxWorks event as part of the company’s expansion from file sharing into intelligent content management, VentureBeat reports. Box AI Studio, built on partnerships with …

Mistral AI launches multilingual content moderation API to tackle harmful content

February 5, 2025November 8, 2024

Mistral AI, a French artificial intelligence startup, has released a new content moderation API capable of detecting harmful content across nine categories in 11 languages. The API, powered by Mistral’s fine-tuned Ministral 8B model, offers both raw text and conversational content analysis, as reported by Michael Nuñez for VentureBeat. This launch positions Mistral to compete …

Microsoft unveils Magentic-One, an open-source framework for managing multi-agent AI systems

February 5, 2025November 8, 2024

Microsoft has released Magentic-One, a new open-source infrastructure that enables a single AI model to manage multiple helper agents working together to complete complex, multi-step tasks in various scenarios. According to a paper by Microsoft researchers, Magentic-One is a generalist agentic system that can “fully realize the long-held vision of agentic systems that can enhance …

Patronus AI launches API to prevent AI hallucinations in real-time

February 5, 2025November 8, 2024

Patronus AI, a San Francisco startup, has launched a self-serve API that detects and prevents AI failures, such as hallucinations and unsafe responses, in real-time. According to CEO Anand Kannappan in an interview with VentureBeat, the platform introduces several innovations, including “judge evaluators” that allow companies to create custom rules in plain English and Lynx, …

Google launches real-time search for Gemini AI

February 5, 2025November 7, 2024

Google has introduced “Grounding with Google Search” for its Gemini AI platform, allowing developers to enhance their AI applications with current information from Google Search. As reported by VentureBeat’s Michael Nuñez, the service launched just hours before OpenAI’s consumer-focused ChatGPT Search. Google’s offering targets developers and costs $35 per 1,000 queries, while OpenAI’s service is …

OpenAI expands Realtime API with new voices and reduces costs for developers

February 5, 2025October 31, 2024

OpenAI has updated its Realtime API, currently in beta, with five new expressive voices for speech-to-speech applications and reduced costs for developers by introducing prompt caching. According to OpenAI’s API documentation cited in an article by VentureBeat, the native speech-to-speech feature enables low latency and nuanced output. The company showcased three of the new voices …