GPT-4.5: A different kind of intelligence with high costs and mixed reception

OpenAI’s latest language model, GPT-4.5, has generated significant discussion in the AI community since its release. While it represents OpenAI’s largest and most knowledgeable model to date, its practical value remains contested among experts and users. A costly advancement: GPT-4.5 comes with a steep price tag; it is approximately 10 to 20 times more expensive than Claude …

OpenAI to integrate Sora video generator into ChatGPT

OpenAI plans to incorporate its AI video generation tool Sora directly into ChatGPT, according to company leaders during a Friday Discord session. As reported by Maxwell Zeff, Sora is currently only available through a dedicated web app launched in December. Rohan Sahai, OpenAI’s product lead for Sora, indicated that while the ChatGPT integration is being …

Microsoft brings Copilot app to Mac with new features

Microsoft has launched a native Copilot app for macOS users in the US, UK, and Canada. According to Tom Warren from The Verge, the app provides access to Microsoft’s web-based AI assistant, allowing users to generate images and text or upload images. The Mac version includes dark mode support and can be activated with Command …

These diffusion-based language models run 10 times faster than current LLMs

Inception Labs has unveiled Mercury, a new family of diffusion-based large language models (dLLMs) that can generate text up to 10 times faster than conventional autoregressive LLMs. According to the company, Mercury models can generate over 1,000 tokens per second on NVIDIA H100 GPUs, speeds previously achievable only with specialized hardware. The company’s first publicly …

You.com’s AI research tool processes 400+ sources simultaneously

You.com has unveiled a new AI research tool called Advanced Research & Insights agent (ARI) that can analyze more than 400 sources at once. According to CEO Richard Socher, interviewed by Michael Nuñez for VentureBeat, the tool aims to transform market research by producing comprehensive reports in minutes instead of weeks. ARI features direct source …

ElevenLabs launches Scribe with record 96.7% accuracy for English speech-to-text

ElevenLabs has released Scribe v1, a new speech-to-text model achieving record accuracy rates across 99 languages. According to Carl Franzen of VentureBeat, the model outperforms competitors from Google, OpenAI, and Deepgram with a 96.7% accuracy rate for English. Scribe can distinguish up to 32 different speakers in a single audio file and detect non-verbal elements …

IBM Granite 3.2 introduces conditional reasoning for enterprise AI

IBM has released its Granite 3.2 large language model family featuring a new approach called conditional reasoning. According to Sean Michael Kerner of VentureBeat, this update embeds reasoning capabilities directly into core models rather than creating separate reasoning models. The system allows users to activate reasoning only when needed, improving efficiency for complex tasks. Granite …

Alibaba’s video and image AI model Wan 2.1 now open source

Alibaba Group has made its video and image generation AI model Wan 2.1 publicly available as open source. Reuters reports that four variants of the model are now accessible globally through Alibaba Cloud’s ModelScope and Hugging Face platforms for academic, research, and commercial use. The most powerful variants have up to 14 billion parameters, enabling …

Sesame introduces conversational AI assistant with natural voice presence

Sesame, a startup led by Oculus co-founder Brendan Iribe, has unveiled a new AI voice assistant called Maya that aims to cross “the uncanny valley of conversational voice.” According to a recent article by technology journalist Sean Hollister, Maya offers more natural and engaging conversations than existing voice assistants like Amazon’s Alexa or Google’s …

Hume AI launches Octave, a text-to-speech model with emotional controls

Hume AI has introduced Octave, a new text-to-speech system that can generate emotionally nuanced AI voices for content creation. As reported by Carl Franzen for VentureBeat, this large language model can adjust tone, rhythm, and cadence based on textual context. Users can fine-tune emotions at the sentence level through simple text prompts like “happier” or …
