DeepSeek Janus-Pro image generator challenges established competitors

Chinese AI company DeepSeek has released a new family of AI models called Janus-Pro, with capabilities in both image analysis and creation. The models, ranging from 1 billion to 7 billion parameters, are available for download on the Hugging Face platform under an MIT license, allowing unrestricted commercial use. According to DeepSeek, the largest model …

OpenAI launches Operator, an AI agent that automates web-based tasks

OpenAI has introduced Operator, an AI-powered agent capable of performing web-based tasks autonomously through its own browser interface. The tool, currently available as a research preview to ChatGPT Pro subscribers in the United States, represents the company’s first venture into AI agents that can interact directly with computer interfaces. The system is powered by a …

Hugging Face launches compact AI models for image and text analysis

Hugging Face has released two new AI models designed for processing images, videos, and text on devices with limited resources. As Kyle Wiggers reports for TechCrunch, the models called SmolVLM-256M and SmolVLM-500M require less than 1GB of RAM to operate. The models, containing 256 million and 500 million parameters respectively, can describe images, analyze video …

ByteDance launches UI-TARS, an AI that operates computer systems autonomously

ByteDance has developed UI-TARS, a new AI system that can autonomously control computers and mobile devices to perform complex tasks. According to research published on arXiv and reported by Taryn Plumb for VentureBeat, the system outperforms existing AI models like GPT-4o and Claude across multiple benchmarks. UI-TARS uses both 7B and 72B parameter versions and …

Google launches Gemini 2.0 Flash Thinking for free

Google has released Gemini 2.0 Flash Thinking, a new AI model that can process up to one million tokens of text while showing its reasoning process. According to Michael Nuñez at VentureBeat, the model is available for free through Google AI Studio under the experimental designation “Exp-01-21.” The system achieved a 73.3% score on the …

OpenAI to launch browser control AI assistant Operator

OpenAI is set to release Operator, a new AI tool that can perform tasks in users’ web browsers. According to a report by Thomas Maxwell on Gizmodo, the launch is expected this week. The system will be able to navigate websites and complete actions like searching for flights or composing emails, though users must approve …

Perplexity launches real-time AI search API with two pricing tiers

Perplexity has introduced Sonar, an API service that enables developers to integrate AI-powered search capabilities into their applications. According to Maxwell Zeff’s article on the launch, the service offers two distinct pricing tiers: Sonar and Sonar Pro. The base version provides faster, more affordable searches, while Pro delivers more detailed answers for complex queries with …

Tencent releases AI tool that creates 3D models in seconds

Tencent has launched Hunyuan3D 2.0, an artificial intelligence system that generates detailed 3D models from single images or text descriptions. As Michael Nuñez reports, the system can complete tasks in seconds that typically take artists days or weeks. The technology combines two main components: Hunyuan3D-DiT for basic shapes and Hunyuan3D-Paint for surface details. According to …

Google Gemini assistant expands capabilities with multi-app support

Google has announced significant updates to its AI assistant Gemini, coinciding with Samsung’s Galaxy S25 launch event. The most notable change enables Gemini to execute tasks across multiple applications in a single interaction, while also becoming the default assistant on Samsung’s new flagship phones, replacing Bixby. The enhanced Gemini Live feature now supports the integration …

DeepSeek releases new reasoning models and introduces distilled versions

Chinese AI company DeepSeek has announced the release of its new reasoning-focused language models DeepSeek-R1-Zero and DeepSeek-R1, along with six smaller distilled versions. The main models, built on DeepSeek’s V3 architecture, feature 671 billion total parameters with 37 billion activated parameters and a context length of 128,000 tokens. According to company statements, DeepSeek-R1 achieves performance …
