Nvidia unveils Llama Nemotron models to advance AI agents and reasoning capabilities

At the GPU Technology Conference (GTC) 2025, Nvidia announced a new family of AI models called Llama Nemotron designed to enhance reasoning capabilities for autonomous AI agents. These models are based on Meta’s open-source Llama models but have been refined through post-training optimization techniques to improve their performance in complex tasks such as multistep math, coding, and decision-making.

The Llama Nemotron family comes in three different sizes, each optimized for different deployment scenarios:

  • Nemotron Nano: Designed for edge devices and personal computers
  • Nemotron Super: Optimized to run on a single GPU
  • Nemotron Ultra: Built for maximum performance on multiple GPU servers

According to Nvidia, these models are 20% more accurate than the original Llama models while offering five times faster inference speed. A key feature is the ability to toggle reasoning on or off depending on the task, allowing systems to bypass computationally expensive reasoning steps for simple queries.
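Nvidia has indicated this toggle is controlled through the system prompt (the phrase "detailed thinking on/off" has been reported). A minimal sketch of how a client might build a chat request either way, assuming an OpenAI-style message format and that exact control phrase (both are assumptions; check Nvidia's model card for the actual interface):

```python
# Sketch: toggling Llama Nemotron's reasoning mode via the system prompt.
# Assumes the reported "detailed thinking on/off" control phrase and an
# OpenAI-style chat message list -- not confirmed against Nvidia's docs.

def build_messages(user_query: str, reasoning: bool = True) -> list[dict]:
    """Build a chat payload with reasoning toggled on or off."""
    system_prompt = "detailed thinking on" if reasoning else "detailed thinking off"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_query},
    ]

# A simple lookup can skip the expensive reasoning pass...
simple = build_messages("What is the capital of France?", reasoning=False)
# ...while a multistep math problem keeps it enabled.
hard = build_messages("Solve x^2 - 5x + 6 = 0 step by step.", reasoning=True)
```

Routing like this lets an agent pay the reasoning cost only on queries that need it, which is where Nvidia's claimed inference speedup matters most.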

Building blocks for AI agents

Beyond the models themselves, Nvidia introduced several components to support the development of advanced AI agents:

The company unveiled the AI-Q Blueprint, an open-source framework that helps developers connect AI agents to enterprise systems and various data sources. This framework integrates with Nvidia NeMo Retriever, making it easier for AI agents to retrieve multimodal data in different formats.

Nvidia also announced the AI Data Platform, a reference design for storage providers like Dell, HPE, IBM, and NetApp to develop more efficient data platforms for AI workloads. By combining optimized storage resources with Nvidia’s accelerated computing hardware, the platform aims to ensure smooth data flow from databases to models.

New hardware and partnerships

Complementing these software announcements, Nvidia CEO Jensen Huang detailed the company’s hardware roadmap. The Blackwell platform, which Huang claimed delivers 40 times the AI performance of its predecessor Hopper, is now in “full production.” Looking ahead, Huang laid out the roadmap: Blackwell Ultra arrives in late 2025, followed by the next-generation Vera Rubin architecture in 2026.

Nvidia also expanded its partnerships with major technology companies. It’s working with Oracle to bring agentic AI to Oracle Cloud Infrastructure, with Google DeepMind to optimize the Gemma family of AI models, and with General Motors to build autonomous vehicles.

The company is making a significant push into robotics and physical AI with the announcement of Isaac GR00T N1, described as “the world’s first open, fully customizable foundation model for generalized humanoid reasoning and skills.”

These announcements collectively show Nvidia’s strategy to maintain its position in AI computing infrastructure while expanding into new areas where its technology can create value, from data centers to vehicles and robots.

Sources: Silicon Angle, VentureBeat
