The Business & Technology Network
Helping Business Interpret and Use Technology

Llama Nemotron: Nvidia’s answer to the AI reasoning boom

DATE POSTED: March 19, 2025

Nvidia has introduced a new set of open source Llama Nemotron reasoning models during its GTC event, aimed at enhancing agentic AI workloads. These models build upon the Nvidia Nemotron models first announced at the Consumer Electronics Show (CES) in January.

Nvidia unveils open source Llama Nemotron models for advanced AI reasoning

The launch of the Llama Nemotron reasoning models is partially a response to the surge in reasoning models witnessed in 2025. Nvidia’s market position faced challenges earlier this year with the introduction of DeepSeek R1, which promised an open source reasoning model with superior performance.

The Llama Nemotron family is designed to provide competitive business-ready AI reasoning models for advanced agents. “Agents are autonomous software systems designed to reason, plan, act and critique their work,” stated Kari Briski, vice president of Generative AI Software Product Management at Nvidia. She emphasized that agents, like humans, require contextual understanding to break down complex requests, grasp user intent, and adapt in real time.

Key features of Llama Nemotron models

The Llama Nemotron models are based on Meta’s open source Llama models. Nvidia optimized these models by algorithmically pruning them to meet compute requirements while retaining accuracy. The company also utilized advanced post-training techniques with synthetic data, amounting to 360,000 H100 inference hours and 45,000 human annotation hours to improve reasoning capabilities. The training efforts have led to models that excel in benchmarks related to math, tool calling, instruction following, and conversational tasks.

The Llama Nemotron family consists of three distinct models, each targeting different deployment scenarios:

  • Nemotron Nano: Designed for edge and smaller deployments, maintaining high reasoning accuracy.
  • Nemotron Super: Balanced for peak throughput and accuracy on single data center GPUs.
  • Nemotron Ultra: Tailored for maximum “agentic accuracy” in multi-GPU data center environments.

The Nemotron Nano and Super models are currently available as NIM microservices and can be downloaded from AI.NVIDIA.com. The Ultra model is expected to be released soon.

A significant feature of the Llama Nemotron models is the ability to toggle reasoning on or off per request, letting systems skip the cost of extended reasoning for straightforward queries. During a demonstration, Nvidia showed the model performing complex, step-by-step reasoning for a combinatorial problem, then shifting to direct-response mode for basic factual questions.
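In practice, this kind of toggle typically lives in the system prompt rather than in a separate endpoint. The sketch below illustrates the idea against a hypothetical OpenAI-compatible NIM deployment; the "detailed thinking on/off" prompt convention and the model identifier are assumptions for illustration, not details confirmed in this article.

```python
# Sketch: toggling reasoning per request via the system prompt.
# ASSUMPTIONS: an OpenAI-compatible chat endpoint, a "detailed thinking
# on/off" system-prompt toggle, and the model id shown below are all
# illustrative, not taken from the article.

def build_request(user_query: str, reasoning: bool) -> dict:
    """Build a chat-completion payload with reasoning toggled on or off."""
    mode = "on" if reasoning else "off"
    return {
        "model": "nvidia/llama-nemotron-nano",  # hypothetical model id
        "messages": [
            # The toggle is just a system message, so an application can
            # switch modes request-by-request with no redeployment.
            {"role": "system", "content": f"detailed thinking {mode}"},
            {"role": "user", "content": user_query},
        ],
    }

# Combinatorial question: pay for step-by-step reasoning.
hard = build_request("Seat five guests at a round table, no two rivals adjacent.", reasoning=True)
# Simple factual lookup: skip reasoning to save latency and compute.
easy = build_request("What year was Nvidia founded?", reasoning=False)

print(hard["messages"][0]["content"])  # detailed thinking on
print(easy["messages"][0]["content"])  # detailed thinking off
```

A router in front of the model could set the flag automatically, enabling reasoning only when a query classifier deems it necessary.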

Nvidia Agent AI-Q blueprint introduced

Nvidia also unveiled the Agent AI-Q blueprint, an open-source framework designed to integrate AI agents with enterprise systems and data sources. “AI-Q is a new blueprint that enables agents to query multiple data types—text, images, video—and leverage external tools like web search and other agents,” explained Briski. The framework aims to enhance observability and transparency for teams using connected agents, allowing developers to refine the system over time.

The AI-Q blueprint is scheduled to be available in April.

Nvidia’s Llama Nemotron models provide enterprises a chance to deploy reasoning-capable AI within their own infrastructures, addressing data sovereignty and privacy issues commonly associated with cloud-only solutions. This initiative facilitates smoother deployment and management whether on-premises or in the cloud. The hybrid, conditional reasoning option allows organizations to prioritize thoroughness or speed, optimizing latency and compute for simpler tasks while supporting complex reasoning as necessary. As enterprises evolve towards more intricate AI applications, Nvidia’s combination of efficient reasoning models and integration frameworks positions them for deploying sophisticated AI agents capable of multi-step logical problem-solving.

Featured image credit: Nvidia