
In his GTC Taipei 2026 keynote, NVIDIA CEO Jensen Huang laid out a system architecture for AI Agents: a large language model (LLM) handles thinking, reasoning, and planning, while an external orchestration engine (a “harness”) acts like an operating system, connecting the model to tools such as spreadsheets, browsers, and databases and managing both working memory and long-term memory.
AI Agent architecture definition: a large model plus an orchestration engine
In his talk, Jensen Huang broke the core architecture of an AI Agent into two parts. The large language model is the “thinking core,” responsible for reasoning and planning. The external orchestration engine acts as an operating system: it connects the model to various tools and manages both short-term working memory and long-term memory.
He noted that this architecture represents a fundamental shift in computing patterns, not just an upgrade of efficiency tools. During the live demonstration, he said, “We’re using Claude Code here, but Codex also performs excellently.”
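The two-part split described in this section can be illustrated with a minimal Python sketch. It is not NVIDIA's implementation or any product's API: the `Harness` class, the `scripted_model` stand-in for an LLM, and the `spreadsheet` tool are all hypothetical names chosen for illustration. The sketch only shows the division of labor, where the model decides the next step and the harness runs tools and keeps working and long-term memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# An action is ("call", tool_name, argument) or ("finish", answer, "").
Action = Tuple[str, str, str]
# The "model" is any callable that maps (goal, working memory) to an action.
Model = Callable[[str, List[str]], Action]


@dataclass
class Harness:
    """Hypothetical orchestration engine: owns the tools and both memories."""
    model: Model
    tools: Dict[str, Callable[[str], str]]
    long_term: Dict[str, str] = field(default_factory=dict)  # kept across tasks

    def run(self, goal: str, max_steps: int = 5) -> str:
        if goal in self.long_term:  # reuse a result remembered from earlier
            return self.long_term[goal]
        working: List[str] = []  # short-term working memory, fresh per task
        for _ in range(max_steps):
            kind, name, arg = self.model(goal, working)
            if kind == "finish":
                self.long_term[goal] = name  # for "finish", name is the answer
                return name
            result = self.tools[name](arg)  # the harness, not the model, runs tools
            working.append(f"{name}({arg}) -> {result}")
        raise RuntimeError("step budget exhausted")


def scripted_model(goal: str, working: List[str]) -> Action:
    """Stand-in for an LLM: plan one tool call, then report what it observed."""
    if not working:
        return ("call", "spreadsheet", "Q1")
    return ("finish", working[-1], "")


harness = Harness(
    model=scripted_model,
    tools={"spreadsheet": lambda cell: f"value of {cell} is 42"},
)
answer = harness.run("read Q1")
print(answer)
```

In a real system the scripted function would be replaced by a call to an actual LLM, and the long-term store would likely be a database rather than an in-process dictionary; both choices here are simplifying assumptions.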
Three confirmed demonstration cases from GTC Taipei
During the keynote, Jensen Huang demonstrated three AI Agent cases live. First, the Agent generated complete application source code directly from natural-language prompts. Second, given a text description, the Agent immediately generated a dynamic particle animation on the theme “Taipei 101 → GTC Taipei 2026 → NVIDIA logo.” Third, he photographed a remote control that was missing its battery clip, and the Agent automatically called a CAD tool to generate a replacement-part file that could be sent directly to a 3D printer.
Refuting the “software shutdown” narrative: “This is the best time to build software companies”
In response to the market claim that “AI Agents will cause software companies to go bankrupt,” Jensen Huang pushed back directly: “On the contrary.” He argued that because Agents are not limited by human headcount, massive numbers of them will use more software tools than humans do. “This is the best time to build software companies,” he said, provided that software is designed and exposed in a way that Agents can call directly. He added that NVIDIA’s CUDA X libraries are fully open for Agents to use, and that Agents can use them even more efficiently than human developers.
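What “software that Agents can directly call” might look like can be sketched in a few lines of Python. This is an illustrative assumption, not something shown in the keynote and not an interface of CUDA X: the `agent_tool` decorator, the `describe` and `call` helpers, and the `monthly_total` function are hypothetical. The idea is that each function is published with a machine-readable description and can be invoked from a structured request rather than through a graphical interface.

```python
import inspect
import json
from typing import Any, Callable, Dict

REGISTRY: Dict[str, Callable[..., Any]] = {}


def agent_tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Register func under its own name so an agent can discover it."""
    REGISTRY[func.__name__] = func
    return func


def describe(name: str) -> Dict[str, Any]:
    """Build a machine-readable description from the signature and docstring."""
    func = REGISTRY[name]
    params = {
        p.name: p.annotation.__name__
        for p in inspect.signature(func).parameters.values()
    }
    return {"name": name, "doc": inspect.getdoc(func), "params": params}


def call(request_json: str) -> Any:
    """Dispatch a JSON request of the form {"tool": ..., "args": {...}}."""
    request = json.loads(request_json)
    return REGISTRY[request["tool"]](**request["args"])


@agent_tool
def monthly_total(price: float, months: int) -> float:
    """Return the total cost of a subscription over a number of months."""
    return price * months


spec = describe("monthly_total")
result = call('{"tool": "monthly_total", "args": {"price": 9.5, "months": 4}}')
print(spec)
print(result)
```

An agent would read `spec` to learn the tool's name and parameters, then emit a JSON request like the one passed to `call`. Production systems typically add argument validation and error reporting, which this sketch omits.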
FAQ
What is the fundamental difference between the AI Agent defined by Jensen Huang and traditional software applications?
The AI Agent Jensen Huang defined at GTC Taipei 2026 consists of an LLM (reasoning and planning) and an orchestration engine (tool connections and memory management). Traditional software follows a “launch, click, type” pattern, whereas in Agent mode the user describes an intent and the AI generates code, calls tools, and outputs results on its own. The operator shifts from the human to the AI itself.
What does the remote-control battery-clip 3D printing case illustrate?
This live demonstration showcases the Agent’s multi-tool calling capability: the Agent identifies the problem in the photo (missing battery clip), understands the requirement (needing a replacement part), calls a CAD tool to model it, and directly outputs a file ready for 3D printing—completing a full workflow from problem recognition to solution, with no need for step-by-step human intervention.
What does full openness of CUDA X to Agents mean?
Jensen Huang announced that NVIDIA’s CUDA X libraries are now fully open to AI Agents, and that Agents’ usage efficiency exceeds that of human developers. This means NVIDIA’s core AI acceleration infrastructure has officially expanded into the Agent development ecosystem, providing developers with a more efficient foundation for tool calls.