Wall Street Reviews GTC: By NVIDIA's Definition, Compute Is Revenue and Tokens Are the New Commodity

NVIDIA's annual GTC conference delivered a core message: the business logic of AI compute is being fundamentally restructured. Tokens have become the new commodity, and compute equals revenue.

At this GTC, NVIDIA management sharply raised its visibility into data center sales, from the previous $500 billion (covering through 2026) to over $1 trillion (cumulative for 2025 through 2027), and explicitly stated that sales of the standalone Vera CPU and of LPX rack solutions are counted separately, on top of this figure. Wall Street views the conference as strong support for the durability of NVIDIA's AI cycle.

According to ChaseTrade, the latest JPMorgan report indicates that this figure implies at least $50 billion to $70 billion of upside versus current Wall Street consensus for 2026-2027 data center revenue.

Bank of America quotes NVIDIA management directly ("tokens are the new commodity, compute is revenue") and notes that Blackwell systems have cut cost per token by up to 35x versus the prior Hopper generation, with the upcoming Rubin series expected to cut costs by a further 2x to 35x, depending on workload type and architecture configuration.

Within NVIDIA's narrative framework, this continuous compression of token costs is the fundamental driver of demand expansion.
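As a rough illustration of what that compression means for unit economics, the sketch below applies the multipliers quoted by BofA to an invented baseline; the Hopper price and token volume are hypothetical, not figures from the report:

```python
# Hypothetical illustration of cost-per-token compression.
# The Hopper baseline price and monthly token volume are invented;
# only the 35x Blackwell and 2x-35x Rubin factors come from the
# BofA commentary quoted above.

hopper_cost_per_million_tokens = 10.00   # assumed baseline, USD
tokens_per_month = 1e12                  # assumed serving volume

blackwell_factor = 35                    # per the article
rubin_extra_factors = (2, 35)            # per the article, workload-dependent

blackwell_cost = hopper_cost_per_million_tokens / blackwell_factor
print(f"Blackwell: ${blackwell_cost:.3f} per 1M tokens")

for f in rubin_extra_factors:
    rubin_cost = blackwell_cost / f
    monthly = rubin_cost * tokens_per_month / 1e6
    print(f"Rubin at {f}x further: ${rubin_cost:.4f} per 1M tokens "
          f"(~${monthly:,.0f}/month at the assumed volume)")
```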

Demand visibility doubles, driven by both hyperscale cloud customers and the enterprise market

NVIDIA management disclosed that high-confidence purchase orders for Blackwell and Vera Rubin systems now exceed $1 trillion, double the $500 billion announced at GTC DC in October 2025, and said additional 2027 orders and backlog are expected to keep accumulating over the next six to nine months.

The demand mix is diversified: roughly 60% comes from hyperscale cloud providers (whose internal AI workloads are shifting from recommendation/search to large language models), with the remaining 40% from CUDA-native AI companies, NVIDIA cloud partners, sovereign AI, and industrial/enterprise customers.

BofA notes that the new $1 trillion outlook sits close to Wall Street's existing expectation of roughly $970 billion in data center revenue over the same three-year period, echoing October 2025, when consensus of about $450 billion sat just below the then-new $500 billion outlook.
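The gaps the two banks cite are easy to reproduce from the figures in the text; a quick sanity check:

```python
# Sanity check on the outlook-versus-consensus gaps cited above.
# All figures in billions of USD, taken from the article's text.

outlook_2025_27 = 1000          # new cumulative outlook, per management
consensus_2025_27 = 970         # BofA's three-year consensus figure

outlook_oct_2025 = 500          # prior outlook, through 2026
consensus_oct_2025 = 450        # consensus at that time, per BofA

print(f"Current gap: {outlook_2025_27 - consensus_2025_27}B "
      f"({outlook_2025_27 / consensus_2025_27 - 1:.1%} above consensus)")
print(f"October gap: {outlook_oct_2025 - consensus_oct_2025}B "
      f"({outlook_oct_2025 / consensus_oct_2025 - 1:.1%} above consensus)")
# JPMorgan's $50-70B upside estimate applies to 2026-2027 consensus
# specifically, so it is not directly comparable to this $30B
# three-year headline gap.
```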

It’s noteworthy that NVIDIA management dedicated considerable discussion at this conference to accelerating traditional enterprise workloads.

NVIDIA announced collaborations with IBM (accelerating watsonx), Google Cloud (BigQuery acceleration, with ~76% cost savings cited at Snap), and Dell (AI data platform), and showcased two major CUDA-X libraries, cuDF and cuVS.
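cuDF is the mechanism behind much of this enterprise data-stack acceleration: it exposes a pandas-style DataFrame API that executes on the GPU. A minimal sketch (assumes a CUDA-capable machine with the RAPIDS cudf package installed; the data and column names are invented):

```python
# Minimal cuDF sketch: pandas-style analytics executed on the GPU.
# Requires a CUDA GPU and the RAPIDS `cudf` package; the sample
# data here is invented for illustration.
import cudf

# Build a small DataFrame directly in GPU memory.
df = cudf.DataFrame({
    "region": ["emea", "amer", "amer", "apac", "emea", "apac"],
    "revenue": [120.0, 340.5, 98.2, 210.0, 75.3, 160.8],
})

# Same idioms as pandas, but the groupby/aggregation runs on the GPU.
summary = df.groupby("region")["revenue"].agg(["sum", "mean"])
print(summary)

# Interop with the CPU ecosystem when needed.
pdf = summary.to_pandas()
```

cuVS plays the analogous role for GPU-accelerated vector search.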

JPMorgan believes this direction is "seriously underestimated by the market": with Moore's Law fading, domain-specific acceleration is the only viable path forward, which expands NVIDIA's addressable market beyond the AI training/inference cycle.

Groq LPU integration: the most important new product release at the architecture level

JPMorgan regards the integration of Groq 3 LPU with Vera Rubin as the “most important new product release at the architecture level” at this GTC.

This disaggregated inference architecture pairs the Rubin GPU (high throughput: 288GB HBM4, 22TB/s bandwidth, 50 PFLOPS NVFP4) with the Groq LPU (low-latency decoding: 500MB on-chip SRAM, 150TB/s SRAM bandwidth, 1.2 PFLOPS FP8). Prefill runs on Vera Rubin, attention during decode also stays on Rubin, and the feed-forward networks/token generation are offloaded to the Groq LPU.
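In scheduling terms, the split might look like the toy routing model below. This is not NVIDIA's implementation; the device names and stage breakdown simply mirror the article's description:

```python
# Toy model of the disaggregated prefill/decode routing described
# above. NOT NVIDIA's implementation; devices and stages are
# placeholders that mirror the article's description.
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    strength: str  # what the article says each part is good at

RUBIN = Device("Vera Rubin GPU", "throughput (HBM4 capacity, NVFP4 FLOPS)")
LPU = Device("Groq LPU", "latency (on-chip SRAM bandwidth)")

def route(stage: str) -> Device:
    """Map a transformer inference stage to a device, per the article."""
    if stage == "prefill":            # compute-bound: whole prompt at once
        return RUBIN
    if stage == "decode_attention":   # KV cache lives in large HBM
        return RUBIN
    if stage == "decode_ffn":         # weight-streaming, bandwidth-bound
        return LPU
    raise ValueError(stage)

for stage in ("prefill", "decode_attention", "decode_ffn"):
    d = route(stage)
    print(f"{stage:>16} -> {d.name} ({d.strength})")
```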

The LPX rack integrates 256 LPUs, delivering 128GB of aggregate SRAM, 40PB/s of memory bandwidth, and 315 PFLOPS of inference compute, and is expected to launch in Q3 2026.
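The rack figures follow almost directly from the per-LPU specs quoted earlier; a quick cross-check (the small gaps are presumably rounding or minor spec differences):

```python
# Cross-check: scale the per-LPU specs quoted above to the
# 256-LPU LPX rack and compare with the article's rack totals.

n_lpus = 256
sram_gb_per_lpu = 0.5        # 500MB on-chip SRAM
sram_bw_tbps_per_lpu = 150   # TB/s
pflops_per_lpu = 1.2         # FP8

print(f"SRAM:      {n_lpus * sram_gb_per_lpu:.0f} GB    (article: 128 GB)")
print(f"Bandwidth: {n_lpus * sram_bw_tbps_per_lpu / 1000:.1f} PB/s (article: 40 PB/s)")
print(f"Compute:   {n_lpus * pflops_per_lpu:.0f} PFLOPS (article: 315 PFLOPS)")
# -> 128 GB, 38.4 PB/s, 307 PFLOPS: consistent with the quoted
#    rack totals up to rounding.
```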

NVIDIA management expects workloads that demand ultra-high token speeds (code generation, engineering computation, long-context inference) to drive roughly 25% of data center power toward LPX racks, with the remaining 75% dedicated to pure Vera Rubin NVL72 configurations.

BofA's data shows that "Rubin systems combined with SRAM LPX racks can improve efficiency for high-end, low-latency workloads by up to 35 times compared to the previous generation." JPMorgan points out that this architecture directly addresses a fundamental tension: a single processor cannot be optimized for throughput (FLOPS-limited) and latency (bandwidth-limited) at the same time. Resolving it lets NVIDIA compete effectively in the high-end inference market traditionally dominated by ASIC vendors.
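The bandwidth side of that tension can be made concrete with a back-of-envelope decode calculation. The model size below is an assumption for illustration; the bandwidth figures are the per-chip specs quoted earlier in this article:

```python
# Back-of-envelope: why single-stream decode is bandwidth-bound.
# Generating each new token requires streaming (roughly) all model
# weights through the processor once, so:
#   tokens/s  <=  memory_bandwidth / bytes_of_weights
# The 70B FP8 model is an assumed example; bandwidths are from the
# per-chip specs quoted in this article.

weights_gb = 70          # assumed: 70B-parameter model at FP8 (1 byte/param)

hbm4_bw_tbps = 22        # Rubin GPU HBM4, per the article
sram_bw_tbps = 150       # Groq LPU on-chip SRAM, per the article

for name, bw in (("Rubin HBM4", hbm4_bw_tbps), ("Groq SRAM", sram_bw_tbps)):
    tokens_per_s = bw * 1000 / weights_gb   # TB/s -> GB/s, then / GB
    print(f"{name:>10}: <= {tokens_per_s:,.0f} tokens/s per single stream")
# The SRAM path wins on single-stream latency (assuming the weights
# are sharded across enough LPUs to fit in aggregate SRAM), while
# HBM capacity (288GB vs 500MB per chip) wins on batch throughput
# and long context: exactly the tension the split design exploits.
```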

Copper and CPO advance in parallel, with no bet on a single route

NVIDIA management directly addressed the copper cable versus co-packaged optics (CPO) debate at the conference, confirming that both routes will be pursued simultaneously.

In the current Vera Rubin generation, Oberon racks scale up over copper to NVL72, with optical extension reaching NVL576. The Spectrum-6 SPX co-packaged optics Ethernet switch, jointly developed by NVIDIA and TSMC, is in mass production, with management claiming 5x better power efficiency and 10x higher reliability than traditional pluggable transceivers.

For Rubin Ultra (H2 2027), Kyber racks will use copper NVLink scale-up (up to 144 GPUs), with CPO-based NVLink switching as an alternative. Feynman (2028) will explicitly support both copper and CPO scale-up, paired with Spectrum-7 (204T, CPO) for scale-out.

BofA emphasizes that adopting CPO-based switches for scale-out is optional for customers, who can keep using copper for as long as they see fit. JPMorgan agrees, expecting copper to dominate NVL72/NVL144 scale-up through at least 2027, with CPO gradually gaining share in scale-out and in NVL576+ configurations.

Vera CPU: a new multi-billion-dollar revenue stream targeting AI agents

NVIDIA management stated plainly that the standalone Vera CPU business "has already been confirmed to become a multi-billion-dollar business," and BofA notes this revenue stream is not yet reflected in market consensus, making it an incremental contribution.

The Vera CPU features 88 custom Olympus Arm cores and an LPDDR5X memory subsystem delivering 1.2TB/s of bandwidth at half the power of traditional server CPUs, and it connects to GPUs over NVLink-C2C at 1.8TB/s (about 7x PCIe Gen 6). Vera CPU racks integrate 256 liquid-cooled CPUs and support over 22,500 concurrent CPU environments.
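The 7x claim checks out against a PCIe Gen 6 x16 link; a quick verification (the x16 width is an assumption, since the article does not state the lane count):

```python
# Check the "7x PCIe Gen 6" claim for NVLink-C2C at 1.8TB/s.
# Assumes the comparison is against an x16 link and that both
# figures are aggregate bidirectional bandwidth; the article does
# not state either explicitly.

pcie6_gtps_per_lane = 64                               # PCIe Gen 6: 64 GT/s/lane
lanes = 16                                             # assumed x16 link
pcie6_gbs_per_dir = pcie6_gtps_per_lane * lanes / 8    # ~128 GB/s per direction
pcie6_gbs_bidir = pcie6_gbs_per_dir * 2                # ~256 GB/s bidirectional

nvlink_c2c_gbs = 1800                                  # 1.8 TB/s, per the article

print(f"PCIe Gen6 x16: ~{pcie6_gbs_bidir:.0f} GB/s bidirectional")
print(f"NVLink-C2C:    {nvlink_c2c_gbs} GB/s")
print(f"Ratio: ~{nvlink_c2c_gbs / pcie6_gbs_bidir:.1f}x")  # ~7x, matching the claim
```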

Management emphasized that CPUs are becoming a bottleneck for AI agent scaling: reinforcement learning and agent workflows need large CPU environments to test and verify the outputs of GPU-hosted models. Meta has deployed the previous-generation Grace CPU at scale, with Vera expected to succeed it in 2027.

JPMorgan characterizes this CPU revenue as high-margin, repeatable (deployed alongside GPU racks in AI factories), and structurally tied to the AI agent adoption curve that NVIDIA is actively catalyzing.

Product roadmap extended to 2028, reinforcing the annual architecture cadence

NVIDIA reaffirmed its annual platform release cadence: Blackwell (2024) → Blackwell Ultra (2025) → Rubin (2026) → Rubin Ultra (2027) → Feynman (2028).

Rubin Ultra will feature a 4-die GPU configuration with 1TB of HBM4e and introduce the LP35 LPU (the first LPU with NVFP4 compute), while Kyber racks will support up to 144 GPUs per NVLink domain (7th-generation NVLink at 3.6Tb/s per GPU, 1.5Pb/s aggregate bandwidth for NVL576).

Details of Feynman exceed market expectations:

- The new GPU will use TSMC's A16 (1.6nm) process, with die stacking and custom HBM.
- A new CPU, Rosa (named after Rosalind Franklin), is designed to orchestrate agent workloads across GPUs, LPUs, storage, and networking.
- A new LPU, LP40, developed by NVIDIA's Groq team.
- The platform also includes the BlueField-5 DPU, ConnectX-10 SuperNIC, NVLink 8, and Spectrum-7 (204T, CPO).

JPMorgan believes NVIDIA's vertically integrated platform (spanning seven chips, five rack systems, and the supporting software stack) is difficult to replicate, and that accelerating inference demand, together with the structural expansion of the addressable market from traditional workload acceleration and a broadening customer base, supports a more durable AI capital expenditure cycle than the market currently expects.
