2026 Trends: Memory for New AI Inference Demand

Last Modified: 2026-06-12
Update Frequency: Not
Format: PDF



In January 2026, NVIDIA introduced the CMX Context Memory Storage Platform, managed by the BlueField‑4 DPU. CMX extends the memory hierarchy between local SSD and shared storage to address the massive KV cache storage demands of the AI inference era. In addition, NVIDIA and Arm have successively launched CPU racks to meet the CPU requirements of agentic AI, creating an incremental market for CPU RAM.

This report provides an in‑depth analysis of: (1) memory demand in AI inference; (2) SSD POD demand driven by KV cache offloading; and (3) CPU RAM demand driven by agentic AI. The goal is to explain why memory capacity needs are expanding in the AI inference era, review current solutions, and outline the future structure of emerging memory demand.

Key Highlights

  • NVIDIA introduced the CMX platform, managed by the BlueField‑4 DPU, to extend the memory hierarchy between local SSD and shared storage for the large KV caches of AI inference.
  • NVIDIA and Arm launched CPU racks to meet agentic AI CPU needs, creating incremental CPU memory demand.
  • The report focuses on memory needs in AI inference, SSD POD demand from KV cache offloading, and CPU RAM demand driven by agentic AI.

Table of Contents

  1. Memory Demand in AI Inference
    • Figure 1: AI Models Average Output Tokens per Question (2023-2026)
    • Figure 2: Example of KV Cache Applications
    • Figure 3: Changes to CPU:GPU Ratio among Agentic AI Applications
  2. SSD POD Demand Driven by KV Cache Offloading
    • Figure 4: Sequence of KV Cache Offloading for NVIDIA’s Dynamo (G1-G4)
  3. CPU Demand Driven by Agentic AI
    • Figure 5: NVIDIA’s Vera CPU Architecture
    • Table 1: CPU Specifications of Various Suppliers (2023-2026)
    • Table 2: Analysis on Hypothetical Shipment Scenario of NVIDIA’s CPUs in 2026
    • Figure 6: Analysis Results on Demand Scenario of NVIDIA’s CPUs in 2026
    • Table 3: Summary of Memory Demand Drivers Introduced by AI Inference
  4. TRI’s View

<Total Pages: 11>

Summary of Memory Demand Drivers Introduced by AI Inference





USD 2,500
