Friday, June 26, 2026
AI Infrastructure · News & Analysis
Chips & Hardware · Report

NVIDIA unveiled the GB200 NVL4 with four Blackwell GPUs, dual Grace CPUs, 1.3TB of coherent memory, and a 5.4kW power envelope.

The single-server module expands dense GPU compute capacity for enterprise AI infrastructure, with a larger shared memory pool and a higher power budget.
Trade press · Slicast · November 19, 2024 · Global · Source: tweaktown.com

NVIDIA has announced the GB200 NVL4, a module that significantly expands on the original GB200 Grace Blackwell Superchip. It places four Blackwell GPUs and two Grace CPUs on a single larger board and is designed as a single-server solution, with a 4-way NVLink domain sharing a 1.3TB pool of coherent memory.

The GB200 NVL4 delivers substantial gains over the previous-generation Hopper GH200 NVL4: 2.2x the simulation performance and 1.8x the training and inference performance. An individual GB200 Grace Blackwell Superchip draws around 2700W, while the larger GB200 NVL4 requires approximately 5400W, in line with its 5.4kW power envelope.
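As a rough cross-check of the power figures, the module can be treated as two superchips' worth of silicon. The sketch below assumes (the report does not state this) that one GB200 Superchip pairs one Grace CPU with two Blackwell GPUs; the function and constant names are illustrative only.

```python
# Figures quoted in the report.
SUPERCHIP_POWER_W = 2700  # one GB200 Grace Blackwell Superchip
NVL4_GPUS = 4
NVL4_CPUS = 2

# Assumption, not stated in the report: superchip = 1 Grace CPU + 2 Blackwell GPUs.
GPUS_PER_SUPERCHIP = 2
CPUS_PER_SUPERCHIP = 1


def superchip_equivalents(gpus: int, cpus: int) -> int:
    """Return how many whole superchips match the given GPU/CPU counts."""
    if gpus % GPUS_PER_SUPERCHIP or cpus % CPUS_PER_SUPERCHIP:
        raise ValueError("counts are not a whole number of superchips")
    by_gpu = gpus // GPUS_PER_SUPERCHIP
    by_cpu = cpus // CPUS_PER_SUPERCHIP
    if by_gpu != by_cpu:
        raise ValueError("GPU and CPU counts imply different superchip totals")
    return by_gpu


def estimate_module_power_w(gpus: int, cpus: int) -> int:
    """Naive linear scaling of the per-superchip power figure."""
    return superchip_equivalents(gpus, cpus) * SUPERCHIP_POWER_W


nvl4_estimate_w = estimate_module_power_w(NVL4_GPUS, NVL4_CPUS)
print(f"Estimated GB200 NVL4 power: {nvl4_estimate_w} W")  # 5400 W
```

This is a linear estimate only; it ignores board-level overheads such as networking and power conversion, so the real figure may differ.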

The 1.3TB of coherent memory is shared across all four Blackwell GPUs over NVLink. NVIDIA notes that its fifth-generation NVLink interconnect enables high-speed communication between the CPUs and GPUs, with up to 1.8TB/s of bidirectional throughput per GPU.

Alongside the GB200 NVL4, NVIDIA extended its Hopper product stack with the H200 NVL, a PCIe-based Hopper card. The H200 NVL can connect up to four GPUs through an NVLink domain, offering 7x the bandwidth of a standard PCIe connection and allowing flexible server configurations optimized for hybrid HPC and AI workloads.
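To put the 7x figure in absolute terms, the short sketch below multiplies it by a PCIe baseline. The report does not say which PCIe generation is the reference point, so the PCIe 5.0 x16 baseline (roughly 64 GB/s per direction, about 128 GB/s bidirectional) is an assumption, as are the names used.

```python
# Assumption: the "standard PCIe solution" is a PCIe 5.0 x16 link,
# nominally ~64 GB/s per direction, ~128 GB/s bidirectional.
PCIE_GEN5_X16_BIDIR_GBPS = 128.0

# Multiplier quoted in the report.
NVLINK_VS_PCIE = 7.0


def implied_nvlink_bandwidth_gbps(pcie_gbps: float, multiplier: float) -> float:
    """Bandwidth implied by scaling a PCIe baseline by the quoted multiplier."""
    return pcie_gbps * multiplier


implied_gbps = implied_nvlink_bandwidth_gbps(PCIE_GEN5_X16_BIDIR_GBPS, NVLINK_VS_PCIE)
print(f"Implied NVLink bandwidth: {implied_gbps:.0f} GB/s")  # 896 GB/s
```

Under that assumption the multiplier implies a little under 900 GB/s per GPU; a different PCIe baseline would change the result proportionally.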

NVIDIA's comparison with the H100 NVL credits the H200 NVL with 1.5x the HBM capacity, 1.7x the LLM inference performance, and 1.3x the HPC performance. According to the report, the card features 114 SMs in total, with 14,592 CUDA cores and 456 Tensor Cores, and up to 3 PFLOPS of FP8 (FP16 accumulated) performance. It is listed with 80GB of HBM2e memory on a 5120-bit memory interface and a TDP of up to 350W.
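The quoted core counts and memory layout can be checked for internal consistency with a few divisions. The sketch below uses only the numbers in the paragraph above, plus one assumption that is not in the report: each HBM stack exposes a 1024-bit interface.

```python
# Figures quoted in the report.
SMS = 114
CUDA_CORES = 14_592
TENSOR_CORES = 456
MEMORY_GB = 80
BUS_WIDTH_BITS = 5120

# Assumption, not from the report: one HBM stack has a 1024-bit interface.
BITS_PER_HBM_STACK = 1024

# divmod returns (quotient, remainder); a zero remainder means the totals
# divide evenly across the SMs, as they should on a real part.
cuda_per_sm, cuda_rem = divmod(CUDA_CORES, SMS)
tensor_per_sm, tensor_rem = divmod(TENSOR_CORES, SMS)

hbm_stacks = BUS_WIDTH_BITS // BITS_PER_HBM_STACK
gb_per_stack = MEMORY_GB / hbm_stacks

print(f"CUDA cores per SM:   {cuda_per_sm} (remainder {cuda_rem})")
print(f"Tensor Cores per SM: {tensor_per_sm} (remainder {tensor_rem})")
print(f"HBM stacks: {hbm_stacks}, capacity per stack: {gb_per_stack} GB")
```

The counts divide evenly (128 CUDA cores and 4 Tensor Cores per SM), and the bus width implies five stacks of 16 GB each, so the quoted figures are at least consistent with one another.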
