NVIDIA publicly demonstrated its next-generation Vera Rubin Superchip, which pairs a Vera CPU with two Rubin GPUs, with mass production expected around Q3 or Q4 2026.
At GTC in October 2025, NVIDIA CEO Jensen Huang unveiled the next-generation Vera Rubin Superchip, the first public demonstration of an actual sample of the motherboard, which carries the Vera CPU alongside two Rubin GPUs. The motherboard has 32 sites of LPDDR system memory, which complement the HBM4 memory integrated on the Rubin GPUs. According to Huang, the Rubin GPUs are back in the labs and represent the first samples produced at TSMC in Taiwan. Each GPU is surrounded by extensive power circuitry and combines two reticle-sized GPU dies with 8 HBM4 sites, while the Vera CPU contains 88 custom Arm cores with 176 threads.
The NVIDIA Vera Rubin NVL144 platform will use the Rubin GPU, whose two reticle-sized dies offer up to 50 PFLOPS of FP4 performance and 288 GB of HBM4 memory. The GPUs are paired with an 88-core, 176-thread Vera CPU based on a custom Arm architecture, connected over an NVLink-C2C interconnect rated at up to 1.8 TB/s. The platform is expected to enter mass production around Q3 or Q4 2026, with Huang indicating this could occur "around the same time next year or earlier." The timeline comes as NVIDIA's Blackwell Ultra GB300 Superchip platforms continue rolling out at full speed.
The NVL144 platform will deliver 3.6 exaflops of FP4 inference and 1.2 exaflops of FP8 training performance, a 3.3x increase over GB300 NVL72. It also offers 13 TB/s of HBM4 memory bandwidth and 75 TB of fast memory, a 60% uplift over GB300. NVLink and CX9 capabilities both double, to up to 260 TB/s and 28.8 TB/s, respectively.
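The rack-level FP4 figure can be sanity-checked against the per-GPU number quoted earlier. The sketch below assumes, which the article does not state, that the "144" in NVL144 counts GPU dies rather than GPU packages; the variable names are illustrative only.

```python
# Back-of-the-envelope check of the NVL144 FP4 figure.
# ASSUMPTION (not stated in the article): "NVL144" counts GPU dies, not packages.
DIES_PER_RACK = 144
DIES_PER_GPU = 2           # "two reticle-sized" dies per Rubin GPU (from the article)
FP4_PFLOPS_PER_GPU = 50    # "up to 50 PFLOPS of FP4" per Rubin GPU (from the article)

# Number of Rubin GPU packages implied by the die count.
gpu_packages = DIES_PER_RACK // DIES_PER_GPU

# Aggregate FP4 throughput; 1 exaflop = 1000 petaflops.
rack_fp4_exaflops = gpu_packages * FP4_PFLOPS_PER_GPU / 1000

print(f"{gpu_packages} GPU packages -> {rack_fp4_exaflops} exaflops FP4")
```

Under that assumption the result is 72 packages and 3.6 exaflops, matching the platform figure quoted above.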
A second platform, Rubin Ultra, will arrive in the second half of 2027, scaling the NVL system from 144 to 576 units while keeping the same CPU architecture. The Rubin Ultra GPU will feature four reticle-sized dies, delivering up to 100 PFLOPS of FP4 and 1 TB of total HBM4e capacity across 16 HBM sites. The NVL576 platform will reach 15 exaflops of FP4 inference and 5 exaflops of FP8 training performance, a 14x increase over GB300 NVL72. It also offers 4.6 PB/s of HBM4 memory bandwidth and 365 TB of fast memory, an 8x uplift over GB300. NVLink capability rises 12x to up to 1.5 PB/s, and CX9 capability rises 8x to up to 115.2 TB/s.
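Because both platforms are described as multiples of GB300 NVL72, the quoted figures can be cross-checked by backing out the GB300 baseline each one implies. The following is a rough sketch, not NVIDIA data: it reads "3.3x" and "14x" as multiples of the FP4 figure and "60% uplift" as 1.6x, and all names in it are illustrative.

```python
# Back out the GB300 NVL72 baseline implied by each quoted figure.
# Each entry is (platform value, claimed multiple of GB300), as quoted in the article.
# ASSUMPTIONS: "3.3x"/"14x" are applied to the FP4 figure; "60% uplift" is read as 1.6x.
CLAIMS = {
    "FP4 inference (EFLOPS)": {"NVL144": (3.6, 3.3), "NVL576": (15.0, 14.0)},
    "Fast memory (TB)": {"NVL144": (75.0, 1.6), "NVL576": (365.0, 8.0)},
    "NVLink (TB/s)": {"NVL144": (260.0, 2.0), "NVL576": (1500.0, 12.0)},  # 1.5 PB/s
    "CX9 (TB/s)": {"NVL144": (28.8, 2.0), "NVL576": (115.2, 8.0)},
}


def implied_baselines(claims):
    """Divide each platform figure by its claimed multiple to estimate the GB300 value."""
    return {
        metric: {name: value / multiple for name, (value, multiple) in rows.items()}
        for metric, rows in claims.items()
    }


BASELINES = implied_baselines(CLAIMS)

for metric, rows in BASELINES.items():
    low, high = min(rows.values()), max(rows.values())
    spread_pct = (high - low) / high * 100  # disagreement between the two estimates
    estimates = ", ".join(f"{name} -> {value:.2f}" for name, value in rows.items())
    print(f"{metric}: {estimates} (spread {spread_pct:.1f}%)")
```

For every metric the two implied baselines agree to within a few percent, which is what one would expect if the quoted multiples are rounded.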