Friday, June 26, 2026
AI Infrastructure · News & Analysis
Chips & Hardware · Report

NVIDIA reportedly mulls socketed modular design for next-generation Blackwell B300 GPUs to enable user-replaceable compute modules.

Could reduce hyperscaler capex risk and slow hardware obsolescence; could improve datacenter upgrade flexibility by avoiding full system replacement.
Trade press · Slicast · October 12, 2024 · Global · Source: tomshardware.com

Nvidia is considering a socket design for at least some of its upcoming Blackwell B300 GPUs for AI and HPC applications, according to a report from TrendForce that cites Economic Daily News and MoneyDJ. The company is said to be adopting the socketed design for a product codenamed GB300. Chen Shuowen, an analyst with CLSA, stated based on supply chain checks that "Nvidia has been designing GPU sockets for its products, possibly starting with the GB200 Ultra," and mentioned a 4-way Nvidia GPU design with one Nvidia CPU.

The impetus for socket designs stems from practical challenges with current AI GPUs. MoneyDJ reports that, given the failure rates of AI GPUs under high loads, the cost of replacing motherboards, and cooling challenges, Nvidia and other AI GPU designers might use sockets for their next generation of GPUs instead of soldering GPUs to motherboards. However, technical experts note significant drawbacks: socketed designs would add to power and cooling challenges rather than help solve them, which is why the most power-hungry GPUs usually use BGA packaging. Additionally, a motherboard with four Blackwell GPUs and one CPU does not look extraordinary next to existing DGX servers, which pair an 8-way GPU baseboard with a 2-way CPU motherboard.

Currently, Nvidia's Blackwell-based data center offerings use different form factors. The B200 GPU (1,000W+) is used on GB200 boards codenamed Bianca, with one Grace CPU and two Blackwell GPUs, and Ariel, with one Grace CPU and one Blackwell GPU, all in BGA form factor. Umbriel GPU boards support eight B200 (1,000W) or B100 (700W) GPUs in SXM module form factor, and further GB200 platforms codenamed Miranda and Oberon also exist. Based on unofficial information, Nvidia is also preparing a product codenamed B200A, based on the monolithic B102 processor with four HBM3E memory stacks connected using TSMC's CoWoS-S packaging technology, which could potentially adopt multiple form factors.

While GPU sockets would be familiar from CPU sockets and would make repairs easier, they present significant trade-offs in server environments. SXM and OAM modules also offer repairability but require careful handling, and add-in cards, SXM modules, and OAM modules are hard and expensive to make; most Nvidia SXM modules are currently made by Foxconn. Migrating from a card or module to a socket cuts costs but limits performance. Intel's socketed Xeon CPU Max 9480 'Sapphire Rapids' with on-package HBM was not a success beyond selected supercomputing applications, suggesting that Nvidia's potential adoption of socketed GPU designs may face similar challenges.
