Microsoft develops new cooling technology to address thermal bottlenecks as hyperscalers hit power ceiling constraints.
Microsoft has announced a new cooling technology for AI chips that uses microfluidics to channel liquid inside the silicon itself. Tiny channels are etched into the back of the chip, creating grooves that let cooling liquid flow directly onto the silicon and remove heat more efficiently. An AI-based system identifies the unique heat signatures on a chip and directs the coolant with precision. The company validated the design by cooling a server running simulated Teams meetings. Lab-scale tests showed that microfluidics performed up to three times better than cold plates at removing heat and reduced the maximum temperature rise of the silicon inside a GPU by 65%, depending on the type of chip. The team expects the technology to improve power usage effectiveness and reduce operational costs. For prototyping, Microsoft has partnered with Swiss startup Corintis to use AI to optimize a bio-inspired design that cools chips' hot spots more efficiently than traditional straight up-and-down channels.
AI workloads and high-performance computing have placed unprecedented strain on data center infrastructure, with thermal dissipation emerging as one of the toughest bottlenecks. Traditional methods such as airflow and cold plates are increasingly unable to keep pace with new generations of silicon. According to Sanchit Vir Gogia, CEO and chief analyst at Greyhound Research, "Modern accelerators are throwing out thermal loads that air systems simply cannot contain, and even advanced water loops are straining. The immediate issues are not only the soaring TDP of GPUs, but also grid delays, water scarcity, and the inability of legacy air-cooled halls to absorb racks running at 80 or 100 kilowatts." He noted that "Cold plates and immersion tanks have extended the runway, but only marginally. They still suffer from the resistance of thermal interfaces that smother heat at the die. The friction lies in the last metre of the thermal path, between junction and package, and that is where performance is being squandered."
The thermal challenge carries significant economic weight. According to Danish Faruqui, CEO at Fab Economics, who cited a TCO analysis of 2025 AI infrastructure buildouts, about 45% to 47% of a data center's power budget typically goes to cooling, a share that could grow to 65% to 70% without more efficient cooling methods. GPU power requirements have escalated dramatically: in 2024, Nvidia's Hopper H100 required 700 watts per GPU, scaling in 2025 to 1,000 watts for the Blackwell B200 and 1,400 watts for the Blackwell Ultra B300. Looking ahead to 2026, Rubin GPUs are projected to require 1,800 watts and Rubin Ultra GPUs 3,600 watts. Faruqui noted that microfluidics-based direct-to-silicon cooling could hold cooling to less than 20% of the data center power budget and could be the sole enabler of the Rubin Ultra's TDP budget of 3.6kW per GPU, though this would require significant development work to optimize the size and placement of microfluidic structures and to analyze non-laminar flow in the microchannels.
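To put those figures side by side, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not part of Faruqui's analysis: the function names are hypothetical, and the cooling calculation assumes the power budget splits cleanly into cooling and everything else, which is a simplification the article does not make.

```python
# GPU power figures (watts per GPU) as quoted in the article.
GPU_TDP_WATTS = {
    "Hopper H100": 700,
    "Blackwell B200": 1000,
    "Blackwell Ultra B300": 1400,
    "Rubin (projected)": 1800,
    "Rubin Ultra (projected)": 3600,
}


def tdp_ratio(name, baseline="Hopper H100"):
    """Return a GPU's quoted power draw as a multiple of the baseline's."""
    return GPU_TDP_WATTS[name] / GPU_TDP_WATTS[baseline]


def cooling_watts_per_other_watt(cooling_share):
    """Watts spent on cooling for each watt spent on everything else.

    Simplifying assumption (not from the article): the budget is split
    into cooling and non-cooling only, so the ratio is share / (1 - share).
    """
    if not 0 <= cooling_share < 1:
        raise ValueError("cooling_share must be in [0, 1)")
    return cooling_share / (1 - cooling_share)


if __name__ == "__main__":
    for name in GPU_TDP_WATTS:
        print(f"{name}: {tdp_ratio(name):.2f}x the H100")
    # Cooling shares quoted in the article: today, worst case, and the
    # ceiling Faruqui suggests microfluidics could reach.
    for share in (0.45, 0.70, 0.20):
        ratio = cooling_watts_per_other_watt(share)
        print(f"cooling share {share:.0%}: {ratio:.2f} W cooling per other watt")
```

Under that assumption, a 45% cooling share works out to roughly 0.82 watts of cooling for every watt spent elsewhere, a 70% share to about 2.33 watts, and a 20% share to 0.25 watts. On the quoted figures, Rubin Ultra would draw a little over five times the power of an H100.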
The challenge is universal, with hyperscalers including AWS, Google, Meta, and Oracle all grappling with extreme chip heat. Brady Wang, associate director at Counterpoint Research, stated that "The escalating thermal load from new generations of AI silicon means that relying on today's solutions, such as cold plates, could impose a 'hard ceiling on progress' within as little as five years." However, implementation obstacles remain formidable. According to Manish Rawat, analyst at TechInsights, "Fabricating micron-scale channels increases process complexity and may raise yield loss due to wafer fragility. Ultra reliable sealing is critical, as even minor leaks or particulate contamination could degrade chip performance. Unlike replaceable cold plates, silicon integrated cooling makes chip replacement the only maintenance option, escalating service costs and logistical complexity. Additionally, long-term exposure to coolant, even dielectric, can induce chemical and mechanical stress, necessitating extensive qualification to ensure 5–10 year reliability." For microfluidics to take root, the industry will need to manage these fabrication, reliability, and maintenance risks carefully, and the approach will need to become standard practice across the ecosystem.