
AWS is launching AI Factories, dedicated AWS-managed AI infrastructure deployed inside customers' own data centers, alongside new EC2 UltraServers built on its Trainium3 chip and Nvidia's GB300 NVL72 platform.

AWS's vertical integration into chips and managed services intensifies competition and reshapes infrastructure standardization across the cloud stack.

Trade press · Slicast · December 2, 2025 · Global · Source: siliconangle.com


Amazon Web Services Inc. announced a comprehensive set of artificial intelligence infrastructure offerings aimed at dominating both cloud and private AI at large scale. The announcements included the launch of AWS AI Factories, the general availability of Amazon EC2 Trn3 UltraServers powered by the new Trainium3 chip, and the introduction of P6e-GB300 UltraServers featuring Nvidia's latest Blackwell-based GB300 NVL72 platform. These offerings span sovereign on-premises deployments, next-generation custom AI accelerators, and the most advanced Nvidia Corp. GPU instances yet offered on AWS.

AWS AI Factories deliver dedicated, full-stack AWS AI infrastructure directly inside customers' existing data centers, combining Nvidia accelerated computing, AWS Trainium chips, high-speed low-latency networking, energy-efficient infrastructure and core AWS AI services, including Amazon Bedrock and Amazon SageMaker. Built primarily for governments and regulated industries, the platform operates much like a private AWS Region, providing secure, low-latency access to compute, storage and AI services while maintaining strict data sovereignty and regulatory compliance. Customers supply their own facilities, power and network connectivity, while AWS handles deployment, operations and lifecycle management, shortening buildouts that would normally take years. AWS highlighted its deepening partnership with Nvidia around the platform, including support for Grace Blackwell and future Vera Rubin GPU architectures, as well as planned support for Nvidia NVLink Fusion interconnects in Trainium4. Ian Buck, vice president and general manager of Hyperscale and HPC at Nvidia, stated: "Large-scale AI requires a full-stack approach — from advanced GPUs and networking to software and services that optimize every layer of the data center. Together with AWS, we're delivering all of this directly into customers' environments."

Amazon EC2 Trn3 UltraServers, powered by the new three-nanometer Trainium3 AI chip, are now generally available. Trn3 systems can scale up to 144 Trainium3 chips in a single UltraServer to deliver up to 4.4 times more compute performance, four times greater energy efficiency and nearly four times more memory bandwidth than Trainium2. The UltraServers are designed for next-generation workloads such as agentic AI, mixture-of-experts models and large-scale reinforcement learning, with AWS-engineered networking that delivers sub-10-microsecond chip-to-chip latency. In testing using OpenAI Group PBC's open-weight model GPT-OSS, AWS customers achieved three times higher throughput per chip and four times faster inference response times versus the previous generation. Customers including Anthropic PBC, Karakuri Ltd., Metagenomi Inc., Neto.ai Inc., Ricoh Company Ltd. and Splash Music Inc. are already reporting up to 50% reductions in training and inference costs. AWS also previewed Trainium4, which is expected to deliver major gains in FP4 and FP8 performance and memory bandwidth.

AWS also introduced P6e-GB300 UltraServers featuring Nvidia's GB300 NVL72 platform, the most advanced Nvidia GPU architecture available in Amazon EC2. The instances deliver the highest GPU memory and compute density on AWS and target trillion-parameter AI inference and advanced reasoning models in production. The P6e-GB300 systems run on the AWS Nitro System and integrate tightly with services such as Amazon Elastic Kubernetes Service, allowing customers to deploy large-scale inference workloads securely and efficiently.
