Chips & Hardware · Report

Cerebras unveils Andromeda, a supercomputer built on its proprietary wafer-scale chip architecture, with 13.5 million cores.

Novel AI chip architecture demonstrates alternative to NVIDIA/AMD duopoly; architectural significance despite limited production scale.

Trade press · Slicast · November 21, 2022 · Global · Source: extremetech.com


Cerebras unveiled its new AI supercomputer, Andromeda, at SC22. The system combines 16 Cerebras CS-2 systems for a total of 13.5 million cores and delivers an exaflop of AI compute and 120 petaflops of dense compute, powered by Cerebras' wafer-scale, manycore WSE-2 processor. Each WSE-2 wafer contains three physical planes that handle arithmetic, memory, and communications. The memory plane alone holds 40 GB of onboard SRAM, enough to contain an entire BERT-Large model. The arithmetic plane includes approximately 850,000 independent cores and 3.4 million FPUs, with a collective 20 PB/s of internal bandwidth across the communication plane's Cartesian mesh. Andromeda receives its data from a bank of 64-core, third-generation AMD EPYC processors that handle a wide range of data pre- and post-processing operations.
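
As a rough sanity check on those figures, the per-wafer numbers can be scaled up to the full cluster. The sketch below uses only the approximate values quoted above, plus one outside assumption that is not from the article: that BERT-Large has roughly 340 million parameters, stored as 16-bit values. The function and constant names are illustrative.

```python
# Figures quoted in the article; per-wafer counts are approximate.
CS2_SYSTEMS = 16
CORES_PER_WSE2 = 850_000
FPUS_PER_WSE2 = 3_400_000
SRAM_GB_PER_WSE2 = 40

# Assumption (not from the article): BERT-Large has roughly 340 million
# parameters, stored here as 16-bit (2-byte) values.
BERT_LARGE_PARAMS = 340_000_000
BYTES_PER_FP16 = 2


def cluster_totals(systems: int = CS2_SYSTEMS) -> dict:
    """Scale the per-wafer figures up to the whole cluster."""
    return {
        "cores": systems * CORES_PER_WSE2,
        "sram_gb": systems * SRAM_GB_PER_WSE2,
        "fpus_per_core": FPUS_PER_WSE2 / CORES_PER_WSE2,
    }


def weights_gb(params: int, bytes_per_param: int) -> float:
    """Weights-only footprint in gigabytes (10**9 bytes)."""
    return params * bytes_per_param / 1e9


totals = cluster_totals()
bert_gb = weights_gb(BERT_LARGE_PARAMS, BYTES_PER_FP16)
print(totals)
print(f"BERT-Large weights at fp16: {bert_gb:.2f} GB of {SRAM_GB_PER_WSE2} GB SRAM")
```

With the rounded per-wafer count, 16 systems come to 13.6 million cores, slightly above the 13.5 million quoted; the article does not explain the difference. Under the stated assumption, BERT-Large's weights would occupy well under 1 GB of a single wafer's 40 GB of SRAM.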

Explaining the choice of host processor, Cerebras founder and CEO Andrew Feldman stated, "AMD EPYC is the best choice for this type of cluster because it offers unparalleled core density, memory capacity and IO. This made it the obvious choice to feed data to the Andromeda supercomputer." Frontier, the supercomputer at Oak Ridge National Lab capable of nuclear weapons simulations, passed the exaflop mark earlier in 2022, but it operates at 64-bit precision, whereas Andromeda's figures are for 16-bit half precision, so the two systems serve different purposes. Frontier cost $600 million to build; Andromeda cost less than $35 million. Similarly, Andromeda is not designed to replace Polaris, the cluster of more than two thousand Nvidia A100 GPUs at Argonne National Lab, which itself uses AMD EPYC cores for pre- and post-processing. Rather, each supercomputer excels at different types of work.
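
The gap between 64-bit and 16-bit precision can be made concrete with a small, generic illustration. The sketch below round-trips values through the IEEE 754 half- and double-precision formats using Python's standard `struct` module. It says nothing about how either machine performs arithmetic; the helper names are illustrative.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Pack a value into an IEEE 754 format and unpack it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def to_half(value: float) -> float:
    """Nearest 16-bit (half-precision) value."""
    return round_trip(value, "<e")


def to_double(value: float) -> float:
    """Nearest 64-bit (double-precision) value."""
    return round_trip(value, "<d")


for x in (0.1, 2049.0):
    half, double = to_half(x), to_double(x)
    print(f"x={x}: half={half!r} (error {abs(half - x):.3e}), double={double!r}")
```

Half precision carries 11 significant bits, so 2049 cannot be represented and rounds to 2048, while double precision stores it exactly. That loss is often acceptable for neural-network training but not for the high-fidelity simulation work Frontier is built for.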

Andromeda specializes in handling large sparse matrices (multidimensional arrays of tensor data that are mostly zeroes), and it shows particular strength with large language models. Customers including AstraZeneca and GlaxoSmithKline have reported success using LLMs on Andromeda for "omics" research, including work on the COVID genome and epigenome. During experiments at the National Energy Technology Lab, scientists completed "GPU impossible" work that Polaris simply could not handle. The system is also used in fusion research. It is currently housed at Colovore, a high-performance computing data center in Santa Clara.
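
To illustrate why sparsity matters, here is a minimal sketch of a sparse matrix-vector product using the common compressed sparse row (CSR) layout. It is a generic textbook technique, not a description of how Cerebras hardware exploits sparsity, and the function names and sample data are illustrative.

```python
def to_csr(dense):
    """Convert a dense row-major matrix to CSR, keeping only nonzeros."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # where the next row starts
    return values, col_idx, row_ptr


def csr_matvec(csr, x):
    """Multiply a CSR matrix by a vector, touching only stored nonzeros."""
    values, col_idx, row_ptr = csr
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


dense = [
    [0, 0, 3, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 2],
    [0, 0, 0, 0],
]
csr = to_csr(dense)
result = csr_matvec(csr, [1, 2, 3, 4])
density = len(csr[0]) / (len(dense) * len(dense[0]))
print(result, f"density={density:.4f}")
```

Only 3 of the 16 entries are stored, and the multiply performs 3 multiply-adds instead of 16. The work scales with the number of nonzeros rather than with the full matrix size, which is the property that sparse workloads exploit.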

Cerebras has allocated free access to Andromeda for academics and graduate students, and the system integrates seamlessly with Python and Jupyter notebooks. Mateo Espinosa, a doctoral candidate at the University of Cambridge who formerly worked at Cerebras, stated: "It is extraordinary that Cerebras provided graduate students with free access to a cluster this big. Andromeda delivers 13.5 million AI cores and near-perfect linear scaling across the largest language models, without the pain of distributed compute and parallel programming. This is every ML graduate student's dream."

As machine learning contends with ever-growing data volumes, latency within and between networks increasingly becomes a constraint. Higher throughput demands drive up energy consumption and create bottlenecks that cannot be solved simply by adding more hardware. It is at this convergence point that Cerebras positions Andromeda to make its mark.
