The installation of Aurora’s 10,624th and final “blade” marked a major milestone for the highly anticipated “exascale” supercomputer at Argonne National Laboratory.
After years of diligent work and planning, the system now contains all the hardware that will make it one of the most powerful supercomputers in the world when it is opened up for scientific research. Built by Intel and Hewlett Packard Enterprise, Aurora will be theoretically capable of delivering more than two exaflops of computing power, or more than 2 billion billion calculations per second.
Supercomputers of this scale are invaluable to scientists.
“Everything we know about large-scale climate comes from climate simulations on supercomputers. What we know about the human genome comes from massive data analysis on big computers. Everything that’s happening in AI right now is happening on large-scale computers,” Rick Stevens, who helped lead the effort and is a professor with the University of Chicago and associate director at Argonne, told Chicago Magazine. “Our ability to design reactors, our ability to come up with new batteries — all that is a result of computing.”
The Aurora team has been building the system piece by piece over the last year and a half, installing blades and other components as they were delivered to Argonne, which is a U.S. Department of Energy national laboratory affiliated with the University of Chicago.
“We have been living and breathing the Aurora installation since the first pieces were delivered in November of 2021,” said Susan Coghlan, project director for Aurora. “While we still have a lot of work to do before we can roll the system out to scientists worldwide, it is incredibly exciting to have the final hardware in place.”
Blades are backbone of system
Aurora’s blades, the backbone of the system, are sleek rectangular units that house its processors, memory, networking and cooling technologies. The machine gets its computational muscle from a combination of state-of-the-art Intel CPUs (central processing units) and GPUs (graphics processing units). Each blade is equipped with two Intel Xeon CPU Max Series processors and six Intel Data Center GPU Max Series processors.
With each blade weighing in at around 70 pounds, the team needed a specialized machine to delicately install the units vertically into Aurora’s refrigerator-sized racks. Each of the system’s 166 racks contains 64 blades. The racks are spread out across eight rows, occupying the space of two professional basketball courts in the ALCF data center.
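The per-rack and per-blade figures above are enough to check the headline blade count and to work out how many processors the full machine holds. The short Python sketch below does that arithmetic; the function and constant names are illustrative only and do not come from Argonne or Intel.

```python
# Figures quoted in the article.
RACKS = 166
BLADES_PER_RACK = 64
CPUS_PER_BLADE = 2   # Intel Xeon CPU Max Series
GPUS_PER_BLADE = 6   # Intel Data Center GPU Max Series


def aurora_totals(racks=RACKS, blades_per_rack=BLADES_PER_RACK):
    """Return system-wide blade, CPU and GPU counts (hypothetical helper)."""
    blades = racks * blades_per_rack
    return {
        "blades": blades,
        "cpus": blades * CPUS_PER_BLADE,
        "gpus": blades * GPUS_PER_BLADE,
    }


totals = aurora_totals()
print(totals)  # {'blades': 10624, 'cpus': 21248, 'gpus': 63744}
```

The blade total, 166 x 64 = 10,624, matches the count given for the final installation, and the same figures imply 21,248 CPUs and 63,744 GPUs across the system.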
Expanded space to stretch out
Before the system could be installed, Argonne had to carry out some major facility upgrades. This included adding new data center space to provide enough room for the supercomputer and building mechanical rooms and equipment to provide increased power and cooling capacity.
Now that the machine is fully assembled, researchers will move their work to Aurora to begin scaling their applications on the full system. For the past few months they’ve been working on Sunspot, a test and development system that has the same architecture as Aurora but spans only two racks. These early users help to stress-test the supercomputer and identify potential bugs that need to be resolved ahead of its deployment.
“We’re looking forward to putting Aurora through its paces to make sure everything works as intended before we turn the system over to the broader scientific community,” Coghlan said.
—Adapted from an article by Jim Collins first published by Argonne National Laboratory.