Plans to build ‘exascale’ machines are moving forward, but still face major technological challenges.
By Katherine Bourzac
29 November 2017
At the end of July, workers at the Oak Ridge National Laboratory in Tennessee began filling up a cavernous room with the makings of a computational behemoth: row upon row of neatly stacked computing units, some 290 kilometres of fibre-optic cable and a cooling system capable of carrying a swimming pool’s worth of water. The US Department of Energy (DOE) expects that when this US$280-million machine, called Summit, becomes ready next year, it will enable the United States to regain a title it hasn’t held since 2012 — home of the fastest supercomputer in the world.
Summit is designed to run at a peak speed of 200 petaflops, meaning it can crunch through as many as 200 million billion ‘floating-point operations’ (a type of computational arithmetic) every second. That could make Summit 60% faster than the current world-record holder, in China.
But for many computer scientists, Summit’s completion is merely one lap of a much longer race. Around the world, teams of engineers and scientists are aiming for the next leap in processing ability: ‘exascale’ computers, capable of running at a staggering 1,000 or more petaflops. Already, four national or international teams, working with the computing industries in their regions, are pushing towards this ambitious target. China plans to have its first exascale machine running by 2020. The United States, through the DOE’s Exascale Computing Project, aims to build at least one by 2021. And the European Union and Japan are expected to be close behind.
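To put the quoted speeds on one scale, the short sketch below converts them to operations per second. The 200-petaflop and 1,000-petaflop figures come from the text. The record holder's speed is not stated in the article; it is only inferred here from the "60% faster" comparison, and the function names are illustrative.

```python
# Unit arithmetic for the speeds quoted in the article.
PETA = 10**15  # operations per second in one petaflop
EXA = 10**18   # operations per second in one exaflop

SUMMIT_PEAK_PETAFLOPS = 200   # Summit's design peak, from the article
EXASCALE_PETAFLOPS = 1_000    # "1,000 or more petaflops", from the article


def petaflops_to_flops(petaflops: int) -> int:
    """Convert petaflops to floating-point operations per second."""
    return petaflops * PETA


def implied_baseline(petaflops: float, percent_faster: float) -> float:
    """Speed of the slower machine if `petaflops` is `percent_faster`% faster."""
    return petaflops * 100 / (100 + percent_faster)


summit_flops = petaflops_to_flops(SUMMIT_PEAK_PETAFLOPS)
exascale_flops = petaflops_to_flops(EXASCALE_PETAFLOPS)

# Derived from the "60% faster" claim, not quoted in the article.
record_holder_petaflops = implied_baseline(SUMMIT_PEAK_PETAFLOPS, 60)

# How many Summit-sized machines would add up to the exascale threshold.
summits_per_exaflop = exascale_flops // summit_flops

print(f"Summit peak: {summit_flops:,} operations per second")
print(f"Exascale threshold: {exascale_flops:,} operations per second")
print(f"Implied current record holder: {record_holder_petaflops} petaflops")
print(f"Exascale threshold / Summit peak: {summits_per_exaflop}")
```

On these numbers, the exascale threshold sits a factor of five above Summit's peak, which is why Summit counts as only one lap of the race.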
Scientists anticipate that exascale computers will enable them to solve currently intractable problems in fields as varied as climate science, renewable energy, genomics, geophysics and artificial intelligence. That could include pairing detailed models of fuel chemistry and combustion engines in order to more quickly identify improvements that could lower greenhouse-gas emissions. Or it might allow for simulations of the global climate at a spatial resolution as high as a single kilometre. With the right software in hand, “there will be a lot of science we can then do that we can’t do now”, says Ann Almgren, a computational scientist at the Lawrence Berkeley National Laboratory in California.
But reaching the exascale regime is a tremendous technological challenge. The exponential increases in computing performance and energy efficiency that once accompanied Moore’s law are no longer guaranteed, and aggressive changes to supercomputer components are needed to keep making gains. Moreover, a supercomputer that performs well on a speed test is not necessarily one that will excel at scientific applications.
The effort to push high-performance computing to the next level is forcing a transformation in how supercomputers are designed and their performance measured. “This is one of the hardest problems I’ve seen in my career,” says Thomas Brettin, a computer scientist at the Argonne National Laboratory in Illinois, who is working on medical software for exascale machines.
Broader trends in the computing industry are shaping the path to exascale computers. For more than a decade, transistors have been so tightly packed that computing chips can’t be made to run at faster rates. To circumvent this, today’s supercomputers lean heavily on parallelism, using banks of chips to create machines with millions of processing units called ‘cores’. A supercomputer can be made more powerful by stringing together more of these chips.
But as these machines get bigger, data management becomes more of a challenge. Moving data in and out of storage, and even within cores, takes much more energy than the calculations themselves. By some estimates, as much as 90% of the power supplied to a high-performance computer is used for data transport.
That has led to some alarming predictions. In 2008, in a report for the US Defense Advanced Research Projects Agency, a team headed by computer scientist Peter Kogge concluded that an exascale computer built from foreseeable technologies would need gigawatts of power — perhaps from a dedicated nuclear plant (see go.nature.com/2hs3x6d). “Power is the number one, two, three and four problem with exascale computing,” says Kogge, a professor at the University of Notre Dame in Indiana.
In 2015, in light of technological improvements, Kogge revised this estimate down to between 180 and 425 megawatts. But that is still substantially more power than today’s top supercomputers use; the system that leads the world rankings today, China’s Sunway TaihuLight, consumes about 15 megawatts.
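A rough sense of the gap can be had from the figures above. The sketch below compares the 2015 estimate with TaihuLight's power draw and works out the energy budget per operation for a machine running at exactly 1,000 petaflops. That per-operation figure is a back-of-the-envelope derivation, not a number from Kogge's analysis, and the function names are illustrative.

```python
# Power arithmetic using the figures quoted in the article.
MEGAWATT = 1_000_000          # watts in one megawatt
EXAFLOP = 10**18              # operations per second at 1,000 petaflops
PICOJOULES_PER_JOULE = 10**12

TAIHULIGHT_MW = 15            # "about 15 megawatts"
KOGGE_2015_LOW_MW = 180       # lower bound of the 2015 estimate
KOGGE_2015_HIGH_MW = 425      # upper bound of the 2015 estimate


def power_ratio(estimate_mw: float, baseline_mw: float) -> float:
    """How many times more power the estimate requires than the baseline."""
    return estimate_mw / baseline_mw


def picojoules_per_operation(power_mw: float, ops_per_second: int) -> float:
    """Average energy per operation: watts are joules per second,
    so watts divided by operations per second gives joules per operation."""
    watts = power_mw * MEGAWATT
    return watts * PICOJOULES_PER_JOULE / ops_per_second


for label, mw in (("low", KOGGE_2015_LOW_MW), ("high", KOGGE_2015_HIGH_MW)):
    ratio = power_ratio(mw, TAIHULIGHT_MW)
    energy = picojoules_per_operation(mw, EXAFLOP)
    print(f"{label}: {mw} MW, {ratio:.1f}x TaihuLight, "
          f"{energy:.0f} pJ per operation at one exaflop")
```

Even the lower bound of the revised estimate is an order of magnitude above what today's leading system consumes.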
Read the entire article on Nature.com.