How the Energy Department is Working to Engage Americans on Exascale Supercomputing
Officials also offered updates on how the buildout of U.S. exascale machines is unfolding.
U.S. national labs and their private sector supercomputing partners this week ran a robust online visibility campaign in hopes of fostering a greater sense of national pride in the United States’ exascale computing efforts.
Throughout what’s been dubbed ‘Exascale Awareness Week,’ the Energy Department’s Exascale Computing Project, or ECP, Lawrence Livermore, Argonne and Oak Ridge National Labs, AMD, HPE, and other labs and businesses are circulating a range of digital content underscoring the potential capabilities exascale systems will offer. The many moves to help educate people on the next-level computing power all lead up to Sunday, Oct. 18: Exascale Day 2020.
That date echoes this next generation of supercomputers’ capacity to process more than one quintillion calculations per second.
“Exascale Day will happen on 10/18 every year because a quintillion is 10^18th power, or a one with eighteen zeroes trailing behind it,” Lawrence Livermore National Lab Public Information Officer Jeremy Thomas explained Wednesday. “The whole event is really meant to set aside some time to talk about exascale, the exciting things that will be done with these machines that are just on the horizon, and how they will impact nearly all sectors of society.”
He and another official close to the government’s exascale projects provided Nextgov with a deep look into the event’s impetus and what’s unfolded so far, helping visualize the coming ‘exascale era’ of computing—and offered a fresh update on America’s in-the-making exascale machines.
Envisioning the Promise of Exascale
Three U.S. exascale systems are under construction and set to start operations over the next few years. First announced in March 2019, Aurora will be based at Argonne National Lab. Frontier, unveiled two months later, is set to be housed at Oak Ridge National Lab. In August of that year, Lawrence Livermore revealed it is developing El Capitan, an advanced system to support the U.S. nuclear stockpile. The three contracts range in value from $500 million to more than $600 million each.
Many are generally familiar with terms like ‘megabytes’ and ‘gigabytes’ to refer to units of data storage. “Well, in advanced computer performance such as we have with supercomputers, we refer to the ‘scale’ of a system’s capabilities—mega led to giga, then terascale, and we’re now at petascale,” Thomas explained.
And the next level is exascale.
“Exascale is a full 10x order of magnitude more powerful than today’s fastest supercomputers. To put this in perspective, a supercomputer running at one exaflop [or one quintillion computing operations per second] is about a million times more powerful than your typical laptop,” Thomas said. “At more than 2 exaflops, LLNL’s El Capitan will be faster than the top 100 supercomputers that currently exist in the world—combined.”
Recognizing that a quintillion is a number so massive it’s difficult for people to wrap their heads around, he offered analogies to help express that level of power. “It would take 40,000 years for one quintillion gallons of water to spill over Niagara Falls,” Thomas noted, adding that if the Earth’s entire population of 7.7 billion humans each individually completed one calculation per second, it would take them more than eight years to do what El Capitan is anticipated to complete in one second.
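The arithmetic behind those comparisons can be checked with a short Python sketch. It rests on assumptions not stated in the article: El Capitan's "more than 2 exaflops" is taken as exactly 2 quintillion operations per second, a year is treated as 365.25 days, and the function names are purely illustrative.

```python
# Back-of-the-envelope check of the analogies quoted above.
# Assumptions (not from the article): a 365.25-day year, and El Capitan's
# "more than 2 exaflops" rounded down to exactly 2 quintillion ops per second.

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

QUINTILLION = 10**18                      # a one followed by eighteen zeroes
EL_CAPITAN_OPS_PER_SEC = 2 * QUINTILLION  # assumed lower bound
WORLD_POPULATION = 7.7e9                  # population figure quoted in the article


def years_for_humanity(ops, population=WORLD_POPULATION):
    """Years for `population` people, each doing one calculation
    per second, to complete `ops` calculations."""
    seconds = ops / population
    return seconds / SECONDS_PER_YEAR


def implied_flow_gallons_per_sec(gallons=QUINTILLION, years=40_000):
    """Flow rate implied by `gallons` going over the falls in `years` years."""
    return gallons / (years * SECONDS_PER_YEAR)


def implied_laptop_flops(exaflop=QUINTILLION, ratio=10**6):
    """Laptop speed implied by 'one exaflop is about a million times a laptop'."""
    return exaflop / ratio


if __name__ == "__main__":
    print(f"Humanity vs. one El Capitan second: "
          f"{years_for_humanity(EL_CAPITAN_OPS_PER_SEC):.1f} years")
    print(f"Implied Niagara flow: "
          f"{implied_flow_gallons_per_sec():,.0f} gallons per second")
    print(f"Implied laptop speed: {implied_laptop_flops():.0e} operations per second")
```

Under these assumptions, the population figure works out to a little over eight years, consistent with Thomas's "more than eight years," and the laptop comparison implies a machine on the order of one trillion operations per second.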
“This unprecedented level of computing capability will have profound impacts in national security, AI, energy, materials science, space and astrophysics, medicine and drug discovery, transportation, climate modeling—you name it, basically any scientific discipline or industry that requires faster, more detailed computer modeling and simulation will benefit from these unprecedented tools, enabling scientific discoveries that have never even been imagined before,” Thomas said.
And through Exascale Awareness Week, he and key supercomputing officials aim to drive that point home to the broader American public.
The Exascale Computing Project is a collaborative Energy Department-led program to accelerate the U.S. exascale ecosystem and produce software, tools and applications for those in-development supercomputers. Over the course of this week, its landing page has been continuously populated with videos, audio discussions, op-eds, quotes from researchers and articles presenting all that the department’s labs are doing and plan to do with exascale systems. Topics encompass next-generation computing’s potential impacts on national security, drug discovery and Covid-19 research, AI and machine learning, climate modeling and many more applications. National labs and companies also shared their own materials on the topic through their websites and social media pages.
“Because we can’t host any events in-person right now, we’ve focused our celebration online and with social media,” ORNL’s Associate Laboratory Director for Computing and Computational Sciences Jeff Nichols told Nextgov Thursday.
And according to LLNL’s Thomas, the ultimate aim is to “capture how exascale computing will impact a wide swath of science and industry.” He noted that the initial celebration of Exascale Day came last year, when supercomputer manufacturer Cray—which is a critical partner in all three of Energy’s exascale efforts and is now an HPE company—as well as the labs involved and another partner, chipmaker AMD, collectively realized they “really needed to educate the broader general public on what this next generation of supercomputers ... will mean for them,” he said, adding “and what better way than making a day of it!”
Because the date falls on a Sunday this year, those involved decided to make it all a little bigger by expanding it into a full weeklong effort and soliciting more varied types of content from a larger pool of participants. They all aimed to make “the whole presentation and experience more engaging,” Thomas noted.
ECP’s Mike Bernhardt led the charge, and representatives from the core group of HPE, AMD, LLNL, ANL, and ORNL began weekly planning meetings over Zoom in August, Thomas said. Other labs and companies, including LANL, Lawrence Berkeley, NVIDIA, and Intel, were also involved along the way.
“The best case scenario for Exascale Awareness Week is that collectively the key players in exascale, including the researchers, engineers and software developers that will work on these machines, are able to tell the story of why exascale computing matters to them, and the nation, in a way that the general public can understand,” Thomas said. “We hope they get as excited about the possibilities these machines offer as we are at the national labs and at the companies that are building them.”
America’s Exascale Pursuits
The celebratory week also presents a prime opportunity for the Energy Department to share details on how the rollout of America’s three exascale systems is going—and the agency’s Under Secretary for Science Paul Dabbar recently provided exactly that.
During an interview with Inside HPC, Dabbar revealed that though Argonne’s Aurora was originally slated to be America’s first in-operation exascale machine, Oak Ridge’s Frontier is now set to beat it to the punch. Dabbar did not go into great detail about the holdup, but expressed confidence that the department would have at least one exascale machine up and running next year, as originally planned.
Argonne officials were unable to chat with Nextgov by deadline, but the core issue is reportedly related to the delayed delivery of an Intel-made graphics processing unit integral to Aurora, a delay the company disclosed earlier this year. Still, amid the exascale week activities, Argonne’s Associate Laboratory Director Rick Stevens and Intel’s Vice President and General Manager of High Performance Computing Trish Damkroger participated in an online conversation on Aurora’s potential.
“What we're most excited about is the fact that this single system will be a world class system for doing simulation and a world class system for doing [artificial intelligence],” Stevens said. “That allows us a lot of flexibility in the kinds of problems that we can go after.”
Development of the 1.5-exaflop Frontier supercomputer, likely to be the first U.S. system to reach full delivery, is well underway, according to Oak Ridge’s Nichols, and installation will likely begin next summer. The system will be installed in the space that used to house the lab’s Titan system, but with significant updates.
“For example, we gutted the datacenter and upgraded our utilities—2.5 miles of powerlines were installed and a new mechanical plant was built—to provide about 40 megawatts of power and cooling to the Frontier data center,” Nichols said.
He added that application development teams will have early access to Frontier next year. Further, Nichols explained there are two dozen exascale applications presently under development through the ECP. This means scalable applications spanning two dozen disciplines will be ready to run on the upcoming system, he noted, to help solve some of the world’s toughest problems in quantum materials, chemistry, physics, additive manufacturing, fusion, fission, and more.
“Frontier will support research in solving problems that would have seemed impossible as recently as 5 years ago,” Nichols said, adding that the lab is already “thinking about the system that will follow” it.
And El Capitan is on schedule for full deployment in 2023, LLNL’s Thomas confirmed. The California-based lab recently launched construction on its Exascale Computing Facility Modernization project, which he said involves upgrading the Livermore Computing Center’s mechanical and electrical systems so that it can accommodate the forthcoming system and other future machines.
“El Capitan will require about 30 megawatts [or MW] of power to run alone, and this upgrade will take the building’s energy capacity up to 85 MW,” Thomas said. “We will also be eventually adding a smaller unclassified version of El Capitan that will be used for basic science.”
Thomas added that the team is “working on porting and optimizing existing codes so they will work on El Capitan,” in hopes that researchers can hit the ground running and initiate programmatic work immediately after the machine is accepted into production.
With the powerful machine, Lawrence Livermore insiders imagine they’ll be able to routinely run very complex three-dimensional multiphysics simulations at high resolutions to support the lab’s national security missions for the National Nuclear Security Administration. The system will run most applications about 10 times faster on average than Sierra, the world’s third-most powerful supercomputer, Thomas said, clarifying that many simulations that take scientists a week or more to complete on Sierra “should be turned around in less than a day on El Capitan.” Those capabilities will prove crucial as the lab works to meet the increasing demands of NNSA’s Stockpile Stewardship Program, which ensures the security of America’s nuclear stockpile in the absence of underground testing.
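As a rough illustration of that turnaround claim, the sketch below applies the quoted average speedup to a job's runtime. It assumes the 10x figure applies uniformly to a whole simulation, which real workloads may not match, and the function name is hypothetical.

```python
# Rough turnaround estimate, assuming the quoted ~10x average speedup
# applies uniformly to an entire simulation (a simplifying assumption).

HOURS_PER_DAY = 24


def turnaround_days(days_on_sierra, speedup=10.0):
    """Estimated runtime in days on the faster system for a job that
    takes `days_on_sierra` days on Sierra."""
    return days_on_sierra / speedup


if __name__ == "__main__":
    week_long_job = turnaround_days(7)
    print(f"One-week Sierra job: {week_long_job:.1f} days "
          f"(about {week_long_job * HOURS_PER_DAY:.0f} hours)")
```

Under that assumption, any job that takes fewer than 10 days on Sierra would finish in under a day, which lines up with Thomas's description.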
“A lot can change between now and 2023, but at an excess of 2 exaflops, we anticipate El Capitan will be the world’s most powerful supercomputer by the time it is fully deployed,” Thomas said.
Brandi Vincent is a staff correspondent at Nextgov.