Large, expensive... and after five years, usually trash
When a still-powerful agency supercomputer is past its prime, finding a new home for the system is rarely simple.
Supercomputers are the supermodels of the IT world -- sought after, larger than life, high intensity and with a career lifespan of about five years. But when the younger, faster set plugs in, the old is out. And with everyone focused on the latest models (pun intended), do these expensive -- think multiyear contracts in the $50 million range -- and still-viable systems get put out with the trash?
Process as usual
The answer is yes -- at least for the majority of them. The most efficient, secure and financially feasible way to dispose of them is a computer wood chipper, operated by contractors who specialize in IT asset disposition. This is especially true for supercomputers that held high-security data.
“There are two components to this,” explained Jeff Broughton, systems department head and deputy for operations at the National Energy Research Scientific Computing Center. “One is the actual disks on which the files are stored. And those we have typically sent off to firms where they grind up the disk, so that there is no chance of people coming in and taking data off of it.” And while NERSC itself doesn’t have any high-security data, he said, the center destroys the disks out of “an abundance of caution.”
So why don’t NERSC and other supercomputer-using government agencies try to find someone who can still use such a large and powerful machine? They do, said Broughton. There is a multistep process the government goes through to try to find the machines a new home, but it often doesn’t have a happy ending for the computer.
The first possibility is to trade in the supercomputer on a replacement with the contractor, which Broughton called the easiest and best method. If the agency is buying the next-generation system from the current manufacturer, it can “sometimes arrange to trade it in, and they can use the parts for spares for things that are out in the field,” he said.
If that isn’t an option -- as in the case of NERSC's recently retired Hopper supercomputer, whose parts don’t jibe with the latest versions Cray is producing -- then agency IT managers move to the second strategy and repurpose anything they can use in-house.
The third strategy, if the first two aren’t feasible, is to put the old supercomputer through GSAXcess -- the General Services Administration's clearinghouse for distributing unused government property. “Basically what happens is the asset is first offered to any other federal agency to see if they want it," Broughton said. "The second priority is to offer it to state and local governments. And then, from there, it is offered out to the public."
In theory, that's a bargain for the prospective new owners -- but then they have to come and get it. That means figuring out how to get it out of its original space and then reinstalling it however and wherever they like. “And if someone actually wanted the system to be workable, they would need Cray to come in and take it down and take it apart and reinstall [it] in a professional manner,” Broughton said. The cost of such a repurposing can run as high as $400,000.
Finally, if there still aren’t any takers via GSAXcess, it's usually off to the wood chipper.
A happy retirement
For a lucky few, however, a program called PRObE (Parallel Reconfigurable Observational Environment) works to repurpose supercomputers so that researchers and students who otherwise might not have access to such systems can use them. The program is a partnership of the National Science Foundation (which provides the funding), the New Mexico Consortium, Los Alamos National Laboratory, Carnegie Mellon University and the University of Utah.
“They [supercomputers] usually get put in a truck, taken to a big chipper and ground down to dust," said Andree Jacobson, CIO at the New Mexico Consortium, which runs the PRObE program. "And there are some people in government who think that isn’t a good thing, because it is often a waste of money -- the computers are very powerful.” And while these hand-me-down systems might not be the most up-to-date models, they provide research opportunities for scientists and students who would otherwise not have access to a supercomputer.
The process, Jacobson said, is not a simple or quick one -- because it isn’t the entire supercomputer that gets a happy retirement. “We can never receive any storage -- hard drives, flash memories -- any kinds of those things where data has been stored," he said. "That is considered a security risk. But the computers themselves -- the processors, memories, motherboards -- they are not used to store that kind of data. So with the right person who advocates for it, there are ways the hardware could be repurposed.”
It would be easier if one could just leave the system where it is, Jacobson explained, but that isn’t what usually happens. And so serious labor is involved.
“These computers take up whole rooms of space," Jacobson noted. "Let’s say one computer is 4,000 machines. That could be 100 computer racks. And let’s say you have to install a new hard drive in each, and it takes 5 minutes to do that and multiply that by 4,000 -- all of a sudden that becomes a lot of hours.” And the same thing holds true when it comes to cabling and other peripheral systems. “There’s always the chance a cable breaks or the layout of a room isn’t the same, so the cable might not be the right length or the cooling system might be different.”
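The arithmetic adds up quickly: 4,000 machines at five minutes apiece works out to 20,000 minutes, or more than 330 hours of hands-on labor, before any cabling or troubleshooting even begins.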
The whole transition process can take two to three months, according to Jacobson. “We have rebuilt at least three computers with more than 1,000 machines in them, and it will take two months for a couple people to touch every computer and troubleshoot it," he said. "And because the hardware is old and finicky, you run into issues that you might not have with a new computer.”
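Assuming standard eight-hour workdays, two people working for two months amounts to roughly 650 to 700 labor hours -- on the order of 40 minutes per machine in a 1,000-node system.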
Many people feel the effort is worth it, however -- and that is why the program works, Jacobson said. “The primary purpose of PRObE is to essentially provide a unique research tool at scale for the National Science Foundation, and that means that we provide our supercomputers to researchers across the nation.”
And putting more supercomputers into the research community is also key to understanding the next generation of computer-intensive research at a strategic level. “The PRObE project is trying to provide a large number of computers for the people who are looking into seeing what the next generation of large computers is going to look like," Jacobson said. "It is much harder to do that if you only have access to 10 computers, rather than hundreds or even thousands of them.”
A current list of projects using PRObE resources is available on the PRObE website.
Editor's note: This article was changed Dec. 21 to clarify the funding of PRObE and to expand Andree Jacobson's statement on the program.