The fastest computers are going hybrid
During the past decade, the biannual list of the world's fastest supercomputers has become increasingly dominated by systems that use a mix of processors, including commodity processors produced by Intel and Advanced Micro Devices.
Automobiles aren’t the only machines taking a hybrid approach. Judging by the recent SC08 conference in Austin, Texas, the future of supercomputer design seems to be heading toward using multiple types of processors in a single system. That approach is a significant change in the supercomputing field, and like any major shift in technology, it comes with hidden problems.
In the past decade, systems that use commodity processors produced by Intel and Advanced Micro Devices have increasingly dominated the biannual Top500 list of the world’s fastest supercomputers compiled by laboratories at the Energy Department and a group of universities.
Although not as powerful as vector processors built specifically for the high-performance computer market, those chips are much less expensive and offer more processing power per dollar when bought in bulk.
Recently, however, developers began augmenting commodity processor-based supercomputers with specialty processors, such as floating-point accelerators, field-programmable gate arrays, repurposed graphics processing units (GPUs) and even IBM’s Cell Broadband Engine (Cell/BE) processors, which were designed for video game consoles.
For example, developers of the top computer on the most recent Top500 list — Los Alamos National Laboratory’s Roadrunner, a 1.1 petaflop IBM machine — augmented its AMD Opterons with IBM PowerXCell processors. And on the Green500 list, which is the Top500 reordered by power efficiency, the top seven computers all ran on IBM Cell/BE-based BladeCenter QS22 servers.
Why the shift? Better power usage.
“Power performance has become a very important metric as of late — some feel even more important than [simply] performance,” said Kaushik Datta, a graduate student in computer science at the University of California, Berkeley. Datta presented the results of a study he led about the best ways to design multicore systems at the SC08 conference.
The Top500 list ranks machines by how many floating-point operations per second (flops) they execute, whereas the Green500 ranks them by how many flops they execute per watt. In that realm, specialized processors rule. One industry expert at the conference estimated that the Cell/BE can produce about 14 flops for about 97 watts of power, and a GPU can produce about 2 flops per watt. Meanwhile, a generic x86 processor can produce only about 1 flops at that wattage.
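To make the difference between the two rankings concrete, here is a minimal Python sketch. The machine names and their flops and wattage figures are hypothetical, chosen only for illustration; they are not taken from either list or from the estimates quoted above. The sketch sorts the same machines twice, once by raw flops and once by flops per watt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str
    flops: float  # sustained floating-point operations per second
    watts: float  # power draw in watts

    @property
    def flops_per_watt(self) -> float:
        # Efficiency metric: operations per second delivered per watt drawn.
        return self.flops / self.watts


def rank_by_speed(machines):
    """Order machines fastest first (the raw-performance view)."""
    return sorted(machines, key=lambda m: m.flops, reverse=True)


def rank_by_efficiency(machines):
    """Order machines by flops per watt, most efficient first."""
    return sorted(machines, key=lambda m: m.flops_per_watt, reverse=True)


# Hypothetical systems; every number below is an illustrative assumption.
MACHINES = [
    Machine("commodity-cluster", flops=1.0e15, watts=5.0e6),
    Machine("hybrid-cluster", flops=8.0e14, watts=2.0e6),
    Machine("accelerator-blades", flops=2.0e14, watts=4.0e5),
]

if __name__ == "__main__":
    print("By speed:")
    for m in rank_by_speed(MACHINES):
        print(f"  {m.name}: {m.flops:.2e} flops")
    print("By efficiency:")
    for m in rank_by_efficiency(MACHINES):
        print(f"  {m.name}: {m.flops_per_watt:.2e} flops per watt")
```

With these made-up figures, the fastest machine lands last in the efficiency ordering, which is the kind of reshuffling that separates a power-efficiency ranking from a raw-speed one.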
“As you specialize the chip, you’re able to be much more efficient with what you are doing with the flops,” Timothy Mattson, a senior research scientist at Intel, said during a talk on the company’s experimental 80-core Tera-scale processor.
Of course, new architectures require developers to rework their code. The Cell/BE, which is still in its infancy, reportedly has an especially steep learning curve for programmers.
“Are you willing to put in the time to program” for these environments? Datta asked rhetorically. That is the question system builders and developers will have to ask themselves while hungrily eyeing performance gains.