4 on the floor

By Edmund X. DeJesus

July 27, 2007

Quad-core chips promise better performance and lower power consumption. Best of all, they'll fit into dual-core sockets.

The allure of new quad-core chips from Advanced Micro Devices and Intel is pretty straightforward: faster processing. And, like the dual-core systems that have become so common, they can deliver that speed while reducing heat and power consumption.

"All [information technology] used to say was, 'Give me more clock speed,'" said John Fruehe, worldwide business manager for AMD's Opteron chip. But physics caught up with chip design: High clock speeds produce a lot of heat, which puts a limit on how fast chips can operate. Multiple cores, even running at slower speeds, increase the processing power of the chip.

"Rather than doing one thing faster, multicores do many things at the same time, each slightly slower," said Fruehe.

"Chip manufacturers believe quad-core is better and faster," said Roger Kay, president and founder of Endpoint Technologies Associates. As inevitably happens with every generation of chip, dual-core processors have run into the usual walls: performance, power, heat and cost. With quad-cores, manufacturers can decrease the clock speed without sacrificing performance, which lowers the heat and accompanying power needs.

All this should sound good to agencies: better performance, lower power demands and less heat to handle. To make it even easier, most modern operating systems, including Windows, Linux and Solaris, innately recognize multicore chips, support multithreading and will take care of multitasking chores pretty handily, even if the applications themselves aren't multithreaded. And if you really want to wring the cycles out of quad-core chips, you can always parallelize your application: After all, you'll probably only have to do it once.

Even desktop machines can take advantage of quad-cores: With operating-system help, e-mail can run on one core, a Web browser on another, antivirus protection on a third and word processing on the fourth. But servers can derive an even greater benefit. For example, a Web server, which must routinely handle multiple separate visitors and requests, can run each one on a different core.
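To make the Web server case concrete, here is a minimal Python sketch, not taken from any particular server product. It assumes a pool of four workers, one per core, and a hypothetical handle_request function standing in for the real work of serving a visitor. The operating system, not the program, decides which core each worker actually lands on.

```python
from concurrent.futures import ThreadPoolExecutor

CORES = 4  # assumption: one worker per core on a quad-core chip


def handle_request(visitor_id: int) -> str:
    """Hypothetical stand-in for the work of serving one visitor."""
    return f"response for visitor {visitor_id}"


def serve(visitor_ids: list[int], workers: int = CORES) -> list[str]:
    """Serve independent requests concurrently; results keep request order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, visitor_ids))
```

Because the requests are independent, no worker has to wait on another. In standard Python, threads suit requests that mostly wait on disk or network; compute-heavy handlers would more likely use a process pool to keep all four cores busy.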

AMD will introduce its new quad-core chip, part of the K10 family, by the end of the summer. Code-named Barcelona in its server version and called Phenom on the desktop, the new chip will be a "true" quad-core, that is, one with four cores on the same die. Barcelona chips will be made with an etching process capable of placing rows of transistors 65 nanometers apart. The chips will offer 512 KB of dedicated L2 cache per core, plus 2 MB of shared L3 cache.

Easy switch

Probably of greater importance to agencies is AMD's decision to make its new quad-core chips fit in the same Opteron sockets as existing dual-core chips. So you can move from dual-core to quad-core by swapping chips and upgrading the BIOS.

The budgetary implications of this are interesting. For many agencies, money for new hardware comes from a different pot than money for upgrades. AMD's design means that agencies can upgrade to quad-cores without dipping into the capital budget.

There's similar good news about power and thermal behavior. Barcelona will use the same amount of power as the current dual-core chips and produce the same amount of heat, or possibly less. This means that no changes are needed in power or temperature support to handle the quad-cores. As the user base grows and agencies must expand capacity, they can accomplish a lot more with the same platforms in the same physical space.

Barcelona is expected to deliver improvements of 50 percent to 80 percent in both performance and performance per watt over current Opteron processors.

Its successor, code-named Shanghai, also will use the same socket as existing dual-core chips. AMD likely will move to 45-nanometer technology by early 2008, which will allow for either faster speeds or lower power and temperature.

Intel, meanwhile, released a quad-core chip in 2006, although some question whether it is a true quad-core. Unlike AMD's Barcelona, which is expected to include four cores on one die, Intel's current design uses two dual-core dies sharing a bus. Does that make a difference? "The distinction is functionally irrelevant," Kay said.

The reason for Intel's approach is manufacturability. "The bigger the die, the smaller the yield," said Nick Knupffer, global communications manager at Intel. By making quad-core chips this way, Intel can get higher yields, lower costs, faster time to market with new designs and quad-core-level processing at lower prices.

Intel's quad-core chips fit into the same Xeon sockets as its dual-core chips, similar to AMD's planned quad-core strategy. This means that it's possible to upgrade from dual-core Intel chips to quad-core by swapping chips. "The quad-cores have the same thermal envelope as the dual-cores," Knupffer said. Intel achieves this by throttling back the clock speed on the quad-cores. Despite the reduced clock speed, Intel anticipates a 45 percent increase in performance for high-bandwidth applications.

However, Intel will also offer unthrottled versions of its quad-cores, where the chips can operate at their highest possible speeds. "These are for the power markets, where performance is at a premium," Knupffer said. Computers using unthrottled quad-core chips must use appropriate cooling and power.

Intel's next quad-core chip for servers will be in production by the end of this year. Code-named Penryn, the chip offers not only several features that will be helpful for users but also genuine advances in technology. Penryn will use a 45-nanometer process, meaning features only 200 atoms wide, to reduce the size of the chip, allowing more chips to come out of the same wafer for more economical production.

The company also will use new materials in Penryn. It will replace the traditional silicon dioxide insulator in the transistor gate with a material based on the element hafnium, one of the new, so-called high-k dielectrics, and use metal gates. The result will be cooler, faster processors that consume less power.

Penryn also is expected to have 45 new chip-level instructions. The new instruction set, called SSE4, is oriented toward accelerating 3-D processing and media applications, such as video and games, where it should help with rendering.

One successor to Penryn will be called Nehalem, scheduled for release in 2008 (in 45-nanometer and, eventually, 32-nanometer versions). Nehalem will have up to eight cores. Another, Gesher, scheduled for 2010, will be a 32-nanometer offering. Intel aims to release 22-nanometer chips in 2011, possibly as a version of Gesher.

Divisible by four?

The proliferation of processors within a computer does create one problem: How do you actually use all the cores?

The thinking used to be that the best, and only, way to make the most of multicore chips was to assign one task per core. For a game, this might mean putting background graphics on one core, motion on another, sound on another and input on another. For typical agency applications, perhaps the best approach is to rewrite the software to make it more parallel, so the cores can handle separate computational tasks. Of course, that works only if you have a software development budget you don't know how to spend any other way. Most agencies don't have that luxury.

"Realistically, you can use as many cores as you can get," Kay said. For example, suppose you're trying to render video. Strict parallelization might not be necessary. Instead, the software can divide up the work by time, assigning a chunk of the timeline to each core, then stitching the results back together at the end. Some programs are simply not inherently parallel, but that doesn't mean you can't take advantage of all those cores.
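The divide-by-time approach to video might be sketched in Python as follows. This is an illustration under stated assumptions rather than how any real renderer works: render_chunk is a hypothetical stand-in for the actual rendering step, frames are represented by plain numbers, and the timeline is cut into four pieces to match a quad-core chip.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def render_chunk(frames: list[int]) -> list[str]:
    """Hypothetical renderer: turns frame numbers into rendered frames."""
    return [f"frame-{n}" for n in frames]


def split_by_time(frames: list[int], parts: int) -> list[list[int]]:
    """Cut the timeline into `parts` contiguous chunks of near-equal length."""
    size, extra = divmod(len(frames), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one leftover frame.
        end = start + size + (1 if i < extra else 0)
        chunks.append(frames[start:end])
        start = end
    return chunks


def render_video(frames: list[int], pool: Executor, parts: int = 4) -> list[str]:
    """Render each chunk on its own worker, then stitch the chunks in order."""
    chunks = split_by_time(frames, parts)
    rendered = pool.map(render_chunk, chunks)  # map preserves chunk order
    return [frame for chunk in rendered for frame in chunk]
```

The caller supplies the worker pool, for example ThreadPoolExecutor(max_workers=4). For compute-heavy rendering in standard Python, a ProcessPoolExecutor would be the likelier choice, since separate processes can occupy separate cores.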

You don't have to rely on blind luck to find out whether your applications can benefit from being parallelized across multiple cores. Analysis tools can examine your code and tell you which parts are independent and can safely run in parallel. Other parts, which need the same data or intermediate results and therefore aren't independent, won't work well in parallel: You'd have to stop core X until core Y is done. With the results from the analyzer, you can make your application as parallel as it can be.
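A rough way to estimate what such an analysis is worth is Amdahl's law, a standard rule of thumb that the sources quoted here did not cite: if a fraction p of a program's running time can run in parallel on n cores, the best possible speedup is 1 / ((1 - p) + p / n). The Python sketch below simply evaluates that formula; the function name is an illustrative assumption.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of a program runs in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at full length; the parallel part is split across cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)
```

By this estimate, a program that is half parallel gains at most 1.6 times on four cores, while one that is 90 percent parallel gains about 3.1 times. The dependent parts, where core X waits for core Y, set the ceiling.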

If you do happen to be developing applications from scratch, try to conceive of the application as parallel from the start. There are plenty of development tools available from chip manufacturers to help you write software in parallel.

Edmund X. DeJesus is a freelance technical writer in Norwood, Mass.
