Intel’s Teraflops Research Project
One of Intel’s research areas is what the company calls “Terascale Computing”. This research is really about working out how computer architecture will have to change over the next decade or so. One element of this research, a project code-named Polaris, is a chip that delivers over a teraflop of performance. The first silicon prototype of this Teraflops chip was presented at ISSCC 2007 by members of the design team.
“Terascale Computing” is an Intel marketing term for research projects that are investigating how to take advantage of future process scaling and exploit greater parallelism. The first few generations of multicore products have been relatively straightforward extensions of conventional thinking. Right now, most MPU vendors are shipping products with 2-4 identical cores. For the next generation, integrating 4-8 cores with more cache, more memory bandwidth and more system functionality seems fairly reasonable. Beyond that point, however, architects cannot keep using the same tricks with 16 or 32 cores. There are a variety of problems: yield, power consumption, thermal density, memory bandwidth and latency, ease of programming, efficiency, etc. As a simple example, look at the CELL microprocessor, which includes one PowerPC core and 8 SIMD cores, connected using a ring topology. CELL is dominated by logic rather than SRAM arrays and, according to IBM, has 10-20% yields, which is fairly poor. Sony’s solution to this problem was to require only 7 working SIMD units, which increases the number of good dice per wafer.
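The effect of requiring only 7 of 8 SIMD units can be illustrated with a rough back-of-the-envelope model. The sketch below assumes that each unit is defect-free independently with the same probability, and the 80% per-unit figure is a made-up number chosen for illustration, not data from IBM or Sony. It also ignores the PowerPC core and the rest of the die, so it only shows the direction of the effect, not real CELL yields. The function name chip_yield is hypothetical.

```python
from math import comb


def chip_yield(units: int, required: int, unit_yield: float) -> float:
    """Probability that at least `required` of `units` units are defect-free.

    Assumes every unit fails independently with the same probability,
    which is a simplification of real defect distributions.
    """
    return sum(
        comb(units, good) * unit_yield**good * (1 - unit_yield) ** (units - good)
        for good in range(required, units + 1)
    )


# Hypothetical per-unit yield, chosen only for illustration.
UNIT_YIELD = 0.80

all_eight = chip_yield(units=8, required=8, unit_yield=UNIT_YIELD)
seven_of_eight = chip_yield(units=8, required=7, unit_yield=UNIT_YIELD)

print(f"all 8 units must work:   {all_eight:.1%}")
print(f"any 7 of 8 units suffice: {seven_of_eight:.1%}")
```

Under these assumptions, tolerating a single bad unit roughly triples the fraction of usable dice, which is the kind of gain that makes redundancy attractive in logic-heavy designs.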
The whole purpose of Terascale Computing is to explore the possibilities and figure out what will work (e.g. redundancy) and what won’t (e.g. dense logic with no redundancy) for the product groups at Intel. The Terascale chip is purely a research vehicle and is not going to be turned into a product. The teams responsible for this project were spread across several Intel sites: Washington, California, Oregon, Arizona and India. The main goals of the project are to explore various options for clock distribution, interprocessor communication, power management and general design philosophy.