The Spider and the Mountain


Multithreading: The Spider’s Bite

The enormous potential of SMT is shown by the expectation that it can approximately double the instruction throughput of an already impressive monster like the EV8 at the cost of only about 6% extra die area over a single threaded version of the design. That is a bigger speedup than can typically be achieved by duplicating the entire MPU, as is done in a 2 way SMP system! The EV8 combines an incredibly powerful processor core, unequaled at exploiting both ILP and TLP, with the EV7 scalability hardware (integrated router, interprocessor communication links, and memory controllers). An aggressively out-of-order, 8 wide issue superscalar RISC core like the EV8 would likely have achieved leadership performance all on its own. But when its SMT capabilities are combined with existing, proven, auto-parallelizing compiler technology, even many FP intensive compute tasks written as single threaded programs might see a further 30% to 50% speedup.
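The area argument above can be put into rough numbers. The following is only a back-of-the-envelope sketch: the 2x throughput and 6% area figures come from the text, while the 1.8x speedup assumed for a 2 way SMP system is a hypothetical value chosen for illustration, not a figure from this article.

```python
# Figures from the text: SMT roughly doubles instruction throughput
# for about 6% more die area than a single threaded core.
SMT_SPEEDUP = 2.0
SMT_AREA = 1.06

# Assumption (NOT from the article): a 2 way SMP system reaches a 1.8x
# speedup while using twice the processor silicon.
SMP_SPEEDUP = 1.8
SMP_AREA = 2.0


def throughput_per_area(speedup: float, area: float) -> float:
    """Throughput relative to one single threaded core, per unit of relative die area."""
    return speedup / area


smt_efficiency = throughput_per_area(SMT_SPEEDUP, SMT_AREA)
smp_efficiency = throughput_per_area(SMP_SPEEDUP, SMP_AREA)

print(f"SMT: {smt_efficiency:.2f}x throughput per unit of die area")
print(f"SMP: {smp_efficiency:.2f}x throughput per unit of die area")
```

Under these assumptions SMT delivers roughly twice the throughput per unit of silicon that duplicating the MPU does. The exact gap depends entirely on the SMP speedup assumed.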

It must come as no small relief to Compaq’s competitors in the high end systems market, as well as to Intel, that EV8 will never reach silicon, let alone commercial shipment. Had it been delivered in anything approaching a timely manner, it would likely have achieved a performance lead over all other 64 bit competitors at least as great as the one seen at the introductions of the EV4, EV5, or EV6. The potential of the EV8 design is shown in Table 2 in the context of the McKinley and EV7 as well as a 0.13 um, third generation IA64 device code named Madison.

Table 2: EV8 in Context

| | McKinley | EV7 | Madison | EV8 |
|---|---|---|---|---|
| Introduction | 2H02 | 2H02 | 2H03 est. | 2H04 est. |
| Technology | 0.18 um Al | 0.18 um Cu | 0.13 um Cu | 0.13 um SOI/Cu |
| Clock Speed | 1.0 GHz | 1.2 GHz | 1.8 GHz est. | 1.8 GHz |
| On-chip Cache | 3.0 MB L3 | 1.75 MB L2 | 6.0 MB L3 | 3.0 MB L2 |
| SPECint_base2k | 800 est. | 1050 est. | 1250 est. | 2000 est. |
| SPECfp_base2k | 1400 est. | 1500 est. | 2200 est. | 3200/4500* est. |

*Auto-parallelized, run with SMT.
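As a quick sanity check, the ratios implied by Table 2 can be worked out directly. The sketch below simply copies the table's estimated SPEC scores (they are estimates, not measurements) and treats the second SPEC row as the floating point result; the dictionary and function names are illustrative only.

```python
# Estimated scores copied from Table 2.
specint = {"McKinley": 800, "EV7": 1050, "Madison": 1250, "EV8": 2000}
specfp = {"McKinley": 1400, "EV7": 1500, "Madison": 2200, "EV8": 3200}
EV8_FP_WITH_SMT = 4500  # auto-parallelized, run with SMT (the starred figure)


def ratio_to(scores: dict, chip: str, baseline: str) -> float:
    """Score of `chip` expressed as a multiple of `baseline`."""
    return scores[chip] / scores[baseline]


int_lead = ratio_to(specint, "EV8", "Madison")
fp_lead = ratio_to(specfp, "EV8", "Madison")
smt_gain = EV8_FP_WITH_SMT / specfp["EV8"] - 1.0

print(f"EV8 vs Madison, integer:        {int_lead:.2f}x")
print(f"EV8 vs Madison, floating point: {fp_lead:.2f}x")
print(f"EV8 FP gain from auto-parallelization with SMT: {smt_gain:.1%}")
```

The starred figure works out to a gain of about 41% over the single threaded floating point estimate, which sits inside the 30% to 50% range quoted earlier for auto-parallelized FP code.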

The EV8’s cancellation is also a major blow to the proponents of SMT. Although the so-called hyper threading capability in the Pentium 4 is basically the same form of SMT as would have been implemented in the EV8, the x86 chip’s frugal execution resources, inherent chip and system level bottlenecks, and support for only two threads seem to limit the increase in instruction throughput to about 20% in practice. There are reports that the UltraSPARC-V will incorporate SMT, but Sun has failed to demonstrate mastery of out-of-order execution processor design, virtually a precursor technology to SMT. That fact, as well as its worsening business situation, suggests healthy scepticism about when, or even if, an SMT capable US-V might reach the market. A cynical observer might already be pondering which remaining MPU players Sun might approach to pawn off its US-V design team. To “compaqt” its workforce, so to speak.

