Alpha EV8 (Part 1): Simultaneous Multi-Threat


Wider Issue Superscalar: Modest Gain

It is fairly well known that increasing the instruction issue width of a processor brings diminishing returns in performance. Going from a single-issue (scalar) processor to a two-issue processor can nearly double the performance of many programs. Doubling the issue width again to four instructions brings a smaller increase, and so on. In addition, the performance increase that wider issue brings tends to be more erratic, especially for integer programs. That is to say, there is greater variation in speedup from program to program, depending on how easy it is to extract instruction-level parallelism (ILP) from the code.
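The falloff described above can be illustrated with a toy model. The sketch below is not a model of any Alpha core: it assumes unit instruction latency, perfect caches and branch prediction, and an unlimited instruction window. The function names and the `dep_prob` and `window` parameters are hypothetical choices made only for this illustration.

```python
import random


def make_program(n, dep_prob, window, seed):
    """Build a toy instruction stream (hypothetical model, not real code).

    Each instruction may depend on up to two earlier instructions drawn
    from the `window` instructions that precede it.
    """
    rng = random.Random(seed)
    deps = []
    for i in range(n):
        sources = set()
        for _ in range(2):
            if i > 0 and rng.random() < dep_prob:
                sources.add(rng.randrange(max(0, i - window), i))
        deps.append(sources)
    return deps


def simulate_ipc(deps, width):
    """Issue up to `width` ready instructions per cycle, oldest first.

    Assumes unit latency: a result is usable the cycle after its
    producer issues. Returns average instructions per cycle.
    """
    n = len(deps)
    issue_cycle = [None] * n
    remaining = list(range(n))
    cycle = 0
    while remaining:
        cycle += 1
        issued = []
        for i in remaining:
            if len(issued) == width:
                break
            # Ready only if every producer issued in an earlier cycle.
            if all(issue_cycle[j] is not None for j in deps[i]):
                issued.append(i)
        for i in issued:
            issue_cycle[i] = cycle
        issued_set = set(issued)
        remaining = [i for i in remaining if i not in issued_set]
    return n / cycle


if __name__ == "__main__":
    program = make_program(n=2000, dep_prob=0.6, window=8, seed=1)
    for width in (1, 2, 4, 8):
        print(f"issue width {width}: IPC = {simulate_ipc(program, width):.2f}")
```

Raising `dep_prob` or shrinking `window` makes the dependency chains tighter, which is a rough stand-in for code from which ILP is harder to extract; varying the seed gives a feel for the program-to-program variation in speedup.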

These effects can be seen in the approximate range of IPC (instructions per cycle) for four generations of Alpha processor core designs shown in Figure 2. The data for the EV5 and EV6 are taken from [7], while the EV4 and EV8 performance is estimated. (The EV7 is not included because it uses essentially the same core as the EV6.) It should be noted that a significant part of the IPC increase from EV5 to EV6 is due to the addition of dynamic scheduling (out-of-order execution) and a greatly improved cache and memory interface, so it is larger than could be expected from an increase in issue width alone.

Figure 2 IPC Performance of Alpha Processor Cores

Because the performance gain falls off with increasing issue width while the logic complexity increases at least quadratically, modern CISC designs (i.e., x86) have stalled at three-issue width, while RISC designs rarely exceed four-issue width. But high-end MPU designers are rarely dissuaded by a technical challenge, and the EV8 was the next logical step to achieve higher performance.

In a remarkable case of serendipity, Alpha designers became aware of academic research on a promising new idea called simultaneous multithreading (SMT) that could potentially overcome most of the inherent inefficiency of very wide issue superscalar processors with very little extra hardware. Alpha architect Joel Emer has publicly stated that the EV8 was going to be eight-issue wide anyway, to capture the straightforward but limited increase in IPC and achieve higher architectural performance than the EV6/EV7. But it is SMT technology that will really put the icing on the EV8 cake and strike fear into the hearts of proponents of Intel/HP’s Explicitly Parallel Instruction set Computing (EPIC) and Chip-level Multiprocessing (CMP) approaches to high-end processor design.

In the second part of this article I will describe SMT and its effect on the design and performance characteristics of the EV8. The impact of SMT on the EV8’s competitive posture relative to alternative design approaches like EPIC and CMP, and the implications for the future of MPU design, will be explored in the third and final installment.

[1] Benschneider, B. et al, A 1 GHz Alpha Microprocessor, Digest of Technical Papers, ISSCC 2000, p. 86.

[2] Benschneider, B. et al, A 300-MHz 64-b Quad-Issue CMOS RISC Microprocessor, IEEE JSSC, Vol. 30, No. 11, November 1995.

[3] Gaddis, N. et al, A 64-b Quad-Issue CMOS RISC Microprocessor, IEEE JSSC, Vol. 31, No. 11, November 1996.

[4] Farrell, J. et al, Issue Logic for a 600-MHz Out-of-Order Execution Microprocessor, IEEE JSSC, Vol. 33, No. 5, May 1998.

[5] Fischer, T. et al, Design Tradeoffs in Stall-Control Circuits for 600 MHz Instruction Queues, Digest of Technical Papers, ISSCC 1998, p. 232.

[6] Lo, J. et al, Converting Thread-Level Parallelism to Instruction-Level Parallelism via Simultaneous Multithreading, ACM Transactions on Computer Systems, Vol. 15, No. 3, August 1997, p. 322.

[7] Cvetanovic, Z. et al, Performance Analysis of the Alpha 21264-Based Compaq ES40 System, Compaq Computer Corporation, 2000.
