The heated competition between Intel and AMD for the lucrative, high-volume PC marketplace has pushed x86 CISC ISA-based microprocessors into 0.18 um CMOS processes well ahead of any RISC ISA-based processor. Besides higher clock rates, 0.18 um CMOS also permits the integration of large (256 Kbyte), highly associative, low-latency L2 caches within the processor. As a result, x86 processors have temporarily eclipsed the integer performance of virtually every RISC processor. However, as always, x86 processors remain far behind nearly every RISC processor family outside the embedded control market in floating point performance.
There is no doubt about the power and influence of the x86 processor market within the semiconductor industry. Last year over 100 million x86 processors were sold, bringing in well over $20 billion in revenue and billions of dollars in profit. Contrast that with, say, the Compaq Alpha processor, which might sell several hundred thousand devices per year and bring in several hundred million dollars of revenue to Compaq’s semiconductor partners. This is why new x86 cores come to market at a much faster pace than high-end RISC processors and transition to newer and better semiconductor processes with less delay.
Table 2 presents a case-study comparison of best-of-breed modern x86 and RISC processor designs built in approximately equivalent CMOS process technologies. This comparison is made more intriguing by the fact that several key designers of the EV67 core (Jim Keller and Dirk Meyer) left Compaq/DEC and went on to help design the K7. The two processors also share the same system bus architecture.
| | AMD K7 Athlon | Compaq Alpha EV67 |
|---|---|---|
| Technology | 0.25 um CMOS | 0.28 um CMOS |
| Die Size | 184 mm² | 205 mm² |
| Transistors | 28.1 million | 15.2 million |
| Package | 240 SEC module | 588 CPGA |
| Clock Rate (MHz) | 700 | 700 |
| SPECint95 | 31.7 (1) | 39.1 (2) |
| SPECfp95 | 24.0 (1) | 68.1 (2) |
| Note | (1) with 512 Kbyte external cache | (2) with 8 Mbyte external cache |
Although it appears that the performance gap has mostly closed for integer code, faster 0.28 um EV67s (reportedly 833 MHz) are already in beta testing in systems at customer sites, while faster K7s require a more advanced 0.18 um process. Unsurprisingly, the huge floating point performance gap is still present, although some of that is attributable to the disparity in external cache size. Also note that the K7 has almost twice as many transistors as the EV67 despite the fact that both designs implement 64 Kbyte instruction and data caches. This reflects the CISC “complexity tax” imposed on modern x86 processor designs regardless of the similarities in the back-end execution engine.
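The ratios behind these remarks can be checked directly from the figures in Table 2. The following is only a small sketch in Python; the dictionary layout and the `ratio` helper are illustrative names introduced here, not anything from the processors' documentation.

```python
# Figures copied from Table 2 (both parts at 700 MHz).
# The structure and names below are illustrative assumptions.
table2 = {
    "K7":   {"transistors_millions": 28.1, "specint95": 31.7, "specfp95": 24.0},
    "EV67": {"transistors_millions": 15.2, "specint95": 39.1, "specfp95": 68.1},
}

def ratio(metric, numerator, denominator):
    """Return table2[numerator][metric] divided by table2[denominator][metric]."""
    return table2[numerator][metric] / table2[denominator][metric]

transistor_ratio = ratio("transistors_millions", "K7", "EV67")
int_ratio = ratio("specint95", "EV67", "K7")
fp_ratio = ratio("specfp95", "EV67", "K7")

print(f"K7 / EV67 transistor count: {transistor_ratio:.2f}x")
print(f"EV67 / K7 SPECint95:        {int_ratio:.2f}x")
print(f"EV67 / K7 SPECfp95:         {fp_ratio:.2f}x")
```

On these numbers the K7 carries roughly 1.85 times the transistors of the EV67, while the EV67 leads by about 1.23 times on SPECint95 and about 2.8 times on SPECfp95. Keep in mind that the two SPEC results were measured with different external cache sizes, as the table notes.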
While the integer performance gap between the best RISC and CISC processors has closed over the last thirteen years, the deep and fundamental differences between the two architecture design concepts have not. The “RISC and CISC are converging” viewpoint is a fundamentally flawed concept that goes back to the i486 launch in 1989 and is rooted in widespread ignorance of the difference between instruction set architectures and the details of physical processor implementation. Modern out-of-order execution x86 and RISC processors *do* have very similar organization in their back-end execution engines, both of which may contain 40 or more physical renaming registers. While RISC data paths are driven directly by RISC instructions, x86 data paths are similarly driven by sequences of simple, shallowly encoded microcode-like control words called micro-ops, or provocatively, RISC-ops.
Even if the physical implementation advantages RISC designs enjoy over CISC designs, like the x86, could be reduced to zero (a prospect which is demonstrably remote), it doesn’t change the fact that the ISA, the programming model targeted by compilers, is vastly different. The modern x86 processor might have 40 physical general purpose registers for renaming, but the compiler can only target the 8 GPRs visible in the ISA. It doesn’t matter that the modern x86 processor has a back-end execution engine that is controlled by RISC-like control words; these micro-ops are inaccessible from the outside world and the compiler cannot target them. The x86 compiler cannot perform many of the standard RISC compiler optimization techniques that strongly depend on large register sets, three address instruction formats, and the absence of non-register based dependencies between instructions.
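To make the three address point concrete, here is a toy sketch in Python of what a compiler must do when it lowers three-address operations to a two-address format, where the destination doubles as the first source (as in the x86 integer instructions). The tuple encoding, the register names, the `tmp` scratch register, and the `lower_to_two_address` function are all hypothetical illustrations; real compilers and real x86 encodings are considerably more involved.

```python
# Toy model: a three-address op is (opcode, dst, src1, src2), meaning
# dst = src1 <op> src2. A two-address op is (opcode, dst, src), meaning
# dst = dst <op> src. All names here are illustrative assumptions.
COMMUTATIVE = {"add", "mul", "and", "or", "xor"}

def lower_to_two_address(ops, scratch="tmp"):
    """Lower three-address ops to two-address ops, inserting moves as needed."""
    lowered = []
    for op, dst, src1, src2 in ops:
        if dst == src1:
            # Destination already holds the first source: no copy needed.
            lowered.append((op, dst, src2))
        elif dst == src2 and op in COMMUTATIVE:
            # Operands can be swapped, so dst serves as the first source.
            lowered.append((op, dst, src1))
        elif dst == src2:
            # Non-commutative and dst aliases src2: save src2 in a scratch
            # register first (which itself costs one more register).
            lowered.append(("mov", scratch, src2))
            lowered.append(("mov", dst, src1))
            lowered.append((op, dst, scratch))
        else:
            # General case: copy src1 into dst, then operate destructively.
            lowered.append(("mov", dst, src1))
            lowered.append((op, dst, src2))
    return lowered

# r3 = r1 + r2 and r4 = r1 * r2, with r1 and r2 both still needed afterwards.
three_address = [
    ("add", "r3", "r1", "r2"),
    ("mul", "r4", "r1", "r2"),
]
two_address = lower_to_two_address(three_address)

print(len(three_address), "three-address instructions")
print(len(two_address), "two-address instructions")
for instruction in two_address:
    print(instruction)
```

In this small example the two three-address operations become four two-address instructions, because each result needs an extra register copy to avoid destroying a source value that is still live. With only 8 architecturally visible GPRs, those copies and scratch registers compete with the program's own values for space.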
This fundamental and inescapable nature of the difference between RISC and CISC computer design is the driving force behind both Intel’s development of the 64-bit RISC-like IA-64 family of processors to eventually replace x86, and AMD’s decision to add a RISC-like large flat floating point register file and three address floating point instructions to beef up the performance of the 64-bit extended x86-64 ISA for its upcoming K8 processor.