Intel has come up with a surprising number of microprocessor design innovations to power its new Willamette 32-bit x86 processor, which is targeted at replacing the venerable, but aging, P6 core. The trace cache extends previous work in this area in several important ways, and solves some of the known nagging problems. The trace cache control logic handles complex x86 instructions separately so as to reserve cache capacity for the common and fast instructions that translate to just one or a few uops. It also removes the x86 decoders from the critical execution path. For the majority of the time, when previously translated uops are re-executed in program loops, it decouples the uop issue rate from the number of x86 decoders.
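The decoupling effect can be illustrated with a toy model. The sketch below is not Intel's design: the class name ToyTraceCache, the helper run_loop, the MAX_CACHED_UOPS cutoff, and the policy of simply not storing complex instructions are all assumptions made for illustration. It only shows why a loop that hits in the cache needs the decoder once per instruction rather than once per iteration.

```python
# Toy model only: the cutoff and the caching policy are assumptions,
# not disclosed details of the Willamette trace cache.
MAX_CACHED_UOPS = 4  # hypothetical cutoff for "one or a few uops"


class ToyTraceCache:
    def __init__(self, decode):
        self.decode = decode      # hypothetical decoder: address -> list of uops
        self.lines = {}           # address -> previously translated uops
        self.decoder_uses = 0     # times the x86 decoder was needed
        self.complex_uses = 0     # times the separate complex path was taken

    def fetch(self, addr):
        if addr in self.lines:
            return self.lines[addr]      # hit: decoder is bypassed
        uops = self.decode(addr)
        self.decoder_uses += 1
        if len(uops) > MAX_CACHED_UOPS:
            self.complex_uses += 1       # handled separately, not stored
        else:
            self.lines[addr] = uops      # keep simple translations for reuse
        return uops


def run_loop(cache, body, iterations):
    """Fetch every address in `body` for `iterations` passes; return uops issued."""
    issued = 0
    for _ in range(iterations):
        for addr in body:
            issued += len(cache.fetch(addr))
    return issued


# Three simple instructions in a loop executed 100 times.
program = {0: ["add"], 1: ["load", "add"], 2: ["cmp"]}
cache = ToyTraceCache(lambda addr: program[addr])
issued = run_loop(cache, [0, 1, 2], 100)
print(issued, cache.decoder_uses)  # prints "400 3"
```

In this model 400 uops are issued after only three trips through the decoder, which is the sense in which issue rate no longer depends on decoder count once a loop is resident.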
The Willamette’s double-clock-rate arithmetic and logic units (ALUs) are an interesting development in an area of microprocessor design that has changed very little since the earliest days. The double-clock-rate ALUs provide the expected benefit of allowing two physical ALUs to perform the work of four logical ALUs as far as the rest of the processor is concerned. The biggest surprise of all is that dependent chains of instructions can issue to this superpipelined ALU at intervals of half a processor clock. The exact details of how Intel accomplishes this, and what restrictions apply, are not yet disclosed, but I have shown one possible way this feat could have been done without imposing onerous issue rules.
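The arithmetic behind these two claims can be written out in a few lines. This is a back-of-the-envelope sketch, assuming each simple ALU operation completes in one ALU tick and that dependent operations can always issue back to back; since Intel has not disclosed the actual restrictions, real chains may do worse. The function names are made up for this example.

```python
from fractions import Fraction


def logical_alus(physical_alus, alu_clock_multiplier):
    """ALU operations that can start per processor clock."""
    return physical_alus * alu_clock_multiplier


def dependent_chain_cycles(chain_length, alu_clock_multiplier):
    """Processor clocks for a chain in which each op needs the previous result.

    Assumes one op per ALU tick with no issue restrictions (an idealization).
    """
    return Fraction(chain_length, alu_clock_multiplier)


print(logical_alus(2, 2))            # 4: two physical ALUs look like four
print(dependent_chain_cycles(8, 1))  # 8: conventional single-rate ALU
print(dependent_chain_cycles(8, 2))  # 4: double-rate ALU, half-clock issue
```

The first figure is pure throughput and would hold for any double-pumped unit. The halved chain latency is the surprising part, because it requires a result to be forwarded to a dependent operation within half a processor clock.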
The effect of Willamette on the competitive landscape will be enormous. It puts the onus back on AMD to seriously improve on the K7 and make more than cosmetic changes to its basic design to create the K8. Outside the x86 world, Willamette represents a double-edged sword for Intel. It will keep up the heavy pressure on RISC processors in the workstation and low-end server market, especially in the form of the Foster high-end variant. It might even challenge the mighty Alpha EV68 for the SPECint crown. On the other hand, it will make it very difficult for Intel’s own Merced/Itanium IA-64 processor to make a mark for itself on the basis of performance, except in heavily floating-point-intensive applications. In summary, the Willamette appears to be a tremendous technical achievement. If Intel can put even half of the imagination and innovation shown by its Hillsboro design team into McKinley, then its competitors in the 64-bit market could suffer the same fate as competing x86 designs when Willamette ships in volume.