# What’s Up With Willamette? (Part 2)


### How the Heck do They Do That?

The obvious conclusion is that Intel has rolled out some serious innovation in integer datapath microarchitecture for Willamette, and possibly some new circuit techniques too. I will put forward one possible explanation of how the Willamette ALUs are implemented. It may not be exactly what Intel has done, but the odds are it is not too far off. First of all, the double clocking of the ALU has a lot of people amazed. There are three basic techniques for making a small section of circuitry operate at twice the global clock rate. The first is to divide the unit into two sections, clock one section on the rising edge of the clock and the other on the falling edge, and then interleave and combine the results. The second is to employ flip-flops (memory circuits) that operate on both edges of the clock. The third is simply to provide the unit with a special clock signal that toggles at twice the frequency of the processor clock.

Intel has apparently chosen the third option. In U.S. patent 6,023,182, Intel discloses a clock buffering circuit that, among other things, generates an output clock at twice the frequency of the input clock using one-shot pulse generators triggered on both the rising and falling edges of the input clock. This circuit can be dropped down anywhere on a microprocessor to drive a functional block, such as an ALU, at twice the rate of the globally distributed processor clock. The advantage of this scheme is that it avoids the formidable challenge of generating and distributing both a regular clock and a double-frequency clock over the entire device. The operation and likely use of this clock-frequency-doubling buffer in Willamette is shown in Figure 6.

Figure 6. Hypothetical Willamette ALU Clocking scheme
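The frequency-doubling idea can be illustrated with a small behavioral sketch in Python. This is only a toy model of "fire a one-shot pulse on every edge of the input clock", not a description of the patented circuit. The function names, the sampled-waveform representation, and the pulse width are all assumptions made for illustration.

```python
def input_clock(n_cycles, samples_per_cycle):
    """Square wave: high for the first half of each cycle, low for the second."""
    half = samples_per_cycle // 2
    return [1 if (t % samples_per_cycle) < half else 0
            for t in range(n_cycles * samples_per_cycle)]


def double_clock(clk, pulse_width):
    """Emit a fixed-width one-shot pulse on every edge (rising or falling) of clk.

    pulse_width is an assumed parameter; it must be shorter than half an
    input cycle, otherwise consecutive pulses merge into one.
    """
    out = [0] * len(clk)
    prev = 0  # assume the clock line idles low before the first sample
    for t, level in enumerate(clk):
        if level != prev:  # any transition triggers the one-shot
            for k in range(t, min(t + pulse_width, len(clk))):
                out[k] = 1
        prev = level
    return out


def count_rising_edges(signal):
    """Count low-to-high transitions, assuming the line starts low."""
    prev, count = 0, 0
    for level in signal:
        if level == 1 and prev == 0:
            count += 1
        prev = level
    return count


clk = input_clock(n_cycles=4, samples_per_cycle=16)
fast_clk = double_clock(clk, pulse_width=4)
print(count_rising_edges(clk), count_rising_edges(fast_clk))
```

Over four input cycles the model produces eight output pulses, one per input edge. Because the doubling happens locally from the ordinary clock, only the single global clock has to be distributed across the chip.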

So, we know how Intel can clock its ALUs at 3.0 GHz while surrounded by a sea of logic operating at 1.5 GHz, but I still haven’t explained how an addition can apparently be performed in the 0.20 ns or so needed to fit into a single stage of a pipeline operating at 3.0 GHz. Could Intel have discovered a new way of performing addition that takes only 40% of the time needed by the standard adder circuits used today? This possibility is extremely remote. Engineers and mathematicians have been studying binary math and arithmetic circuit design since the days when computers were built from electromechanical relays 55 years ago, and all the possibilities have been pretty well gone over. There are known techniques, such as logarithmic adders, that can add faster than the carry lookahead design typically used in microprocessors, but the speedup for a 32-bit addition is not that great and comes at a fairly large cost in area.
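To make "logarithmic adder" concrete, here is a minimal Python sketch of one well-known member of that family, the Kogge-Stone parallel-prefix adder. It is an illustration only and makes no claim about what Intel uses. The function names are hypothetical, and the ripple-carry version is included purely as a simple reference for checking results, not as a stand-in for carry lookahead.

```python
def ripple_carry_add(a, b, width=32):
    """Reference adder: the carry walks through every bit position in turn.

    Returns (sum modulo 2**width, number of carry stages in the chain).
    """
    carry, result = 0, 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        result |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return result, width


def kogge_stone_add(a, b, width=32):
    """Parallel-prefix adder: carries are resolved in log2(width) combining levels.

    Returns (sum modulo 2**width, number of prefix levels used).
    """
    mask = (1 << width) - 1
    g = a & b & mask          # generate: this bit produces a carry by itself
    p = (a ^ b) & mask        # propagate: this bit passes an incoming carry on
    half_sum = p              # keep the per-bit sum before carries are added
    levels, dist = 0, 1
    while dist < width:
        # Merge each group with the group 'dist' bits below it.
        g = (g | (p & (g << dist))) & mask
        p = (p & (p << dist)) & mask
        dist *= 2
        levels += 1
    carries = (g << 1) & mask  # carry into bit i comes from the group ending at i-1
    return half_sum ^ carries, levels


print(ripple_carry_add(123456789, 987654321))
print(kogge_stone_add(123456789, 987654321))
```

For 32 bits the prefix network needs 5 combining levels against a 32-stage ripple chain. The level count is only a rough proxy for delay: it ignores fan-out and the long wires each level needs, which is where the area cost mentioned above comes from.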