Semiconductor process evolution and x86 market dynamics are forcing MPU products for the low power mobile market to diverge from those intended for the higher power desktop and server markets. The increasing importance of leakage current power consumption in current and future CMOS processes means that mobile MPU designers cannot keep throwing ever more logic transistors at chasing the tail of diminishing returns of IPC improvement (to paraphrase IBM research staff member Bob Montoye). Success lies instead in maximizing the performance of the transistors that are already there. That is best accomplished by striking a balance between designing for high clock frequency and designing for high IPC.
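The balance between clock frequency and IPC comes down to simple arithmetic: delivered instruction throughput is the product of the two. The sketch below only illustrates that relationship. The function name and all three design points are hypothetical values chosen for the example, not measurements or estimates of any real processor, and they do not by themselves show that a balanced design wins.

```python
def relative_performance(ipc: float, clock_ghz: float) -> float:
    """Instructions completed per nanosecond: IPC times clock frequency in GHz."""
    return ipc * clock_ghz


# Hypothetical design points (assumed values, for illustration only).
designs = {
    "frequency-oriented": {"ipc": 1.0, "clock_ghz": 1.5},
    "IPC-oriented": {"ipc": 1.5, "clock_ghz": 1.0},
    "balanced": {"ipc": 1.3, "clock_ghz": 1.3},
}

for name, d in designs.items():
    perf = relative_performance(d["ipc"], d["clock_ghz"])
    print(f"{name}: {perf:.2f} instructions/ns")
```

With these assumed numbers, the two lopsided designs deliver the same throughput, which is why neither frequency nor IPC alone is a meaningful measure of performance.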
The most important recent innovation in x86 MPU design is the trace cache. Although the trace cache was originally created to attack the frequency scalability problem of parallel x86 instruction decoders, it also offers an intriguing way to reduce the power consumption of mobile x86 MPUs. A side benefit of the trace cache is that it allows high levels of parallelism in uop execution without requiring a concomitant degree of parallelism in the front-end x86 decoder.
This article described a hypothetical 0.18 um low power x86 processor called Cool_x86. It employs a number of performance enhancement and power reduction features, including a trace cache, a relatively short execution pipeline, improved branch prediction, and a large L2 cache. The estimated power, performance, and computational energy efficiency characteristics of Cool_x86 compare favorably to those of the recently introduced 1 GHz mobile Pentium III. It will be interesting to see how similar Cool_x86 is to the upcoming Intel ‘Banias’ core, an entirely new x86 processor core explicitly designed for low power and mobile applications. The imminent arrival of the AMD ‘Palomino’ will also demonstrate what power reduction measures are possible in an evolutionary design heavily leveraged from an existing core.