Digital Thermal Sensor, SSE4 and other Improvements
While each of the subsystems has been substantially improved in Merom, there are also some new technologies that improve the entire MPU rather than one particular section.
Merom includes an on-die digital thermal sensor, which has some rather interesting applications. The thermal sensor is used for safety and reliability features, as thermal diodes were previously. However, it is also rumored that it may be used to increase the frequency of the MPU when the sensor determines that there is thermal headroom. This is reminiscent of, but technically distinct from, Foxton, which is much more tightly integrated into the MPU and measures current rather than heat. If this technology is productized, it will probably be for Conroe and Woodcrest only. The notion of using more power and producing more heat in exchange for better performance runs counter to the goals for mobile MPUs; therefore, it is unlikely to end up being enabled for Merom.
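To make the rumored idea concrete, here is a minimal sketch of a policy that raises frequency only while the sensor reports thermal headroom. It is purely illustrative and is not a description of anything Intel has implemented: the function name, the temperature limit, the headroom margin and every frequency value are hypothetical numbers chosen for the example.

```python
# Hypothetical parameters; none of these values come from Intel.
BASE_MHZ = 2400      # assumed rated frequency
MAX_MHZ = 2800       # assumed ceiling for opportunistic boosting
STEP_MHZ = 200       # assumed frequency step
LIMIT_C = 85.0       # assumed temperature limit
HEADROOM_C = 10.0    # assumed margin below the limit required to boost


def next_frequency_mhz(current_mhz, temp_c):
    """Pick the next frequency from one thermal sensor reading."""
    if temp_c <= LIMIT_C - HEADROOM_C:
        # Comfortable headroom: step up, but never past the ceiling.
        return min(current_mhz + STEP_MHZ, MAX_MHZ)
    if temp_c >= LIMIT_C:
        # At or over the limit: step back down toward the rated frequency.
        return max(current_mhz - STEP_MHZ, BASE_MHZ)
    # In between: hold the current frequency.
    return current_mhz


freq = BASE_MHZ
for reading in [60.0, 70.0, 80.0, 86.0, 90.0]:
    freq = next_frequency_mhz(freq, reading)
    print(reading, freq)
```

The point of the sketch is only that the decision is driven by measured temperature; a Foxton-style scheme would instead react to measured current.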
Most of the MPU is heavily clock gated for more efficient operation. Each of the two cores is managed independently, and many entire blocks, such as the microcode sequencer, can be put to sleep. Most internal buses are also gated for power savings: if a bus is not carrying a full-width data load in a given cycle, the unused part of the bus can be put to sleep. For example, if the FPU bypass network were mostly working with 64-bit operands, the part of the network that sends and receives bits 64-127 could be turned off to save power. In 99.9% of all cases, this gating has no impact on performance. This ability to shut off parts of the chip was integral to the entire design philosophy. Normally, increasing the IPC of a design means higher power consumption, whether the extra resources are used or not. With extensive clock gating, the designers pay only for what is used, which makes a high IPC design much more attractive.
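The bypass network example can be turned into a rough back-of-the-envelope model. The sketch below assumes a 128-bit bus gated in two 64-bit halves and simply counts how many half-bus cycles need a clock for a given trace of operand widths. The function names, the gating granularity and the trace itself are assumptions made for illustration; real clock gating is done in hardware and the actual power savings depend on far more than this count.

```python
BUS_BITS = 128   # width of the modeled bus
LANE_BITS = 64   # assumed gating granularity (one "lane" = half the bus)
LANES = BUS_BITS // LANE_BITS


def clocked_lane_cycles(operand_widths):
    """Count lane-cycles that must be clocked for a trace of operand widths in bits."""
    total = 0
    for width in operand_widths:
        if width <= 0:
            continue  # idle cycle: the whole bus stays gated
        lanes_needed = -(-width // LANE_BITS)  # ceiling division
        total += min(lanes_needed, LANES)
    return total


def gating_savings(operand_widths):
    """Fraction of lane-cycles avoided compared with clocking the full bus every cycle."""
    ungated = LANES * len(operand_widths)
    if ungated == 0:
        return 0.0
    return 1 - clocked_lane_cycles(operand_widths) / ungated


# Hypothetical trace: mostly 64-bit operands, one 128-bit operand, one idle cycle.
trace = [64, 64, 128, 64, 0, 64, 64, 64]
print(gating_savings(trace))
```

A trace made up entirely of 128-bit operands yields no savings in this model, which mirrors the point in the text: the design pays only for the portion of the bus that is actually used.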
Merom also includes several new instructions, which were originally planned for Tejas. Tejas was a troubled super-pipelined design that was meant to succeed Prescott, but it was cancelled after its power consumption and heat dissipation problems became known. These new SSE4 instructions are not terribly interesting; there will be some performance gains, but nothing like the improvements from SSE2. One reason the performance gains will not be substantial is that most of the instructions are special purpose. There is also an architectural mismatch: Tejas was much more like the P4 than like Merom, so it makes sense that instructions designed for Tejas might not benefit Merom as much. However, Intel is working on extensions for Penryn, the 45nm follow-on to Merom, and it is rumored that these new instructions will be much more significant.