Article: Parallelism at HotPar 2010
By: someone (someone.delete@this.somewhere.com), August 7, 2010 2:35 pm
Room: Moderated Discussions
Mark Roulo (nothanks@xxx.com) on 8/6/10 wrote:
---------------------------
>Richard Cownie (tich@pobox.com) on 8/6/10 wrote:
>---------------------------
>>Also it seems to me that the biggest slowdown of per-clock
>>single-thread performance in Nehalem is the increase
>>in L1 latency from 3 cycles to 4 cycles. I don't see an
>>obvious reason why that helps throughput; my suspicion
>>is that it came from detailed experiments on cache design
>>and latency and clock speeds for the target process
>>(i.e. 32nm for Nehalem), showing that a 3-cycle L1 cache
>>in 32nm would limit clockspeed. So they moved to 4-cycle
>>L1 to avoid that bottleneck.
>>
>
>Wouldn't one reason to increase the L1 cache latency be to enable much faster clock
>speeds if necessary?
The integer data paths in Nehalem use static CMOS logic
forms, not self-timed dynamic circuits like those in P4 and C2D.
Even with the more balanced PFET/NFET drive in Intel's
45 nm and 32 nm processes there is a performance cost.
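
To put the clock-speed question quoted above in numbers, here is a minimal sketch. It assumes the L1 access takes a fixed amount of wall-clock time and is split evenly across pipeline cycles. The 1.0 ns access time and the function name are made up for illustration; they are not measured Nehalem figures.

```python
def max_clock_ghz(access_time_ns: float, cycles: int) -> float:
    """Highest clock (in GHz) at which an access that takes
    access_time_ns still completes within `cycles` clock cycles."""
    if access_time_ns <= 0 or cycles <= 0:
        raise ValueError("access_time_ns and cycles must be positive")
    # Each cycle may last at most access_time_ns / cycles nanoseconds,
    # and frequency in GHz is the reciprocal of the period in ns.
    return cycles / access_time_ns


# Hypothetical L1 access time, chosen only to make the arithmetic easy.
ACCESS_NS = 1.0

for cycles in (3, 4):
    ceiling = max_clock_ghz(ACCESS_NS, cycles)
    print(f"{cycles}-cycle L1: clock ceiling {ceiling:.2f} GHz")
```

Under these assumptions the extra cycle raises the clock ceiling set by the L1 by a third, at the cost of one more cycle on every load-to-use chain.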