Barcelona vs Core2

By: David Kanter, May 16, 2007 12:06 pm
Room: Moderated Discussions
Vincent Diepeveen on 5/16/07 wrote:
>David Kanter on 5/16/07 wrote:
>>Vincent Diepeveen on 5/13/07 wrote:
>>>If core2 can retire 4 uops per cycle and barcelona can retire 3 uops a cycle, as I
>>>understand it, then core2 can blow that barcelona core completely away. That's 33% faster speed.
>>Yes, and the Concorde can fly faster than a 787... oh wait, no, it doesn't fly anymore : )
>>I have yet to find any code with > 2 uops/cycle, so Intel's 4th issue/execute/retire
>>slot really doesn't help all that much. Nobody consistently has IPC=3...
>>It helps with draining queued-up uops, but it really isn't that important. What IPC does your code get on Core2 anyways?
>It really doesn't matter which of the software I wrote myself I look at; it all
>profits big time from a 4th execution unit.

I really don't think that's true.

>Your math model might need a more realistic approach.
>The total speed of your software is dominated by the average IPC you can get,
>yet consistently getting above 2.0 is totally unnecessary to already profit from the possibility of executing 4 a cycle.
>Let's show in a sample calculation how relevant your remark is that it should be 'consistently higher' than IPC 3.0.
>Let's use an example where 80% of a program's code cannot profit from moving
>from 3 to 4 integer units, which means that 20% does profit.
>That 20% gets a speedup of 33%.
>Your total program speedup then is:
>100% - 80% - (20% * 3 / 4) = 5%
>So even software that hardly needs a 4th unit can already easily get a speedup of 5% from it.
>In reality, however, many Intel instructions are dead slow, so the observed
>speedup on chips that can retire 4 instructions a cycle is far bigger than that 5%.
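
For reference, the quoted arithmetic can be written out as a small sketch. It assumes the 20% and 80% figures are fractions of run time (not of code size) and that the affected part scales perfectly with issue width; both are assumptions of the example, not measurements, and the function names are made up for illustration.

```python
def time_saved_fraction(affected_fraction: float, old_width: int, new_width: int) -> float:
    """Fraction of total run time saved when only `affected_fraction` of the
    run time benefits from going from `old_width` to `new_width` units.

    Assumes the affected part scales perfectly with width (hypothetical)."""
    unaffected = 1.0 - affected_fraction
    new_time = unaffected + affected_fraction * old_width / new_width
    return 1.0 - new_time


def overall_speedup(affected_fraction: float, old_width: int, new_width: int) -> float:
    """Old run time divided by new run time for the same scenario."""
    return 1.0 / (1.0 - time_saved_fraction(affected_fraction, old_width, new_width))


# The quoted example: 20% of the run time profits from 4 units instead of 3.
saved = time_saved_fraction(0.20, 3, 4)
speedup = overall_speedup(0.20, 3, 4)
print(f"run time saved: {saved:.1%}, overall speedup: {speedup:.3f}x")
```

Under these assumptions the quoted 5% is the reduction in run time, which corresponds to an overall speedup of roughly 1.05x. The result is only as good as the 20% input, which is exactly the number that has to come from a profile.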

You're just making up arbitrary numbers and guessing about this. What would convince me that a 4th pipeline really matters is someone profiling their software and telling me:

In 5% of all cycles, we retired 4 uops.

Again, what IPC does your code achieve? In looking at modern games, I have yet to see an IPC greater than 1.5.
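
A rough sketch of the kind of summary that would answer the question, assuming you can get a histogram of cycles by number of uops retired from your profiler. The histogram below is placeholder data chosen only to illustrate the calculation, not a measurement of any real program, and the function name is hypothetical.

```python
from typing import Dict, Tuple


def summarize_retirement(cycles_by_uops_retired: Dict[int, int]) -> Tuple[float, float]:
    """Return (average uops retired per cycle, fraction of cycles retiring 4 uops).

    `cycles_by_uops_retired` maps "uops retired in a cycle" to "number of
    cycles in which that happened"."""
    total_cycles = sum(cycles_by_uops_retired.values())
    if total_cycles == 0:
        raise ValueError("histogram contains no cycles")
    total_uops = sum(n * count for n, count in cycles_by_uops_retired.items())
    avg_uops_per_cycle = total_uops / total_cycles
    frac_four_wide = cycles_by_uops_retired.get(4, 0) / total_cycles
    return avg_uops_per_cycle, frac_four_wide


# Placeholder histogram (made-up numbers, NOT profiling data).
sample = {0: 500, 1: 200, 2: 150, 3: 100, 4: 50}
avg, frac4 = summarize_retirement(sample)
print(f"average uops/cycle: {avg:.2f}, cycles retiring 4 uops: {frac4:.1%}")
```

Note that this counts uops rather than x86 instructions, so the average is uops per cycle and not IPC in the strict sense.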

