Article: Nehalem Performance Preview
By: Michael S (already5chosen.delete@this.yahoo.com), April 11, 2009 5:12 pm
Room: Moderated Discussions
Vincent Diepeveen (diep@xs4all.nl) on 4/11/09 wrote:
---------------------------
>Michael S (already5chosen@yahoo.com) on 4/10/09 wrote:
>---------------------------
>>Vincent Diepeveen (diep@xs4all.nl) on 4/10/09 wrote:
>>---------------------------
>>>Jack (jumpingjack6@verizon.net) on 4/9/09 wrote:
>>>---------------------------
>>>>Vincent Diepeveen (diep@xs4all.nl) on 4/7/09 wrote:
>>>>
>>>>>
>>>>>How can you draw a conclusion about Shanghai, you haven't even compared it head on with Nehalem yourself.
>>>>>
>>>>>Vincent
>>>>>
>>>>
>>>>David did state, at the beginning of the article, that Shanghai would be
>>>>fairly characterized as slightly lagging Harpertown, in that it falls behind in some
>>>>cases, achieves parity in others, and has some strong points.
>>>>
>>>>Considering that is roughly a good assessment, then it can be extrapolated that Nehalem has opened up a wide margin.
>>>>
>>>>Nonetheless, you can search the databases yourself, the 5570 DP Xeon can range
>>>>anywhere from 1.5 to 2x faster than a Shanghai 2P (2.7 GHz). I have not found one
>>>>where Shanghai even comes close. This does not make Shanghai a bad CPU, but it
>>>>does make it tough for AMD to market Shanghai against Nehalem.
>>>
>>>Sales baloney, based upon a few cracked spec tests.
>>>
>>>Both are nearly identical processors in performance for the software we tried.
>>>
>>>Of course HT and turboboost turned off, and Shanghai a tad higher clocked than
>>>E versions of Xeon, gives Shanghai a slight edge in clockrate 2.53Ghz vs 2.7 shanghai.
>>>Of course with more powerbudget intel clocks higher.
>>>
>>
>>BTW, AMD submitted SPECjbb scores for a 2.9GHz Shanghai. In the past that was an
>>indication that the next lower clocked part, i.e. 2.8GHz, would soon be available in
>>the normal thermal envelope. So there is hope for a 2.8GHz 75W Shanghai coming.
>>
>>>If you look to the intel documents in what i7 can execute it is SSE2+ instructions
>>>a cycle max. That gives 8 flops as a max, with or without HT. Is that so much higher than AMD?
>>>
>>>Multiplication is not faster than at AMD in throughput, in fact if you try latencies
>>>of AMD are better, so a good programmer CAN be faster at AMD.
>>>
>>
>>Let's follow your own logic. Floating-point addition is faster (=has shorter latency)
>>on Intel. Should we conclude that "a good programmer CAN be faster at Intel"?
>
>There is 1 unit that is doing multiplication,
>there is a lot that can do addition.
>
>Addition has a latency of 0.5 cycle at intel so to speak and 0.33 cycle or so at
>AMD (i could be off by 0.17 or so as i checked the i7 handbooks quickly a while
>ago for all kind of stuff, not the AMD ones).
>
Oh, I forgot that you don't understand the difference between latency and throughput.
Sorry.
But I didn't know that you don't understand the difference between integer and floating-point addition :(
Or maybe you just plain don't know that both Intel and AMD processors have only one (admittedly, wide) FP_ADD unit - 3-clock latency on Intel, 4 on AMD?
>Multiplication is important for FFT
Do you happen to know that a radix-2 butterfly consists of 6 additions and 4 multiplications? Do you happen to know that each dependency chain in a radix-2 butterfly includes 1 multiplication and 2 additions?
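To make the count concrete, here is a minimal C sketch of one radix-2 butterfly written out in real arithmetic. The `cplx` struct and the `butterfly` function are illustrative names of my own, not taken from any FFT library.

```c
/* A complex value stored as two doubles (illustrative type). */
typedef struct { double re, im; } cplx;

/* One radix-2 butterfly: (a, b) -> (a + w*b, a - w*b). */
static void butterfly(cplx *a, cplx *b, cplx w)
{
    /* Complex multiply t = w * b: 4 multiplications, 2 additions. */
    double t_re = w.re * b->re - w.im * b->im;
    double t_im = w.re * b->im + w.im * b->re;

    /* Complex add and subtract: 4 more additions, 6 in total. */
    cplx sum  = { a->re + t_re, a->im + t_im };
    cplx diff = { a->re - t_re, a->im - t_im };

    *a = sum;
    *b = diff;
}
```

Follow any one of the four outputs back to its inputs and you pass through one multiplication, the addition that finishes the complex product, and the final addition or subtraction: 1 multiply and 2 adds per chain.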
>and matrix calculations.
Where only addition is part of a long dependency chain, so the unrolling requirements (# of accumulators) depend solely on the latency of FP addition.
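A rough C sketch of what that means for a dot product, the inner loop of most matrix kernels. The function names are made up for illustration, and the choice of four accumulators is an assumption: it is enough to cover a 3- or 4-clock add latency if the unit accepts one add per clock.

```c
#include <stddef.h>

/* One accumulator: every add has to wait for the previous add,
   so the loop runs at one element per FP-add latency. */
static double dot_1acc(const double *x, const double *y, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i] * y[i];
    return s;
}

/* Four independent accumulators: four add chains are in flight at once.
   The multiplies are not on any loop-carried chain, so their latency
   does not set the number of accumulators. */
static double dot_4acc(const double *x, const double *y, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }
    for (; i < n; i++)          /* leftover elements */
        s0 += x[i] * y[i];
    return (s0 + s1) + (s2 + s3);
}
```

Note that the two versions add in a different order, so the results can differ in the last bits for general floating-point data.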
>Adding goes rather
>quick. Enough units to do it. Just 1 for multiplication.
Why do you insist on talking about things you have zero clue about?