Article: Nehalem Performance Preview
By: Michael S (already5chosen.delete@this.yahoo.com), April 11, 2009 4:12 pm
Room: Moderated Discussions
Vincent Diepeveen (diep@xs4all.nl) on 4/11/09 wrote:
---------------------------
>Michael S (already5chosen@yahoo.com) on 4/10/09 wrote:
>---------------------------
>>Vincent Diepeveen (diep@xs4all.nl) on 4/10/09 wrote:
>>---------------------------
>>>Jack (jumpingjack6@verizon.net) on 4/9/09 wrote:
>>>---------------------------
>>>>Vincent Diepeveen (diep@xs4all.nl) on 4/7/09 wrote:
>>>>
>>>>>
>>>>>How can you draw a conclusion about Shanghai when you haven't even compared it head-on with Nehalem yourself?
>>>>>
>>>>>Vincent
>>>>>
>>>>
>>>>David did say, at the beginning of the article, that Shanghai would be
>>>>fairly characterized as slightly lagging Harpertown, in that it falls behind in some
>>>>cases, achieves parity in others, and has some strong points.
>>>>
>>>>Considering that this is roughly a good assessment, it can be extrapolated that Nehalem has opened up a wide margin.
>>>>
>>>>Nonetheless, you can search the databases yourself: the 5570 DP Xeon can range
>>>>anywhere from 1.5 to 2x faster than a Shanghai 2P (2.7 GHz). I have not found one
>>>>where Shanghai even comes close. This does not make Shanghai a bad CPU, but it
>>>>does make it tough for AMD to market Shanghai against Nehalem.
>>>
>>>Sales baloney, based upon a few cracked SPEC tests.
>>>
>>>Both are nearly identical processors in performance for the software we tried.
>>>
>>>Of course, with HT and Turbo Boost turned off, and with Shanghai clocked a tad higher than the
>>>E versions of Xeon, Shanghai has a slight edge in clock rate: 2.53 GHz vs. 2.7 GHz for Shanghai.
>>>Of course, with a larger power budget Intel clocks higher.
>>>
>>
>>BTW, AMD submitted SPECjbb scores for a 2.9GHz Shanghai. In the past that was an
>>indication that the next lower clocked part, i.e. 2.8GHz, would soon be available in
>>the normal thermal envelope. So there is hope for a 2.8GHz 75W Shanghai coming.
>>
>>>If you look at the Intel documents on what the i7 can execute, it is SSE2+ instructions
>>>a cycle max. That gives 8 flops as a max, with or without HT. Is that so much higher than AMD?
>>>
>>>Multiplication is not faster than on AMD in throughput; in fact, if you test it, the latencies
>>>on AMD are better, so a good programmer CAN be faster on AMD.
>>>
>>
>>Let's follow your own logic. Floating-point addition is faster (= has shorter latency)
>>on Intel. Should we conclude that "a good programmer CAN be faster on Intel"?
>
>There is 1 unit that is doing multiplication,
>there is a lot that can do addition.
>
>Addition has a latency of 0.5 cycle on Intel, so to speak, and 0.33 cycle or so on
>AMD (I could be off by 0.17 or so, as I checked the i7 handbooks quickly a while
>ago for all kinds of stuff, not the AMD ones).
>
Oh, I forgot that you don't understand the difference between latency and throughput.
Sorry.
But I didn't know that you don't understand the difference between integer and floating-point addition :(
Or maybe you just plain don't know that both Intel and AMD processors have only one (admittedly, wide) FP_ADD unit - 3-clock latency on Intel, 4 on AMD?
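To make the latency/throughput distinction concrete, here is a minimal cycle-count sketch in C. It assumes a single fully pipelined FP adder that can start one new addition per clock (an idealization, not a measured figure); the function names are made up for illustration, and the 3- and 4-clock latencies are the ones quoted above.

```c
/* Idealized model: one FP adder, fully pipelined, one new add per clock.
 * "latency" is the number of clocks before a result can be consumed. */

/* n additions where each one needs the previous result:
 * nothing overlaps, so the chain is bound by latency. */
unsigned long dependent_chain_cycles(unsigned long n, unsigned latency)
{
    return n * latency;
}

/* n additions with no dependencies between them:
 * one issues per clock, so after the pipeline fills the
 * stream is bound by throughput, not latency. */
unsigned long independent_cycles(unsigned long n, unsigned latency)
{
    return n == 0 ? 0 : (n - 1) + latency;
}
```

Under this model, 100 dependent additions cost 300 clocks at 3-clock latency and 400 at 4-clock latency, while 100 independent additions cost 102 and 103 clocks respectively. Latency only matters when the code cannot supply independent work.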
>Multiplication is important for FFT
Do you happen to know that a radix-2 butterfly consists of 6 additions and 4 multiplications? Do you happen to know that each dependency chain in a radix-2 butterfly includes 1 multiplication and 2 additions?
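For reference, a plain C sketch of one radix-2 (decimation-in-time) butterfly on complex values split into real and imaginary parts. The type and function names are illustrative only; the operation counts in the comments are the ones stated above.

```c
typedef struct { double re, im; } cplx;

/* Radix-2 butterfly: (a, b) -> (a + w*b, a - w*b), computed in place. */
void butterfly2(cplx *a, cplx *b, cplx w)
{
    /* t = w * b: 4 multiplications, 2 additions/subtractions */
    double tr = w.re * b->re - w.im * b->im;
    double ti = w.re * b->im + w.im * b->re;

    /* a + t and a - t: 4 more additions/subtractions */
    double ar = a->re, ai = a->im;
    a->re = ar + tr;
    a->im = ai + ti;
    b->re = ar - tr;
    b->im = ai - ti;
}
```

Each output follows the same path: one multiplication, one addition to form `tr` or `ti`, and one addition to combine with `a`. That gives 4 multiplications and 6 additions in total, with 1 multiplication and 2 additions on every dependency chain.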
>and matrix calculations.
Where only addition is part of a long dependency chain, so the unrolling requirements (# of accumulators) depend solely on the latency of FP addition.
>Adding goes rather
>quick. Enough units to do it. Just 1 for multiplication.
Why do you insist on talking about things you have zero clue about?