Article: Nehalem Performance Preview

By: Michael S (already5chosen.delete@this.yahoo.com), April 12, 2009 4:07 am

Room: Moderated Discussions

Vincent Diepeveen (diep@xs4all.nl) on 4/12/09 wrote:

---------------------------

>>

>>

>

>My qh library implements FFT for n bits FFT's total lossless, so without the usual

>FFT error that backtracks in floating point.

>

"Precise" FFT?

For any order other than 2 and 4 it sounds like yet another BS.

A "more precise than usual" FFT is possible, but you would need to do multiple-precision arithmetic. Regular 64-bit integers are no better precision-wise than 80-bit FP; they are just slower.
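
To make the precision comparison concrete, here is a minimal sketch in C. It assumes that `long double` maps to the x87 80-bit format (true for GCC on x86 Linux, not for every compiler or ABI); the helper names are mine and purely illustrative.

```c
#include <float.h>
#include <limits.h>
#include <stdint.h>

/* Significand width of long double in bits (meaningful when FLT_RADIX == 2).
 * Assumption: this is 64 where long double is the x87 80-bit format,
 * e.g. GCC on x86 Linux; other ABIs may report 53 or 113. */
static int long_double_significand_bits(void) {
    return LDBL_MANT_DIG;
}

/* Width of a plain 64-bit integer in bits. */
static int uint64_width_bits(void) {
    return (int)(sizeof(uint64_t) * CHAR_BIT);
}

/* Nonzero when long double can hold every 64-bit integer exactly,
 * i.e. when 64-bit integers carry no more significant bits than it does. */
static int long_double_covers_uint64(void) {
    return FLT_RADIX == 2 &&
           long_double_significand_bits() >= uint64_width_bits();
}
```

Where `long_double_covers_uint64()` returns nonzero, a 64-bit integer gives you nothing extra in significant bits, which is the point above.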

>Apologies, Can you remind us which FFT codes you implemented?

I haven't needed to implement an FFT on a PC since the x87 days.

Back then it was a rather normal floating-point FFT.

If I ever do it again, it would probably be a 4-way-interleaved or 8-way-interleaved kernel, i.e. 4 or 8 independent FFTs done at once. For small orders such a kernel should be measurably faster than the best available libraries that process channels one-by-one.
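
A rough sketch of what such an interleaved kernel could look like, in plain C. This is not code I have actually shipped: the layout, the function name `fft_interleaved` and the caller-supplied twiddle table are all assumptions made for illustration. Sample `i` of channel `c` is stored at index `i*LANES + c`, so the innermost loop touches `LANES` adjacent values and every twiddle factor is shared by all channels.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* number of independent FFTs processed together */

/* In-place radix-2 decimation-in-time FFT over LANES interleaved channels.
 * Sample i of channel c lives at re[i*LANES + c] / im[i*LANES + c].
 * n must be a power of two. tw_re/tw_im hold exp(-2*pi*i*k/n) for
 * k = 0 .. n/2-1, precomputed by the caller (hypothetical convention). */
static void fft_interleaved(float *re, float *im, size_t n,
                            const float *tw_re, const float *tw_im) {
    assert(n != 0 && (n & (n - 1)) == 0);

    /* Bit-reversal permutation; all LANES values of a sample move together. */
    for (size_t i = 1, j = 0; i < n; ++i) {
        size_t bit = n >> 1;
        for (; j & bit; bit >>= 1)
            j ^= bit;
        j ^= bit;
        if (i < j) {
            for (int c = 0; c < LANES; ++c) {
                float t = re[i * LANES + c];
                re[i * LANES + c] = re[j * LANES + c];
                re[j * LANES + c] = t;
                t = im[i * LANES + c];
                im[i * LANES + c] = im[j * LANES + c];
                im[j * LANES + c] = t;
            }
        }
    }

    /* Butterfly stages: len is the size of the sub-transforms being merged. */
    for (size_t len = 2; len <= n; len <<= 1) {
        size_t half = len / 2;
        size_t stride = n / len;  /* step through the twiddle table */
        for (size_t k = 0; k < half; ++k) {
            /* One twiddle factor serves all LANES channels. */
            float wr = tw_re[k * stride], wi = tw_im[k * stride];
            for (size_t base = 0; base < n; base += len) {
                float *ar = re + (base + k) * LANES;
                float *ai = im + (base + k) * LANES;
                float *br = re + (base + k + half) * LANES;
                float *bi = im + (base + k + half) * LANES;
                /* Contiguous lane loop: a candidate for one SIMD register. */
                for (int c = 0; c < LANES; ++c) {
                    float tr = br[c] * wr - bi[c] * wi;
                    float ti = br[c] * wi + bi[c] * wr;
                    br[c] = ar[c] - tr;
                    bi[c] = ai[c] - ti;
                    ar[c] += tr;
                    ai[c] += ti;
                }
            }
        }
    }
}
```

With `LANES` set to 4 and single-precision data, the lane loop lines up with one 128-bit SSE register. Whether a compiler vectorizes it as written is not guaranteed; a real kernel would more likely use intrinsics.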

Despite the fact that I haven't implemented an FFT for many years, I remember the details of the calculations involved very well.

>

>I do not see you post in math forums either about all this.

Can you imagine that I don't post on forums about 99% of what I am actually doing?

>

>Can you react onto the fact that i mentionned that there is a lot of execution

>units on the cpu that do addition,

There are a lot (if three is considered a lot) of EUs that do integer addition but only one EU that does floating-point addition. That's the same on AMD and Intel.

>and that multiplication is dead slow on intel compared to addition?

Integer multiplication is slower than integer addition because the required circuits are considerably more complex. A CPU that did integer multiplication and addition at equal speed would be horribly sub-optimal for nearly all workloads, even for those "heavy" on imul. That's why both Intel and AMD gave integer addition both higher throughput and lower latency than integer multiplication. But you should know all that, shouldn't you?

Now, relative to most of the other remaining "high-end" CPUs, integer multiplication on both Intel and AMD is not slow. It is rather fast, with the exception of the most complex and least common variant, the one that produces 128-bit results. And, with the exception of this particular variant, the Intel implementation is either faster than AMD's or equal to it.
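
To be clear about which variants I mean, here is a small C sketch. It assumes a compiler that provides `unsigned __int128` (GCC and Clang on 64-bit targets do); the function names are illustrative only, and which instruction the compiler actually emits for each is up to the compiler.

```c
#include <stdint.h>

/* 64x64 -> 64: the common variant, keeps only the low half of the product. */
static uint64_t mul_lo64(uint64_t a, uint64_t b) {
    return a * b;
}

/* 64x64 -> 128: the full-width variant. Returns the low 64 bits and
 * stores the high 64 bits through *hi.
 * Assumption: unsigned __int128 is a GCC/Clang extension, not standard C. */
static uint64_t mul_wide64(uint64_t a, uint64_t b, uint64_t *hi) {
    unsigned __int128 p = (unsigned __int128)a * b;
    *hi = (uint64_t)(p >> 64);
    return (uint64_t)p;
}
```

On x86-64 the first typically becomes a two-operand `imul`, while the second needs the one-operand widening multiply that writes a register pair, and that is the slower case I am talking about.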

>

>On the throughput speed of intel cpu's we can speak another time, it is not so fast.

>
