Article: Parallelism at HotPar 2010
By: Michael S (already5chosen.delete@this.yahoo.com), August 5, 2010 12:57 am
Room: Moderated Discussions
Richard Cownie (tich@pobox.com) on 8/4/10 wrote:
---------------------------
>Gabriele Svelto (gabriele.svelto@gmail.com) on 8/4/10 wrote:
>---------------------------
>>That's debatable, Nehalem doesn't seem to offer much improvement in per-core performance
>>over Core 2 (in my experience at least)
>
>I daresay there are particular examples for which that
>is true. But my own experience with a big app is
>completely the opposite: just a couple of weeks ago
>I ran the exact same executable on a Core2 Xeon 2.93GHz
>and a Nehalem Xeon 2.93GHz, and got 1.51x speedup.
>And these are still 45nm parts without the TurboBoost
>trick.
>
>That's a pure single-threaded app, so there's no benefit
>from the hyperthreading.
>
>It seems like a really big win. And I get the impression
>that most people see Nehalem that way.
>
>You're welcome to have your opinion, based on your own
>experience. But I don't think it matches what most
>people have measured.
>
In addition to the previous post: there are also significant differences between the various C2D/C2Q chipsets, especially in main memory latency.
The high-end server/workstation 5400 chipset, as well as its older brothers 5000P and 5000X, are the slowest.
Desktop chipsets are the fastest, especially the newest "extreme" X48.
The low-end/mainstream/HPC server 5100MCH chipset is in the middle.
All the performance comparisons in my post above are based on fast, although not the fastest, desktop chipsets. Quite possibly, had I measured a C2D on the throughput-is-king, latency-goes-to-hell 5000P chipset, the comparison would look more favorable to Nehalem.
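For anyone who wants to check the chipset latency differences on their own hardware, here is a minimal pointer-chasing sketch in C. It is not the method used for the comparisons above; the function names (`build_cycle`, `chase`, `ns_per_load`) and the small LCG are made up for illustration. The idea is that every load depends on the previous one, so the time per step approximates load-to-use latency rather than bandwidth.

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: small 64-bit LCG, so we do not depend on RAND_MAX. */
static uint64_t lcg_next(uint64_t *state) {
    *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
    return *state >> 33;
}

/* Build one random cycle over n slots (Sattolo-style shuffle).
   Swapping slot i only with a slot j < i guarantees a single cycle of
   length n, so the chase visits every slot before it repeats. */
size_t *build_cycle(size_t n, uint64_t seed) {
    if (n == 0) return NULL;
    size_t *next = malloc(n * sizeof *next);
    if (!next) return NULL;
    for (size_t i = 0; i < n; i++) next[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)(lcg_next(&seed) % i);   /* j in [0, i-1] */
        size_t t = next[i];
        next[i] = next[j];
        next[j] = t;
    }
    return next;
}

/* Follow the chain for `steps` dependent loads, starting at slot 0. */
size_t chase(const size_t *next, size_t steps) {
    size_t p = 0;
    for (size_t i = 0; i < steps; i++) p = next[p];
    return p;
}

/* Average CPU time per dependent load, in nanoseconds.
   clock() is coarse, so `steps` should be in the tens of millions. */
double ns_per_load(const size_t *next, size_t steps) {
    clock_t t0 = clock();
    volatile size_t sink = chase(next, steps);   /* keep the loop alive */
    clock_t t1 = clock();
    (void)sink;
    return 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC / (double)steps;
}
```

To approximate main memory latency, the array has to be much larger than the last-level cache (for example, a few hundred MB); the result then also includes TLB miss cost, so treat the number as a rough comparison between machines, not an exact DRAM latency.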