Article: Parallelism at HotPar 2010
By: Gabriele Svelto (gabriele.svelto.delete@this.gmail.com), August 5, 2010 6:00 am
Room: Moderated Discussions
Richard Cownie (tich@pobox.com) on 8/5/10 wrote:
---------------------------
>So your position is that Nehalem doesn't offer a lot more
>per-clock performance than Penryn, except when you're using
>the memory heavily and/or exploiting the extra threads
>and extra L2 cache bandwidth, or doing something else
>that Nehalem is good at ? That's so vague as to be hardly
>worth arguing with. AFAIK Nehalem makes *everything* go
>faster, and makes a lot of apps go a *lot* faster.
The topic was single-threaded execution; that seems pretty clear to me. Sure, Nehalem makes a *lot* of scalable, parallel applications go faster, but that's not the point I was debating. If you care about single-threaded execution, it offers little or no improvement over Penryn, because Intel deliberately chose to focus on throughput, which was the point of my first post. And BTW, the extra bandwidth usually doesn't buy you much, as very few single-threaded workloads are bottlenecked by bandwidth anyway.