Article: Parallelism at HotPar 2010
By: Richard Cownie (tich.delete@this.pobox.com), August 5, 2010 7:10 am
Room: Moderated Discussions
Gabriele Svelto (gabriele.svelto@gmail.com) on 8/5/10 wrote:
---------------------------
>The topic was single-threaded execution, it seems pretty clear to me. So sure,
>Nehalem makes a *lot* of scalable, parallel applications go faster but that's not
>the point I was debating. If you happen to care about single-threaded execution
>it offers little or no improvement over Penryn
If you're single-threaded, then you get the benefit of
the higher clock speed with TurboBoost. So you win
that way.
It seems to me that Nehalem really covers all the bases
quite well.
>because Intel deliberately decided
>to focus on throughput which was the point of my first post. And BTW the extra bandwidth
>usually doesn't buy you much as there are very few single-threaded workloads which are bottlenecked by bandwidth anyway.
There are a lot of single-threaded apps which benefit
from the much lower latency of DRAM accesses in Nehalem
systems. And didn't they reduce the L2 latency as well?
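To make the latency point concrete, here is a rough sketch of the kind of single-threaded access pattern I mean: a pointer chase, where each load depends on the result of the previous one, so extra bandwidth can't help and only latency matters. The function names (make_cycle, chase, ns_per_step) and the sizes are my own for illustration, not taken from any benchmark suite. In Python the interpreter overhead swamps most of the hardware effect, so treat this as a picture of the pattern rather than a measurement; you'd want to port it to C to get meaningful numbers.

```python
import random
import time


def make_cycle(n, seed=0):
    """Build a random permutation of range(n) that forms one single cycle
    (Sattolo's algorithm), so a chase visits every slot before repeating."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i strictly, which guarantees a single cycle
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps, start=0):
    """Follow the chain; every lookup depends on the previous lookup's result."""
    p = start
    for _ in range(steps):
        p = nxt[p]
    return p


def ns_per_step(n, steps):
    """Average wall-clock time per dependent lookup, in nanoseconds."""
    nxt = make_cycle(n)
    t0 = time.perf_counter()
    chase(nxt, steps)
    return (time.perf_counter() - t0) * 1e9 / steps


if __name__ == "__main__":
    # Hypothetical working-set sizes: small enough for cache, then larger.
    for n in (1 << 10, 1 << 16, 1 << 20):
        print(n, "entries:", round(ns_per_step(n, 1_000_000), 1), "ns/step")
```

The random single cycle is there to defeat the hardware prefetchers; a sequential walk over the same array would mostly measure bandwidth instead.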
If you've got a particular example in mind of an app
that goes better on Core2 than on Nehalem, tell us
what it is and give us some figures. Otherwise there's
not much to this nitpicking.