By: Paul A. Clayton (paaronclayton.delete@this.gmail.com), July 27, 2012 10:31 am
Room: Moderated Discussions
jp (jipe4153.delete@this.gmail.com) on July 27, 2012 9:47 am wrote:
> aaron spink (aaronspink.delete@this.notearthlink.net) on July 27, 2012 9:36 am
> wrote:
[snip]
>> GPUs have high local bandwidth but rather poor global
>> bandwidth. Not to mention rather limited capacity.
>
> Global bandwidth on GPUs is much higher than on any other CPU ( 5-6 times
> higher), you seem to have no backing for your claim.
Are there GPUs with more than 16 lanes of PCIe?
One 3.2 GHz QPI link might have raw bandwidth comparable to 16 lanes of PCIe 3.0 (6.4 GT/s * 20 lanes vs. 8 GT/s * 16 lanes), and the higher-end Intel processors have more than one QPI link.
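To make the arithmetic behind that comparison explicit, here is a rough sketch that only multiplies transfer rate by lane count, per direction. The function name is made up for illustration, and the calculation deliberately ignores encoding and protocol overhead (for example, PCIe 3.0 uses 128b/130b encoding), so delivered data bandwidth would be lower than these raw figures on both links.

```python
def raw_gbit_per_s(transfer_rate_gt: float, lanes: int) -> float:
    """Raw per-direction signalling rate in Gbit/s.

    Assumes one bit per lane per transfer and ignores all
    encoding and protocol overhead (illustrative helper only).
    """
    return transfer_rate_gt * lanes


# Figures taken from the post: 6.4 GT/s * 20 lanes vs. 8 GT/s * 16 lanes.
qpi_raw = raw_gbit_per_s(6.4, 20)
pcie3_x16_raw = raw_gbit_per_s(8.0, 16)

for name, gbit in (("QPI link (6.4 GT/s x 20)", qpi_raw),
                   ("PCIe 3.0 x16 (8 GT/s x 16)", pcie3_x16_raw)):
    # Divide by 8 to convert Gbit/s to GB/s.
    print(f"{name}: {gbit:.1f} Gbit/s raw = {gbit / 8:.1f} GB/s raw")
```

Both products come out at about 128 Gbit/s raw per direction, which is why a single QPI link and a 16-lane PCIe 3.0 slot look comparable before overhead is counted.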
I assume you call "global" what Aaron Spink calls "local", perhaps using the OpenCL terminology of global memory and local memory. Aaron Spink comes from a CPU/NUMA system background (IIRC he worked on Alpha EV7 coherence), where global memory refers to the entire system memory and local memory refers to the memory attached directly to a single node.



