Article: Parallelism at HotPar 2010
By: Gabriele Svelto (gabriele.svelto.delete@this.gmail.com), August 3, 2010 12:10 am
Room: Moderated Discussions
Richard Cownie (tich@pobox.com) on 8/2/10 wrote:
---------------------------
>What *might* break this cycle is the rise of powerful
>integrated GPUs in Core i5, SandyBridge, and Ontario/
>Llano. It's much more interesting to optimize your
>app for hardware that 80% of users will have, than to
>optimize for hardware that only 40% will have. But
>that only applies if the integrated GPUs really do go
>faster than the CPUs for a wide-enough range of apps.
---------------------------
For GPGPU computing I find integrated solutions more interesting than discrete GPUs. With an IGP you essentially get a wide vector processor attached to the same memory pool as the CPU, which means fast communication and little overhead (with the appropriate abstractions you might even be able to reuse the same memory buffers without copying them around). That lower access overhead would open the hardware up to a much wider range of problems than today, where you need a large speed-up over the CPU just to amortize the cost of copying data over to the GPU and back to main memory.
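To make the zero-copy idea concrete, here is a minimal sketch assuming CUDA's mapped host memory as the abstraction (the scale() kernel and buffer names are just illustrative): the device operates directly on a host-allocated buffer, with no explicit copy to or from device memory. On a discrete card such accesses still cross PCIe, but on an integrated part sharing the main memory pool the same programming model would simply hit ordinary memory.

#include <stdio.h>
#include <cuda_runtime.h>

/* Illustrative kernel: scales a vector in place. */
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main(void)
{
    const int n = 1 << 20;
    float *host_buf = NULL;
    float *dev_view = NULL;

    /* Allow the device to map page-locked host memory. */
    cudaSetDeviceFlags(cudaDeviceMapHost);

    /* Allocate a host buffer the GPU can address directly:
       no cudaMemcpy to or from device memory is needed. */
    cudaHostAlloc((void **)&host_buf, n * sizeof(float), cudaHostAllocMapped);
    for (int i = 0; i < n; i++)
        host_buf[i] = (float)i;

    /* Get the device-side alias of the very same buffer. */
    cudaHostGetDevicePointer((void **)&dev_view, host_buf, 0);

    /* The kernel reads and writes the host buffer in place. */
    scale<<<(n + 255) / 256, 256>>>(dev_view, 2.0f, n);
    cudaDeviceSynchronize();

    printf("host_buf[42] = %f\n", host_buf[42]);
    cudaFreeHost(host_buf);
    return 0;
}

Once an IGP sits behind an interface like this, the break-even point shifts: even kernels with modest speed-ups become worth offloading because there is no transfer cost to amortize.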