By: Gabriele Svelto (gabriele.svelto.delete@this.gmail.com), January 23, 2017 2:03 am
Room: Moderated Discussions
RichardC (tich.delete@this.pobox.com) on January 22, 2017 9:45 pm wrote:
> And I'm also assuming that such a machine would be optimized for embarrassingly-parallel
> apps which work ok with a large number of small(ish)-DRAM nodes, e.g. 8-16GB per node.
> If your problem has unfavorable communication/compute ratio when split across many small
> nodes, a smaller number of large-DRAM x86's is better. But I think CFD is a niche where
> the flock-of-chickens approach can work.
If a workload is amenable to the flock-of-chickens approach, then there's a good chance it also runs fine on GPUs (unless it has wildly divergent control flow).