By: anon (anon.delete@this.anon.com), July 2, 2013 5:47 pm
Room: Moderated Discussions
Patrick Chase (patrickjchase.delete@this.gmail.com) on July 2, 2013 4:43 pm wrote:
> anon (anon.delete@this.anon.com) on July 2, 2013 4:12 pm wrote:
> > Patrick Chase (patrickjchase.delete@this.gmail.com) on July 2, 2013 10:03 am wrote:
> > > anon (anon.delete@this.anon.com) on July 2, 2013 7:13 am wrote in reference to GPUs:
> > > > Not due to this high level stuff, because the hardware itself is more efficient.
> > >
> > > No, it is not. It's simply optimized to do different things.
> >
> > I'm talking about raw ability to do floating point operations.
>
> Ad absurdum indeed :-)
Correct.
>
> > > A CPU is much
> > > more efficient than a GPU on many "irregular" and/or iterative workloads.
> > >
> > > > In terms of hardware, I don't know exactly.
> > >
> > > Wow, that doesn't seem to prevent you from having strong opinions on the topic.
> >
> > The numbers I use are data from the Green500 list. I have the opinion that GPUs and
> > vector-oriented architectures are more efficient than short-SIMD GP CPUs, for this
> > workload.
>
> Green500 uses Linpack. That's about as GPU-friendly as it gets. It's
> arguably not even a representative workload for supercomputing.
Yes, so it's quite good for testing the efficiency of simple flops with at least somewhat realistic memory access.
>
> > > You and Etienne are both off base, but Etienne is at least on the right path. If you're
> > > actually interested in learning then take a look through this presentation:
> > >
> > > http://s08.idav.ucdavis.edu/fatahalian-gpu-architecture.pdf
> > >
> > > It's fairly dated (i.e. the numbers are hilariously outdated in
> > > some cases) but the concepts are presented correctly.
> >
> > This says nothing about whether GPU design will be more efficient than CPU design.
>
> It actually says quite a lot about that, for anybody with a basic understanding of microarchitecture.
>
> What it says is that GPUs are optimized for, and extremely efficient at, tasks with a very large number of
> independent, mostly-similar (low code divergence) work items.
Everyone knows that. What it does not say is *why* these structures and approaches are more efficient. Latency tolerance allowing lower clocks is the obvious one, and I already mentioned it. I'm sure there are many others, from low-level circuit design to microarchitecture and uncore, that I don't know about.