By: juanrga (nospam.delete@this.juanrga.com), January 10, 2015 8:21 pm
Room: Moderated Discussions
Exophase (exophase.delete@this.gmail.com) on January 10, 2015 6:49 pm wrote:
> juanrga (nospam.delete@this.juanrga.com) on January 10, 2015 6:32 pm wrote:
> > Any GPU that I know fits the manycore definition.
>
> There is no hard manycore definition so you can pretty much say what you want. But it's not
> useful to consider GPUs or Xeon Phi as the same class of devices as something like Parallella
> that scales to hundreds (or thousands) of fully independent, more or less classical small scalar
> in-order cores. Cores with very wide vector processors and many in-flight threads are better
> at a different set of problems than big sets of small independent cores are.
>
> But the worst thing is when people call vector lanes individual cores.
>
> Yes, GPUs and Xeon Phi do still tend to have a few times or even an order of magnitude more cores than the
> heavily throughput-optimized alternatives like Haswell or Power8. That's true even normalizing for the die
> area difference between the highest-end desktop GPUs and highest-end desktop CPUs. But that doesn't mean they
> were optimizing for core count. More likely the core size simply ended up where it did as a natural result of scaling
> a useful DLP-optimized design to a comfortable point before hitting diminishing returns.
>
There are different kinds of manycores: some are specialized for a narrow subset of applications, whereas others are more general-purpose; some are closer to multicores than others; some use scalar cores, others use 3-way VLIW cores; there are homogeneous manycores and heterogeneous manycores...
The class is very rich in its diversity, but this is no different from anything else: there are different kinds of CPUs, of cars, of people...
Traditional GPUs from Nvidia/AMD differ in several ways from Parallella, but TBR (Tile-Based Rendering) GPUs should be a bit closer.
Manycores are not defined by core count, but by the kind of compute optimization. The higher core count in GPUs and Phi compared to multicores like Haswell or Power8 is a result, a byproduct, of that microarchitectural optimization.
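The lane-vs-core distinction behind this argument can be sketched in a few lines of plain Python (not any vendor's API; the function names are invented for illustration). SIMD lanes share a single instruction stream, so on a divergent branch every lane effectively pays for both paths and a mask selects the result, whereas independent cores each follow their own control flow and execute only one path per element:

```python
def simd_branch(values):
    """Lockstep SIMD model: all lanes evaluate BOTH sides of the branch;
    a per-lane mask picks the result. Divergence wastes work, which is
    one reason a vector lane is not an independent core."""
    mask = [v > 0 for v in values]
    then_side = [v * 2 for v in values]    # every lane computes this...
    else_side = [v + 100 for v in values]  # ...and this
    return [t if m else e for m, t, e in zip(mask, then_side, else_side)]

def independent_cores(values):
    """Independent-core model: each element's 'core' runs only the side
    of the branch its own control flow takes."""
    out = []
    for v in values:  # stand-in for per-core execution
        if v > 0:
            out.append(v * 2)
        else:
            out.append(v + 100)
    return out

print(simd_branch([3, -1, 5, -2]))        # [6, 99, 10, 98]
print(independent_cores([3, -1, 5, -2]))  # [6, 99, 10, 98]
```

Both produce the same answer; the difference is in how much work was issued, which is why wide-vector designs and large arrays of small independent cores suit different problem sets.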