By: Doug S (foo.delete@this.bar.bar), August 28, 2022 10:17 am
Room: Moderated Discussions
Eric Fink (eric.delete@this.anon.com) on August 28, 2022 1:22 am wrote:
> Kara (karaardalan.delete@this.gmail.com) on August 27, 2022 1:00 pm wrote:
> > Good thing GPUs don't have a bpu.
>
> They will need something similar anyway. Just instead of predicting branches you'd need to predict/schedule/reorder
> threads based on their memory access patterns. That's the key to high-performant, efficient raytracing
> on the GPU. Doesn't Nvidia already use something like that in their RT approach?
In addition, GPUs solve different problems. GPUs don't run operating systems, GUIs, compilers, databases and similar codebases that have lots of twisty code with random branches and short average loop lengths.
I don't know what branching looks like for rendering, but when a GPU is used for GPGPU work, the number-crunching code it runs typically has long loops, where simply predicting every branch as taken wouldn't give all that bad a result. The gain from a state-of-the-art branch predictor may not be worth the cost in die area and power, at least for GPGPU, versus spending that budget on more cores and greater parallelism.
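To illustrate the point, here's a minimal sketch of the kind of loop I mean (a hypothetical CUDA grid-stride SAXPY kernel, just for illustration, not anything from an actual GPU workload discussed here):

__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;

    // Grid-stride loop: for large n, the branch back to the top of the loop
    // is taken on nearly every iteration and falls through only once at exit,
    // so a static "always taken" guess mispredicts roughly once per thread.
    for (int j = i; j < n; j += stride) {
        y[j] = a * x[j] + y[j];
    }
}

With a loop like that, the payoff from a sophisticated predictor is tiny compared to the twisty, short-loop branch behavior you see in OS or compiler code.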