By: juanrga (nospam.delete@this.juanrga.com), October 30, 2015 1:00 pm
Room: Moderated Discussions
lurker (lurker9000.delete@this.realemail.mail) on October 30, 2015 5:56 am wrote:
> dmcq (dmcq.delete@this.fano.co.uk) on October 30, 2015 5:12 am wrote:
> > I think 256 bits would be better as you can do four double precision operations at once and that is quite
> > common. On the other hand with four SIMD units instead one could merge two operations to give an effective
> > two by 256 bit units except for some special operations.
>
> 256 bits would be ideal, but I don't think 128 bits is going to be terrible for general
> applications. A lot of software doesn't even use AVX instructions. And as you said the
> four pipes can merge to execute 256 bit ops so at least that's something.
>
> > For anything larger they'd probably be better
> > off relying on GPUs I think if they can get the coherence and message passing working well. I can see
> > how to save larger register sets without impacting interrupt handling too badly but it seems a lot of
> > work when ARM is probably hoping to move 64 bit ARM into the embedded processor market.
>
> That reminds me, isn't AMD working on an HPC APU? Perhaps they decided that more intense workloads
> will be offloaded to the GPU anyway so they didn't bother that much with beefier FPU?
There are difficulties with offloading computations. Moreover, some time ago engineers shared a concept for an HPC APU in which two 256-bit FMA units could be fused into a 512-bit unit.
The fact that AMD Zen provides 256-bit AVX compatibility by fusing 2x 128-bit units shows that AMD considers 256-bit software relevant enough to support.
The real reason AMD Zen is stuck with 128-bit units could be cache/memory bottlenecks, Skybridge (pin and internal compatibility of both K12 and Zen), or simply saving die space.
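As a rough illustration of the lane arithmetic discussed in this thread, here is a minimal Python sketch. It only models the data layout: how many double-precision values fit in a given vector width, and how a 256-bit add can be expressed as two 128-bit halves. The function names (`lanes`, `add_128`, `add_256_split`) are hypothetical, and nothing here describes how Zen actually schedules or fuses its units.

```python
DOUBLE_BITS = 64  # width of one double-precision value


def lanes(width_bits, element_bits=DOUBLE_BITS):
    """Number of elements that fit in a vector of the given width."""
    return width_bits // element_bits


def add_128(a, b):
    """Model of one 128-bit unit: adds two double-precision lanes."""
    assert len(a) == len(b) == lanes(128)
    return [x + y for x, y in zip(a, b)]


def add_256_split(a, b):
    """Model of a 256-bit add carried out as two 128-bit halves."""
    assert len(a) == len(b) == lanes(256)
    half = lanes(128)
    low = add_128(a[:half], b[:half])    # lower 128 bits
    high = add_128(a[half:], b[half:])   # upper 128 bits
    return low + high


if __name__ == "__main__":
    # Doubles per vector at 128, 256 and 512 bits
    print(lanes(128), lanes(256), lanes(512))
    print(add_256_split([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]))
```

The result of the split add is the same as that of a single 256-bit operation; what differs between designs is how many units are occupied to produce it, which is the throughput question being debated above.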