By: dmcq (dmcq.delete@this.fano.co.uk), May 13, 2022 4:24 pm
Room: Moderated Discussions
Doug S (foo.delete@this.bar.bar) on May 13, 2022 10:22 am wrote:
> --- (---.delete@this.redheron.com) on May 13, 2022 9:33 am wrote:
> > NEON (built into each core) uses 128b registers, and there are 4 NEON engines (so, in a handwaving
> > sense) 512b of NEON capability per cycle per P core, 256b of NEON capability per cycle per E core.
> > SVE2 is *probably* coming to Apple this year with A16 and M2, and will *probably*
> > feature 256b wide registers. But if you are evaluating SVE2 based on register
> > width, you're misunderstanding where the value of SVE2 lies.
>
>
> I'm curious why you expect to see SVE2 added to Apple's cores? What would be the benefit, when there is already
> NEON and AMX? Or to reverse the question, what do you think Apple would be losing by not having SVE2?
>
> If ARM plans to eventually make SVE2 a required part of a future iteration of ARMv9 rather
> than optional, it would make sense to add it sooner rather than later. That goes double
> if ARM plans to someday deprecate NEON and make it optional in the future.
>
> If SVE2 is going to remain optional for the foreseeable future, and NEON mandatory,
> it seems like Apple would be better off putting their resources into AMX.
The real question is whether they would have SVE2 wider than 128 bits. At 128 bits, the extra hardware required on top of the existing Neon hardware would be fairly small. The main additions are the decoding, the predicate registers, masking, and the handling of the first-fault register. None of that would tax the Apple designers much; it's more a question of priorities than anything else. If they think 4 concurrent Neon operations are worthwhile, I'd have thought they'd go for 256-bit SVE2, to compete with AVX if nothing else.
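
To make the predicate/masking point concrete, here is a rough software model of what an SVE-style predicated loop does. It is only a sketch under my own assumptions: the function names and the `vl_bits`/`elem_bits` parameters are made up for illustration, and it says nothing about how real hardware (Apple's or anyone else's) would implement it. The idea is that a WHILELT-style predicate switches off the lanes past the end of the array, so the same loop works at 128 or 256 bits with no scalar tail.

```python
def whilelt(i, n, lanes):
    """Model of an SVE-style WHILELT: lane k is active while i + k < n."""
    return [i + k < n for k in range(lanes)]


def predicated_add(a, b, vl_bits=128, elem_bits=32):
    """Add two equal-length lists one 'vector' at a time.

    vl_bits and elem_bits are illustrative parameters (assumptions);
    the loop body never hard-codes the vector length.
    """
    lanes = vl_bits // elem_bits
    n = len(a)
    out = [0] * n
    i = 0
    while i < n:
        pred = whilelt(i, n, lanes)  # mask for this iteration
        for k, active in enumerate(pred):
            if active:  # inactive lanes are never read or written
                out[i + k] = a[i + k] + b[i + k]
        i += lanes
    return out


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5]
    ys = [10, 20, 30, 40, 50]
    print(predicated_add(xs, ys, vl_bits=128))
    print(predicated_add(xs, ys, vl_bits=256))
```

Both calls give the same result; only the number of loop iterations changes with the modelled vector length, which is why the register width matters less to the programmer than the predication does.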