By: Doug S (foo.delete@this.bar.bar), May 13, 2022 9:28 pm
Room: Moderated Discussions
--- (---.delete@this.redheron.com) on May 13, 2022 2:05 pm wrote:
> Doug S (foo.delete@this.bar.bar) on May 13, 2022 10:22 am wrote:
> > --- (---.delete@this.redheron.com) on May 13, 2022 9:33 am wrote:
> > > NEON (built into each core) uses 128b registers, and there are 4 NEON engines (so, in a handwaving
> > > sense) 512b of NEON capability per cycle per P core, 256b of NEON capability per cycle per E core.
> > > SVE2 is *probably* coming to Apple this year with A16 and M2, and will *probably*
> > > feature 256b wide registers. But if you are evaluating SVE2 based on register
> > > width, you're misunderstanding where the value of SVE2 lies.
> >
> >
> > I'm curious why you expect to see SVE2 added to Apple's cores?
> > What would be the benefit, when there is already
> > NEON and AMX? Or to reverse the question, what do you think Apple would be losing by not having SVE2?
> >
> > If ARM plans to eventually make SVE2 a required part of a future iteration of ARMv9 rather
> > than optional, it would make sense to add it sooner rather than later. That goes double
> > if ARM plans to someday deprecate NEON and make it optional in the future.
> >
> > If SVE2 is going to remain optional for the foreseeable future, and NEON mandatory,
> > it seems like Apple would be better off putting their resources into AMX.
>
> As I have said a million times (but no-one pays any attention), SVE2 is not about wide
> vectors, it is about being a better compiler target for "general" loops.
> It will make Apple's CPUs faster, because it will allow more code to be vectorized,
> and vectorized at lower overhead, and that's why they will add it.
>
> AMX and SVE2 solve very different problems -- as I said in the first post.
What about NEON, which solves the exact same problem? SVE2 allows wider vectors, but unless you actually ship significantly wider vectors, what's the difference between e.g. 2x256b SVE2 and 4x128b NEON? Yes, SVE2 is where the development is taking place now, but most of the new instructions are 'AI'-related stuff that Apple supports via the NPU.
Now, if they plan to offer 512-bit-wide SVE2 on the M2 Max for the higher-end stuff while keeping it at a more reasonable 128 or 256 bits for phones, tablets, and lower-end Macs, maybe it makes sense.
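To make the two positions concrete, here is a rough sketch in plain portable C (no real NEON or SVE2 intrinsics). Everything in it is an assumption for illustration: `peak_bits`, `add_fixed_width`, `add_predicated`, and the `LANES` constant are hypothetical names, and the inner lane loops only stand in for a single vector instruction. It shows that 2x256b and 4x128b come to the same raw width per cycle, and that the structural difference being argued about is the loop shape: a fixed-width loop needs a scalar tail, while a predicated loop masks off the lanes past the end and works unchanged for any vector length.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4 /* assumption: 4 x 32-bit lanes, i.e. one 128-bit vector */

/* Raw vector width per cycle: number of units times register width in bits. */
static unsigned peak_bits(unsigned units, unsigned width_bits) {
    return units * width_bits;
}

/* Fixed-width style: whole vectors in the main loop, scalar loop for leftovers. */
static void add_fixed_width(const int *a, const int *b, int *out, size_t n) {
    size_t i = 0;
    for (; i + LANES <= n; i += LANES) {
        for (size_t l = 0; l < LANES; l++) { /* stands in for one vector add */
            out[i + l] = a[i + l] + b[i + l];
        }
    }
    for (; i < n; i++) { /* scalar tail for the last n % LANES elements */
        out[i] = a[i] + b[i];
    }
}

/* Predicated style: a single loop; a per-lane mask disables lanes past n.
   The lane count is a run-time parameter, so the same code serves any width. */
static void add_predicated(const int *a, const int *b, int *out,
                           size_t n, size_t lanes) {
    assert(lanes > 0); /* a zero lane count would never advance the loop */
    for (size_t i = 0; i < n; i += lanes) {
        for (size_t l = 0; l < lanes; l++) {
            int active = (i + l) < n; /* emulates a "while less than" predicate */
            if (active) {
                out[i + l] = a[i + l] + b[i + l];
            }
        }
    }
}
```

Calling `add_predicated` with `lanes` set to 4 or to 8 gives the same result as `add_fixed_width`, and `peak_bits(2, 256)` equals `peak_bits(4, 128)`. So on raw width the two configurations are a wash; whether dropping the scalar tail and the hard-coded width is worth the silicon is the actual question in this thread.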