By: Michael S (already5chosen.delete@this.yahoo.com), May 23, 2022 1:04 am
Room: Moderated Discussions
Doug S (foo.delete@this.bar.bar) on May 22, 2022 8:46 pm wrote:
> --- (---.delete@this.redheron.com) on May 22, 2022 1:59 pm wrote:
> > Doug S (foo.delete@this.bar.bar) on May 22, 2022 11:01 am wrote:
> > > --- (---.delete@this.redheron.com) on May 21, 2022 6:59 pm wrote:
> > > > Maybe yes, maybe no, the devil is in the details.
> > > > JPEG XL has an option for ANS as the entropy coding and Apple's
> > > > version of NEON has ANS accelerator instructions...
> > > > https://patents.google.com/patent/US20210072994A1
> > >
> > >
> > > Apple's implementation of NEON has its own instructions that ARM hasn't defined?
> >
> > Why not?
> > There's a whole range of instruction encodings that can be used by
> > licensees as they wish. Apple uses it (as far as is known) for
> > - AMX
> >
> > - LZ engine (conceptually like AMX, associated with a cluster). Not as high
> > compression as SW LZ, but a lot faster, so used under conditions of either
> > + extreme memory pressure (so the priority is more on fast page encode than on 10% smaller pages)
> > + low power conditions (obv more relevant for mobile devices)
> >
> > - these NEON ANS extensions.
>
>
> I knew there were ranges of encodings that could be used as licensees wanted for new stuff like
> AMX, I just hadn't realized Apple was also extending existing facilities like NEON. Seems like
> that could get complicated if ARM extended NEON with similar but not quite identical facilities
> and Apple ended up forced to support two opcodes that do almost but not quite the same thing.
>
> Though I guess I've been operating under the assumption that since the release of SVE, ARM has not and does not
> plan to further extend NEON.
SVE was included as an option in Armv8.2-A
Complex Numbers support (addition to NEON) is part of Armv8.3-A
SHA3 and SHA512 support (addition to NEON) is part of Armv8.4-A
> Is that the case, NEON pretty much "done" at this time as far as future changes from
> ARM?
Looks like NEON was not 'done' until v9. Hard to predict how things are going to be under v9.
> If so, Apple extending NEON might indicate they have no plans to implement SVE2 anytime soon - since extending
> SVE2 could, as above, potentially be more problematic if ARM is simultaneously extending it themselves.
SVE occupies 6% of ARM's total op-code space. Plenty of room here.
For comparison, AVX512 (EVEX) has to live with ~0.1% of the iAMD64 op-code space (10 fixed bits in the first 2 bytes).
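
A rough sketch of the arithmetic behind those two percentages, for anyone who wants to check it. It assumes a fixed bit pattern of n bits claims 2^-n of the encoding space. The 10 fixed bits for EVEX are the figure given above; the 4 fixed bits for SVE are my assumption (one value of a 4-bit top-level field in a 32-bit instruction word), chosen because 1/16 matches the ~6% figure. The function name is made up for illustration.

```python
def opcode_space_fraction(fixed_bits: int) -> float:
    """Share of an encoding space claimed by a pattern that pins `fixed_bits` bits."""
    if fixed_bits < 0:
        raise ValueError("fixed_bits must be non-negative")
    return 2.0 ** -fixed_bits


# Assumption: SVE corresponds to one value of a 4-bit top-level field.
sve_share = opcode_space_fraction(4)

# From the post: EVEX pins 10 bits within the first 2 bytes.
evex_share = opcode_space_fraction(10)

print(f"SVE:  {sve_share:.2%}")   # 6.25%
print(f"EVEX: {evex_share:.2%}")  # 0.10%
print(f"ratio: {sve_share / evex_share:.0f}x")  # 64x
```

Under these assumptions SVE gets about 64 times the relative share that EVEX does. The x86 side is only approximate, since instructions there are variable length and the share is counted over the first two bytes only.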