ARM SVE Streaming Mode

By: Adrian, July 26, 2021 12:21 am
Room: Moderated Discussions
Adrian on July 25, 2021 9:16 pm wrote:
> dmcq on July 25, 2021 5:36 pm wrote:
> > Introducing the Scalable Matrix Extension for the Armv9-A Architecture
> >
> > I noticed ARM have issued yet another of their planned future
> > architecture blogs, this one is called "Scalable
> > Matrix Extension". Same sort of idea as what Intel is implementing as far as I can see. Except there is
> > a strange aspect in that it requires a new mode for SVE called "Streaming mode" SVE. They talk about having
> > the new SME instructions and a significant subset of the existing SVE2 instructions. And they say that
> > one could have a longer length for the registers in streaming mode than in non-streaming mode. As far as I can
> > make out in fact only very straightforward operations are included in streaming mode.
> >
> > I guess that instead of being 'RISC' instructions these would have a loop dealing with widths greater
> > than the hardware SVE register or memory or cache width. They've implemented something like this in
> > the Cortex-M Helium extension, but I'd have thought they could just rely on OoO for the larger application
> > processors. If it is so they can have larger tiles in their matrix multiplication, I'd have thought there
> > would be other tricks that could do the job without a new mode. However I can't see they would have
> > put in a new mode without it being very important to them. Am I missing something?
> You are probably right about the necessity of looping in certain cases.
> They explain clearly enough why a streaming SVE mode is needed: to be able to present to
> software an apparent vector register width that is larger than the width of the ALUs.
> This "streaming" mode is actually exactly how the traditional vector computers operated. For
> example a Cray-1 had an apparent vector register width of 4096 bits (64 elements of 64 bits), but the vector operations were
> computed by a 64-bit pipelined ALU, in multiple clock cycles, i.e. "looping", like you say.

So the SVE Streaming Mode, by switching the apparent vector register length, will allow a choice between two sets of vector instructions: one with low-latency instructions that process few data elements per instruction, and one with high-latency instructions that process many data elements per instruction.
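As a rough illustration of that choice, here is a toy Python model of one vector instruction whose apparent register is wider than the ALU. This is only a sketch, not how any ARM core is actually implemented; the function name `vector_add` and the lane counts are made up for the example. The ALU handles `alu_lanes` elements per cycle and loops until the whole register is done, as in the Cray-1 comparison quoted above.

```python
def vector_add(a, b, alu_lanes):
    """Element-wise add of two equal-length vector registers.

    The (hypothetical) ALU processes only `alu_lanes` elements per cycle,
    so a register longer than that is handled in several passes.
    Returns (result, cycles).
    """
    if len(a) != len(b):
        raise ValueError("registers must have the same length")
    if alu_lanes <= 0:
        raise ValueError("alu_lanes must be positive")

    result = []
    cycles = 0
    # One loop iteration models one pass through the ALU.
    for start in range(0, len(a), alu_lanes):
        chunk_a = a[start:start + alu_lanes]
        chunk_b = b[start:start + alu_lanes]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        cycles += 1
    return result, cycles


# Short register: 4 elements on a 4-lane ALU, a single pass.
short_result, short_cycles = vector_add([1, 2, 3, 4], [10, 20, 30, 40], alu_lanes=4)

# Long "streaming" register: 16 elements on the same 4-lane ALU, four passes.
long_result, long_cycles = vector_add(list(range(16)), [1] * 16, alu_lanes=4)
```

With these invented numbers the long instruction takes 4 cycles instead of 1, but a single instruction covers the work that would otherwise need four short ones.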

The high-latency instructions available in the "SVE Streaming Mode" will make it easier to achieve the maximum throughput, because fewer instructions are needed to do the work, but obviously they are not suitable for every task.

The ability to change modes to get the desired compromise between latency and throughput is certainly very valuable, as long as switching modes does not take excessive time.

I assume that the ABI will specify the non-streaming SVE mode as the default, so any procedure needing the streaming mode will have to switch modes on function entry and again on function exit. It will therefore have to gain enough from the high-latency complex instructions to recover the time lost to mode switching.
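To put a rough shape on that trade-off, here is a back-of-the-envelope Python sketch. Everything in it is an assumption for illustration: the function names, the sustained rates in elements per cycle and the switch cost are invented, since no real figures are available here. It charges the streaming version two mode switches (entry and exit) and finds the amount of work at which the two versions cost the same.

```python
def cycles_non_streaming(n_elements, elems_per_cycle):
    """Cycles to process n_elements without leaving the default mode."""
    return n_elements / elems_per_cycle


def cycles_streaming(n_elements, elems_per_cycle, switch_cycles):
    """Cycles in streaming mode, paying one switch on entry and one on exit."""
    return 2 * switch_cycles + n_elements / elems_per_cycle


def break_even_elements(non_streaming_rate, streaming_rate, switch_cycles):
    """Number of elements at which both versions take the same time.

    Returns None when streaming mode is not faster per element, because
    then the switching cost can never be recovered.
    """
    if streaming_rate <= non_streaming_rate:
        return None
    saved_per_element = 1 / non_streaming_rate - 1 / streaming_rate
    return 2 * switch_cycles / saved_per_element


# Hypothetical numbers: 4 vs 8 elements per cycle, 50 cycles per mode switch.
n_star = break_even_elements(non_streaming_rate=4, streaming_rate=8, switch_cycles=50)
```

With these made-up values the break-even point is 800 elements: below that, the function is better off staying in non-streaming mode, and above it the streaming version wins.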
