Interesting ARM compiler data

By: noko (noko.delete@this.noko.com), August 8, 2022 9:30 pm
Room: Moderated Discussions
--- (---.delete@this.redheron.com) on August 8, 2022 2:54 pm wrote:
> I found a lot interesting in this update:
>
> https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/arm-compiler-for-linux-and-arm-performance-libraries-22-0
>
> On the plus side, it's clear that there still remains some low-hanging fruit for the ARM ecosystem
> when compared to the x86 ecosystem, even in fairly basic things like special functions and BLAS.
>
> On the negative side, the SVE results seem disappointing. One can spin this in a few different ways (isolated
> loops won't show the size improvements from simpler loops without head and tails, as opposed to real large
> apps using large shared libraries; these are already vector-dense loops, whereas SVE should help more code
> that's less trivially vectorizable), and clearly the SVE optimization has only just begun.
> Still, somewhat disappointing results. (Except of course that himeno number. Given how
> much Phoronix pushes himeno, I look forward to seeing Michael try to justify this!)

If anything, I think it shows the opposite - himeno has a large, easily vectorizable inner loop with contiguous accesses. But looking at the autovectorizer output, it's doing 1.4 loads per vector op. Without actually running it on a Neoverse-V1, I'd hazard a guess that the NEON version is saturating the 5-wide decode. So it's a perfect use case for wider ALUs (and more load bandwidth), rather than fancy instructions.

The only fancy instruction the SVE version really benefits from is reg+reg addressing in 256-bit wide contiguous vector loads; NEON's LDP only supports immediate offsets.
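
To make the load-heavy shape of such a loop concrete, here is a rough sketch in C. It is not the actual himeno kernel (which uses a larger stencil and several coefficient arrays); the function name stencil7, the flat n*n*n layout, and the single coefficient array are assumptions made purely for illustration. The point is only that every neighbour along the contiguous k axis becomes its own vector load at "base + running index", so loads are at least as numerous as arithmetic ops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified 7-point stencil on a flat n*n*n grid.
 * Per output point: 7 loads (six neighbours of p plus a[c]),
 * 6 arithmetic ops (5 adds, 1 multiply) and 1 store.
 * The innermost k loop is contiguous, so it is easy to vectorize,
 * but each neighbour stream still needs its own load. */
void stencil7(const float *p, const float *a, float *out, size_t n)
{
    assert(n >= 3);               /* need at least one interior point */
    const size_t sj = n;          /* stride between rows (j)   */
    const size_t si = n * n;      /* stride between planes (i) */

    for (size_t i = 1; i + 1 < n; i++) {
        for (size_t j = 1; j + 1 < n; j++) {
            for (size_t k = 1; k + 1 < n; k++) {
                const size_t c = i * si + j * sj + k;
                const float sum = p[c - si] + p[c + si]
                                + p[c - sj] + p[c + sj]
                                + p[c - 1]  + p[c + 1];
                out[c] = a[c] * sum;
            }
        }
    }
}
```

With only the interior points written, the boundary of out is left untouched. How a given compiler actually schedules the loads, and whether it matches the ratio quoted above, would have to be checked in its assembly output.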

> Perhaps hoping for better loops in generic code requires
> SVE2 and we can't hope for much of that with just SVE?

Which instructions are you thinking of? I don't see much in SVE2 that would be very useful to autovectorization / SPMD.