SIMD syntax

By: hobold, November 1, 2020 7:22 am
Room: Moderated Discussions
Jukka Larja on October 31, 2020 8:14 am wrote:
> I'm not sure what std::simd would include, but at least on a quick glance I don't see GCC's extensions
> would matter at all for us. It's not like writing a wrapper of our own (for (S)SSE(2/3) and Neon)
> was all that difficult or took a lot of time.

Template metaprogramming is not rocket surgery. It's not trivial either, though. And you have to keep validating that new compiler versions don't break performance.

I was thinking of the contrast illustrated, for example, here between a plain formula on the one hand, and a mess of nested intrinsics on the other hand:

The GCC extension allows writing a plain formula, too, when targeting a SIMD backend.
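A minimal sketch of what that looks like, assuming GCC or Clang with the vector_size attribute (a compiler extension, not standard C++). The names v4f, lerp and lerp_demo are made up for this example.

```cpp
#include <cassert>

// Four packed floats. Relies on the GCC/Clang vector_size extension;
// "v4f" is a hypothetical name chosen for this sketch.
typedef float v4f __attribute__((vector_size(16)));

// Linear interpolation, written as a plain formula. The operators apply
// lane by lane; the compiler picks SIMD instructions where the target has
// them and falls back to scalar code elsewhere.
inline v4f lerp(v4f a, v4f b, v4f t) {
    return a + (b - a) * t;
}

// For contrast, the same formula with SSE intrinsics would read:
//   _mm_add_ps(a, _mm_mul_ps(_mm_sub_ps(b, a), t))

// Usage: interpolate halfway between two vectors.
inline void lerp_demo() {
    v4f a = {0.0f, 2.0f, 4.0f, 6.0f};
    v4f b = {2.0f, 4.0f, 6.0f, 8.0f};
    v4f t = {0.5f, 0.5f, 0.5f, 0.5f};
    v4f r = lerp(a, b, t);
    assert(r[0] == 1.0f && r[3] == 7.0f);
}
```

The formula version stays readable as the expression grows, while the intrinsics version nests one level deeper per operation.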

> The problem is that we don't have much code that
> could make use of such abstraction (actually, we mostly use four wide float vectors as direct
> replacement of three wide float vectors. We can't even make use of the last item).

I think the terminology "vector" for SIMD parallelism is misleading people at large into thinking in terms of vector math. There is some overlap, but as you noticed, the mathy vectors aren't generally too useful for extracting SIMD parallelism.

Usually the problem needs to be "rotated by 90 degrees", i.e. in this case you'd have a SIMD vector with elements x1, x2, x3, x4, another SIMD vector with elements y1, y2, y3, y4, and so on. Effectively working with four mathy 3-vectors inside three SIMD 4-vectors.
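To make the "rotated" layout concrete, here is a rough sketch, again assuming the GCC/Clang vector_size extension. Vec3x4, dot and dot_demo are hypothetical names for illustration only. It computes four 3D dot products at once, with no wasted fourth lane.

```cpp
#include <cassert>

// Four packed floats (GCC/Clang extension; hypothetical type name).
typedef float v4f __attribute__((vector_size(16)));

// Four mathy 3-vectors stored "rotated by 90 degrees":
// one SIMD 4-vector per coordinate instead of one per point.
struct Vec3x4 {
    v4f x;  // x1, x2, x3, x4
    v4f y;  // y1, y2, y3, y4
    v4f z;  // z1, z2, z3, z4
};

// Four dot products in one go; lane i holds dot(a_i, b_i).
inline v4f dot(const Vec3x4& a, const Vec3x4& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Usage: points (1,0,0), (0,1,0), (1,2,3), (2,2,2) against
//        points (1,0,0), (1,0,0), (4,5,6), (1,1,1).
inline void dot_demo() {
    v4f ax = {1.0f, 0.0f, 1.0f, 2.0f};
    v4f ay = {0.0f, 1.0f, 2.0f, 2.0f};
    v4f az = {0.0f, 0.0f, 3.0f, 2.0f};
    v4f bx = {1.0f, 1.0f, 4.0f, 1.0f};
    v4f by = {0.0f, 0.0f, 5.0f, 1.0f};
    v4f bz = {0.0f, 0.0f, 6.0f, 1.0f};
    Vec3x4 a = {ax, ay, az};
    Vec3x4 b = {bx, by, bz};
    v4f d = dot(a, b);
    assert(d[0] == 1.0f && d[1] == 0.0f && d[2] == 32.0f && d[3] == 6.0f);
}
```

Note that the dot product needs no horizontal add or shuffle in this layout; every operation is a plain lane-wise multiply or add.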

> I'm sure there are some places where rethinking data structures or algorithms could
> allow making use of SIMD, but if we were to go through the trouble, we'd consider
> using GPU first. After that consideration, there's practically nothing left.

Plus, GPU programming interfaces don't mislead you as much, as they don't really make their SIMD width visible to the programmer.

BTW, I found it useful to think about re-arranging data structures not in terms of vectors, but in terms of cache friendliness. Locality, yes, but also things like: invariant data should not be interleaved with changing data (to save write back bandwidth).

That kind of data-centric programming enables optimizations which also help ordinary scalar code. And it is a large step towards SIMD, where the programming model forces locality by treating a number of subsequent data items as one indivisible unit.

Doing optimizations for caches "naturally" tends to prefer a structure-of-arrays format.
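As a small illustration of the cache-oriented layout, here is a sketch in standard C++. The particle system, the Particles struct and the advance function are all invented for this example. Positions change every step, velocities are only read, and the two never share a cache line because they live in separate arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle system in structure-of-arrays form.
struct Particles {
    // Written every step: only these arrays produce dirty cache lines.
    std::vector<float> x, y, z;
    // Read-only during a step: kept apart from the written data so that
    // no write back is spent on values that did not change.
    std::vector<float> vx, vy, vz;
};

// Advance all positions by dt. Each loop statement walks its arrays with
// unit stride, which suits caches and is also the shape a compiler
// can auto-vectorize.
inline void advance(Particles& p, float dt) {
    const std::size_t n = p.x.size();
    assert(p.y.size() == n && p.z.size() == n);
    assert(p.vx.size() == n && p.vy.size() == n && p.vz.size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        p.x[i] += p.vx[i] * dt;
        p.y[i] += p.vy[i] * dt;
        p.z[i] += p.vz[i] * dt;
    }
}
```

With an array-of-structures layout, each particle's velocity would sit next to its position, so every written cache line would also carry unchanged data back to memory.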