12:30 "[ISA] do not matter very much"

By: Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr), February 25, 2021 1:02 am
Room: Moderated Discussions
rwessel (rwessel.delete@this.yahoo.com) on February 24, 2021 8:45 am wrote:
> Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr) on February 24, 2021 6:24 am wrote:
> > Wilco (wilco.dijkstra.delete@this.ntlworld.com) on February 24, 2021 4:37 am wrote:
> > > Anon (no.delete@this.spam.com) on February 23, 2021 6:26 am wrote:
> > > > Wilco (wilco.dijkstra.delete@this.ntlworld.com) on February 23, 2021 3:48 am wrote:
> > > > > You forgot the sarcasm tag :-)
> > > >
> > > > Poor implementations don't prove an efficient implementation isn't possible.
> > >
> > > Given even a bad software implementation (the SSE2 one) thrashes rep movsb on modern
> > > cores, it proves that it is not at all trivial to do a good hardware memcpy like
> > > you suggested. It's not like Intel/AMD haven't been trying for years.
> > >
> > > > > bench-memcpy-random in GLIBC shows just how "efficient" rep movsb (__memcpy_erms) is on my 3700X:
> > > >
> > > > Your benchmark shows how hard it is to find the optimal software implementation of memcpy, there
> > > > are 7 variations and a surprising fastest one (__memcpy_sse2_unaligned, what happens to AVX?), this
> > > > show a somewhat lazy AMD that didn't even put their microcode to emit the best uop sequence.
> > >
> > > There isn't a single optimal implementation of memcpy for all possible use-cases. Software allows you
> > > to select whichever one works best, and you can tweak it further, remove bottlenecks etc. However with
> > > hardware you are stuck with the one in your CPU. In order for hardware memcpy to work out, it has to
> > > be as fast as the best software implementation. So far nobody has proven this is feasible.
> > >
> > > Wilco
> >
> > To me, it looks a bit strange to talk about either microcode or hardware for memcpy (with an OoO core):
> > - microcode is not the exact code which is inserted into the instruction execution windows,
> > if you have a rep movsb with initial ecx=7, you have to fill the execution instruction window
> > with 7 reads of the source address, 7 writes of the destination address (or an optimisation
> > if reading multiple of bytes), and a clear of ecx if still alive. The problem is probably how
> > many execution windows instructions you can insert in one cycle executing microcode.
> > - hardware memcpy would mean some kind of DMA (into caches) and pausing the execution window?
> >
> > What is probably needed is specialised "execution window instructions"
> > which can read up to a cache line, another
> > to mask / insert from another cache line, and a third to write
> > up to a cache line. Then the "rep movsb" microcode
> > inserts (maybe a lot of) such "execution window instructions" into the "instructions in flight".
> > Maybe that is what you meant, then please ignore that message...
>
>
> Why? Send one micro-op to the LSU, and let it figure it out.

That memcpy micro-op could have an unbounded number of dependencies. Imagine you have the following micro-ops in flight (I should not use assembly notation for micro-ops, but it is easier):
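mov [%esi + 8], #10       ; store into the memcpy source range
mov [%esi + 12], %eax     ; store into the memcpy source range
mov [%esi + 16], %ebx     ; store into the memcpy source range
memcpy %edi, %esi, 32     ; reads [%esi, %esi+32), so it depends on all three stores above
...
mov %eax, [%edi + 16]     ; load from the memcpy destination range, so it depends on the memcpy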

Then you might have a lot of those "memcpy" micro-ops among your 100 or so instructions in flight...
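
To make the dependency problem a bit more concrete, here is a minimal Python sketch of it. It is only an illustration, not how any real load/store unit works: the `MemOp` class, the `overlaps` and `raw_dependencies` helpers, the register values for `ESI` and `EDI`, and the 4-byte store/load width are all assumptions made up for this example. It just checks which earlier writes overlap the bytes a later operation reads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemOp:
    """Hypothetical model of a memory micro-op as byte ranges read and written."""
    name: str
    reads: tuple   # tuple of (start, length) byte ranges read
    writes: tuple  # tuple of (start, length) byte ranges written


def overlaps(a, b):
    """True if two (start, length) byte ranges share at least one byte."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]


def raw_dependencies(ops):
    """List (producer, consumer) pairs where a later op reads bytes an earlier op wrote.

    Simplification: every overlapping earlier writer is reported, even if a
    store in between would have overwritten the same bytes.
    """
    deps = []
    for i, later in enumerate(ops):
        for earlier in ops[:i]:
            if any(overlaps(w, r) for w in earlier.writes for r in later.reads):
                deps.append((earlier.name, later.name))
    return deps


ESI, EDI = 0x1000, 0x2000  # assumed register values, for illustration only

ops = [
    MemOp("store8", (), ((ESI + 8, 4),)),
    MemOp("store12", (), ((ESI + 12, 4),)),
    MemOp("store16", (), ((ESI + 16, 4),)),
    MemOp("memcpy", ((ESI, 32),), ((EDI, 32),)),
    MemOp("load16", ((EDI + 16, 4),), ()),
]

for producer, consumer in raw_dependencies(ops):
    print(producer, "->", consumer)
```

With these assumed addresses, the single memcpy ends up depending on all three earlier stores, and the later load depends on the memcpy. A larger copy length would widen both ranges, so the number of dependencies grows with the size of the copy.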

And from what I understand, micro-ops should have a predefined execution time, because the cycles in which they execute are allocated ahead of time.

But I am not a micro-op specialist, so I may be completely wrong...