Interesting comment about rep instructions & code size

By: Travis Downs (travis.downs.delete@this.gmail.com), January 15, 2020 3:40 pm
Room: Moderated Discussions
Gabriele Svelto (gabriele.svelto.delete@this.gmail.com) on January 15, 2020 6:12 am wrote:
> Rather than tune for individual microarchitecture variations, we would prefer to leverage on fast string
> operations provided by the ISA (for example “rep movsb” on x86). This allows us to leverage wider
> data paths over time, without having to build custom dispatch logic which carries its own overheads.
>
> And also
>
> Small Code Footprint: Our implementations consist of concise patterns for working with chunks
> of data. While further specialization can produce better results on microbenchmarks, we did
> not see these wins materialize on macrobenchmarks measuring application productivity.


Well, they are going to be disappointed by the performance of rep movsb, even on recent hardware, for the very small copies they say are important.

I find this quote a bit enigmatic: are they leveraging the rep movs instructions, in which case the implementation is very simple, or are they doing what they said elsewhere, which is using compile-time selected instructions to implement a compact memcpy?
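For reference, the "very simple" rep movs variant might look roughly like the sketch below. It assumes x86-64 with GCC or Clang inline assembly and a clear direction flag (as the usual ABIs guarantee); the name copy_rep_movsb is made up for illustration and is not taken from the LLVM proposal. On other targets it falls back to a byte loop so that it still compiles.

```c
#include <stddef.h>

/* Hypothetical helper: copy n bytes using a single `rep movsb`.
   Assumes x86-64 and GCC/Clang inline asm; otherwise uses a byte loop. */
static void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
    void *ret = dst;
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    /* rep movsb copies RCX bytes from [RSI] to [RDI] and updates all
       three registers, so they are declared as read-write operands. */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
#endif
    return ret;
}
```

The whole copy is one instruction plus register setup, which is where the code-size appeal comes from. The sketch says nothing about speed; how it performs for small sizes would have to be measured on the hardware in question.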

I am surprised they said PLT-based dispatch isn't efficient: as far as I can tell, it's basically zero cost if you were calling the memcpy function: either way you are making a call through the PLT, so what downside is there to having the machine-appropriate entry be selected at dynamic load time?
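To make the "either way you are making a call through the PLT" point concrete, here is a portable sketch that imitates load-time selection with a function pointer filled in once. Everything in it is an assumption for illustration: my_copy_slot stands in for the PLT/GOT entry, cpu_has_wide_vectors is a stub where a real resolver would query CPUID, and copy_wide just forwards to memcpy rather than using wide instructions.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Two stand-in implementations; a real library would tune each one. */
static void *copy_generic(void *d, const void *s, size_t n) { return memcpy(d, s, n); }
static void *copy_wide(void *d, const void *s, size_t n)    { return memcpy(d, s, n); }

/* Hypothetical feature probe; hard-coded here so the sketch is portable. */
static int cpu_has_wide_vectors(void) { return 0; }

/* Runs once, like a resolver invoked by the dynamic loader. */
static copy_fn resolve_copy(void)
{
    return cpu_has_wide_vectors() ? copy_wide : copy_generic;
}

/* Plays the role of the PLT/GOT slot for the copy routine. */
static copy_fn my_copy_slot;

static void my_copy_init(void) { my_copy_slot = resolve_copy(); }

/* Every call is one indirect call, whichever variant was selected. */
static void *my_copy(void *d, const void *s, size_t n)
{
    return my_copy_slot(d, s, n);
}
```

After my_copy_init() has run, the per-call cost is the same indirect call no matter how many variants exist, which is the sense in which the selection itself looks free once you have already committed to a call.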

Maybe they mean they plan to inline in more cases versus making a call.

This will pessimize big copies, so it might be worth having a separate memcpy_big for when you know it's big, that uses the widest instructions, etc.

They should give their size distribution stats also in a "byte weighted" or "time weighted" format: maybe 99% of your memcpy *calls* are small, but if the big ones are big enough you could spend a lot of total time in them.
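A small worked example of the difference between call-weighted and byte-weighted views, using a made-up histogram (the numbers in example_hist are purely hypothetical, not measurements): 990 copies of 16 bytes and 10 copies of 1 MiB.

```c
#include <stddef.h>

struct bucket {
    size_t size;          /* copy size in bytes */
    unsigned long calls;  /* number of calls with that size */
};

/* Hypothetical data, for illustration only. */
static const struct bucket example_hist[] = {
    { 16,      990 },
    { 1048576, 10  },
};

/* Fraction of calls whose size is <= limit. */
static double call_share(const struct bucket *b, size_t n, size_t limit)
{
    double small = 0.0, total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += (double)b[i].calls;
        if (b[i].size <= limit)
            small += (double)b[i].calls;
    }
    return total > 0.0 ? small / total : 0.0;
}

/* Fraction of bytes moved by calls whose size is <= limit. */
static double byte_share(const struct bucket *b, size_t n, size_t limit)
{
    double small = 0.0, total = 0.0;
    for (size_t i = 0; i < n; i++) {
        double bytes = (double)b[i].size * (double)b[i].calls;
        total += bytes;
        if (b[i].size <= limit)
            small += bytes;
    }
    return total > 0.0 ? small / total : 0.0;
}
```

With this data and a limit of 128 bytes, the small copies are 99% of the calls but only about 0.15% of the bytes (15,840 out of 10,501,600). A time-weighted view would need measured per-size costs, which this sketch does not attempt.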