Interesting comment about rep instructions & code size

By: Chester (lamchester.delete@this.gmail.com), January 15, 2020 5:16 pm
Room: Moderated Discussions
Travis Downs (travis.downs.delete@this.gmail.com) on January 15, 2020 2:40 pm wrote:
> Gabriele Svelto (gabriele.svelto.delete@this.gmail.com) on January 15, 2020 6:12 am wrote:
> > Rather than tune for individual microarchitecture variations, we would prefer to leverage on fast string
> > operations provided by the ISA (for example “rep movsb” on x86). This allows us to leverage wider
> > data paths over time, without having to build custom dispatch logic which carries its own overheads.
>
> And also
>
> > Small Code Footprint: Our implementations consist of concise patterns for working with chunks
> > of data. While further specialization can produce better results on microbenchmarks, we did
> > not see these wins materialize on macrobenchmarks measuring application productivity.

>
> Well they are going to be disappointed by the performance of rep movsb even
> on recent hardware, for the very small copies they say are important.

Maybe the savings in icache misses and branch mispredicts would make the slower small copies worth it?

>
> I find this quote a bit enigmatic: so are they leveraging the rep move instructions,
> in which case the implementation is very simple, or are they doing what they said elsewhere,
> which is using compile-time selected instructions to implement a compact memcopy.
>
> I am surprised they said PLT-based dispatch isn't efficient: as far as I can tell, it's basically zero
> cost if you were calling the memcpy function: either way you are making a call through the PLT, so what
> downside is there to having the machine-appropriate entry be selected at dynamic load time?

What's PLT-based dispatch? I googled and couldn't find anything on it.

> Maybe they mean they plan to inline in more cases versus making a call.
>
> This will pessimize big copies, so it might be worth having a separate memcpy_big
> for when you know it's big, that uses the widest instructions, etc.
>
> They should give their size distribution stats also in a "byte weighted" or
> "time weighted" format: maybe 99% of your memcpy *calls* are small, but if
> the big ones are big enough you could spend a lot of total time in them.
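
To make the call-weighted versus byte-weighted distinction above concrete, here is a minimal sketch. The trace is made up for illustration (99 copies of 16 bytes plus one 1 MiB copy), and `size_shares` is a hypothetical helper, not something from the LLVM proposal or from any real profiler.

```python
from collections import Counter


def size_shares(sizes):
    """Map each copy size to (share of calls, share of bytes copied).

    `sizes` is a non-empty list of memcpy sizes in bytes, with at least
    one non-zero entry so the byte total is not zero.
    """
    calls = Counter(sizes)
    total_calls = len(sizes)
    total_bytes = sum(sizes)
    return {
        size: (count / total_calls, size * count / total_bytes)
        for size, count in sorted(calls.items())
    }


# Hypothetical trace: 99 small copies and a single large one.
trace = [16] * 99 + [1 << 20]
shares = size_shares(trace)

for size, (call_share, byte_share) in shares.items():
    print(f"{size:>8} B  calls: {call_share:6.1%}  bytes: {byte_share:6.1%}")
```

In this made-up trace the 16-byte copies are 99% of the calls, yet the single 1 MiB copy accounts for more than 99% of the bytes moved. A time-weighted view would need measured per-size latencies, which this sketch does not attempt.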
