ISA support for constant count loops

By: Travis Downs (travis.downs.delete@this.gmail.com), January 16, 2020 5:21 pm
Room: Moderated Discussions
Gabriele Svelto (gabriele.svelto.delete@this.gmail.com) on January 16, 2020 1:15 pm wrote:
> Paul A. Clayton (paaronclayton.delete@this.gmail.com) on January 16, 2020 10:28 am wrote:
> > In theory an ISA could provide a simple mechanism to load
> > the loop count for constant iteration (inner) loops.
> > A simple load constant instruction could trigger microarchitectural optimization of branch prediction (and
> > other functions). For constant count loops there is no
> > reason for branch misprediction other than difficulty
> > of recognizing the count and the associated branch. Even
> > much of the overhead of the branch instruction could
> > be reduced (like some DSPs' special support for loops). For
> > x86, this could just be idiom recognition for REP-prefixed
> > instructions; loading a constant into the count register would prepare for such a loop.
>
> POWER - and PowerPC before it - has a special register called the count register (CTR, section 2.3.3 in
> the PowerISA 3.0 doc) that can be used implicitly by branch instructions to loop. A field in a conditional
> branch instruction can be set to decrement the CTR and then branch if it's zero or if it's not zero.
>
> In older processors using it guaranteed perfectly predictable loops but it came
> at a cost because the register was not renamed, so loading the count was a serializing
> instruction. I haven't seen it used much but it might have been.

Yeah, I was thinking about how hard Paul's suggestion would be on a typical big OoO, and it doesn't seem too bad if the value is constant, as he suggested.

Then the whole thing can be handled in the in-order part: decode just has to communicate back to the fetch engine when it decodes the "set count" instruction, which would have to be associated with the relevant branch in some way. Because decode is several pipeline stages later (and fetch may run fairly far ahead), the fetch engine might have already gone the wrong way by the time the "set count" instruction is decoded, but that's just a FE bubble rather than a more costly issued misprediction.

Then the BP just has to tag this branch as being under the control of the counter. Perhaps this is even remembered, like traditional branch prediction state, so the next time you see that IP you can avoid the fetch bubble altogether (but you wouldn't want to waste general-purpose BP resources on that).
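
To make the idea concrete, here is a rough toy model in Python, not a description of any real front end. It assumes a hypothetical "set count" instruction that decode reports to the fetch engine along with the IP of the branch it controls, and it compares that against a plain 2-bit saturating counter as a deliberately simple baseline (real predictors do better on short loops). All names here (CountedLoopFrontEnd, on_decode_set_count, the 0x400 address) are made up for illustration, and the model ignores the fetch bubble discussed above.

```python
from dataclasses import dataclass, field


@dataclass
class CountedLoopFrontEnd:
    """Toy fetch-side counters for branches tagged by a hypothetical 'set count'."""
    remaining: dict = field(default_factory=dict)  # branch IP -> iterations left

    def on_decode_set_count(self, branch_ip: int, count: int) -> None:
        # Decode tells the fetch engine the constant trip count for this branch.
        self.remaining[branch_ip] = count

    def predict(self, branch_ip: int) -> bool:
        # Predict taken (loop again) while more than one iteration remains.
        left = self.remaining[branch_ip]
        self.remaining[branch_ip] = left - 1
        return left > 1


class TwoBitPredictor:
    """Classic 2-bit saturating counter, used only as a simple baseline."""

    def __init__(self) -> None:
        self.state = 2  # start at weakly taken

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)


def loop_outcomes(count: int) -> list:
    # The loop-closing branch is taken count-1 times, then falls through once.
    return [True] * (count - 1) + [False]


def run(count: int, invocations: int) -> tuple:
    """Return (counted-loop mispredicts, 2-bit mispredicts) over all invocations."""
    fe = CountedLoopFrontEnd()
    bp = TwoBitPredictor()
    branch_ip = 0x400  # hypothetical address of the loop branch
    counted_miss = twobit_miss = 0
    for _ in range(invocations):
        fe.on_decode_set_count(branch_ip, count)
        for taken in loop_outcomes(count):
            if fe.predict(branch_ip) != taken:
                counted_miss += 1
            if bp.predict() != taken:
                twobit_miss += 1
            bp.update(taken)
    return counted_miss, twobit_miss


if __name__ == "__main__":
    print(run(count=4, invocations=10))
```

In this model the counted branch never mispredicts, while the 2-bit baseline misses the loop exit on every invocation. The point is only that with a constant count, everything the fetch engine needs is available in the in-order part.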

It's when you try to make this counter a dynamic value, e.g., written by an instruction which takes a GP register, that things get a lot messier.
