The (wrong) state of trace caches on modern CPUs

By: Maynard Handley (name99.delete@this.name99.org), August 25, 2016 8:50 am
Room: Moderated Discussions
Michael S (already5chosen.delete@this.yahoo.com) on August 25, 2016 3:28 am wrote:
> Eric Bron (eric.bron.delete@this.zvisuel.privatefortest.com) on August 25, 2016 2:38 am wrote:
> > > "
> > > With the Intel Sandybridge [45] processor, Intel introduced
> > > a μop cache instead of a loop buffer. μop caches
> > > tradeoff some of the power efficiency of loop caches in exchange
> > > for capturing more instructions and behaviors.
> > > Thus codes which frequent and simple loops may be better
> > > served by a traditional loop cache, however μop caches
> > > are more robust and able to derive benefit more irregular
> > > codes. Essentially, μop caches operate as traditional
> > > caches which hold decoded instructions. However, they share
> > > some characteristics with loop caches. [b]In current
> > > commercial implementations, μop caches encode predicted branch
> > > paths.[/b] If branch paths differ from previously
> > > predicted paths, like loop caches the μop cache must be flushed and refilled.
> > > "
> >
> > well, the above excerpt contains roughly one error per
> > line, it looks like this PhD was not properly reviewed
> >
>
> His linkedin profile rather impressive for a young man.
> Including:
>
> Engineering Intern
> Intel Corporation
> May 2010 – December 2010 (8 months)
>
> • Worked on the Haswell performance modeling team.
> • Performed early stage feature evaluation for a future generation processor..
> • Extensive C++ coding on an out of order processor simulator.

>
> As a non-native English speaker I see a grammar of above quote as problematic, but may be for natives
> it's o.k? Or, may be, there are typing mistakes, like "Thus codes which frequent" must be "Thus codes
> with frequent" or "derive benefit more irregular" should be "derive benefit for more irregular"
>
> But yes, SB has *both* Decoded ICache and ability to utilize Micro-op Queue as small loop cache.
> And yes, SB Decoded ICache stores uOps in program order rather than in predicted branch path order.
> And yes, when "branch paths differ from previously predicted paths" Decoded ICache is *not* flushed.
>
>
>

I have no idea what you are saying here, and what you are agreeing or disagreeing with.

** SB has *both* Decoded ICache and ability to utilize Micro-op Queue as small loop cache.
+ Sure, I don't think any sane person was disagreeing with that.

** SB Decoded ICache stores uOps in program order rather than in predicted branch path order.
+ I have no idea what the difference between these two statements is.
Think about the implementation.
How is the decoded ops cache going to work? The obvious implementation is that it stores decoded instructions AS THEY ARE ENCOUNTERED. In other words, they are stored in predicted branch path order because that is what the decoder saw. Since predicted branch path order equals program order when the branch prediction is working, these are usually the same thing, but the decoded cache pretty much HAS to store them in the order that was generated by branch prediction.

And a PREDICTED straightline ordering of a stream of instructions is a trace, no? So what is being argued here?
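
To make the two orderings concrete, here is a toy sketch in Python. It is not a model of Sandy Bridge or of any real decoder: the five-instruction PROGRAM, the function names, and the one-instruction-per-address layout are all made up for illustration. It only shows what "the order the decoder saw" means as opposed to the static layout.

```python
# Toy model only, not any real microarchitecture.
# PROGRAM maps address -> (mnemonic, taken-branch target or None).
# All names and the instruction layout are hypothetical.
PROGRAM = {
    0: ("add", None),
    1: ("jcc", 4),    # conditional branch to address 4
    2: ("sub", None),
    3: ("mul", None),
    4: ("ret", None),
}

def predicted_path_order(program, start, predict_taken):
    """Return addresses in the order a decoder would see them
    when fetch follows the given branch predictions."""
    order, pc = [], start
    while pc in program:
        order.append(pc)
        mnemonic, target = program[pc]
        if mnemonic == "ret":
            break
        # Follow the branch only if it has a target and is predicted taken.
        pc = target if (target is not None and predict_taken(pc)) else pc + 1
    return order

def static_order(program):
    """Return addresses in plain ascending (static layout) order."""
    return sorted(program)

taken = predicted_path_order(PROGRAM, 0, lambda pc: True)
not_taken = predicted_path_order(PROGRAM, 0, lambda pc: False)
```

With the branch predicted taken, the decoder-order list is [0, 1, 4] and skips addresses 2 and 3. With it predicted not taken, the list matches static_order(PROGRAM). That is the sense in which the two orders are "usually the same thing" until a taken branch intervenes.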

** When "branch paths differ from previously predicted paths" Decoded ICache is *not* flushed.
+ OK. Do you know this for a fact? And if flushing does not occur, then why not? I have no basis on which to judge your credibility here, but what you're saying makes no sense to me. A large part of the point of the decoded ops cache is to avoid the expense of the branch prediction machinery. That means there isn't a SECOND layer of branch prediction machinery operating on the decoded ops; they're just run through straightline. And if that straightline run-through is no longer valid (because branch prediction suggests otherwise), then what is the point of keeping that, for lack of a better word, trace in the cache?
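
The disagreement here comes down to how the cached entries are keyed, which a second toy sketch can show. Again this is purely illustrative and makes no claim about what Sandy Bridge actually does: trace_cache, addr_cache, trace_key, and fill are hypothetical names, and the "uops" are placeholder strings. One structure is keyed by a start address plus the predicted branch outcomes (a trace), the other by individual instruction address.

```python
# Hypothetical toy structures; the keying schemes are assumptions made
# for illustration, not a description of any shipping CPU.
trace_cache = {}   # (start address, predicted outcomes) -> list of addresses
addr_cache = {}    # instruction address -> placeholder decoded uops

def trace_key(start, outcomes):
    """Key a trace by where it starts and which way each branch went."""
    return (start, tuple(outcomes))

def fill(path, start, outcomes):
    """Record one decoded path in both toy caches."""
    trace_cache[trace_key(start, outcomes)] = list(path)
    for addr in path:
        addr_cache[addr] = f"uops@{addr}"

# The decoder saw addresses 0, 1, 4 with the branch at 1 predicted taken.
fill([0, 1, 4], start=0, outcomes=[True])

# Now the prediction for that branch flips to not-taken.
trace_hit = trace_key(0, [False]) in trace_cache
addr_hits = [a for a in [0, 1, 2, 3, 4] if a in addr_cache]
```

In this sketch the trace-keyed lookup misses once the prediction flips, so the old entry is dead weight, which is the case I am arguing. The address-keyed structure still hits on 0, 1, and 4 and only needs 2 and 3 decoded, but it then depends on something upstream steering the lookups by address.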

Your mental model seems to be that the decoders scan the STATIC program code, translate it into some easier-to-interpret instructions, and then the complete standard CPU pipeline (including all the fetch logic, with the branch steering that implies) runs on those easier-to-interpret instructions. But that's clearly not what happens. The whole system operates as I described: on the partial stream of instructions that has been delivered to the decoders, with no branch steering mechanism that I can imagine available to the decoded cache.