Branch/jump target prediction

By: Megol, August 19, 2016 6:42 am
Room: Moderated Discussions
Linus Torvalds on August 10, 2016 5:14 pm wrote:
> Megol on August 10, 2016 3:25 pm wrote:
> >
> > Can't argue against that, however the problems were mostly elsewhere.
> I agree that a lot of P4 weaknesses were exacerbated by other issues, and that
> the legacy decoders were too weak. But the legacy decoders were too weak partly
> because people had thought that trace caches were a good idea. They aren't.
> > > And it has almost nothing in common with the crap that was the P4 trace cache.
> >
> > So why mention it?
> .. because the predecode cache is the correct way to do this, and makes the trace cache pointless.

No it doesn't as it doesn't solve the same problem.

> So the BSD is very much relevant to the discussion - as a "look, here's something that actually
> works better, and that Intel does that largely replaces the broken trace cache".
> > What kind of workloads do you run where instruction cache coherency is problematic?
> Umm. Like almost all of them?
> Do you realize how bad the P4 was at coherency? To the point that compiler-generated
> code that didn't actually do self-modifying things at all had huge problems,
> just because the coherence "solution" that Intel picked sucked.

Okay, I misunderstood. While data/instruction cache coherency is indeed a problem in the P4, it is most commonly referred to as a problem for self-modifying code. What is commonly called coherency is keeping caches on different processors/cores updated.

> Yeah, it's less of an issue on architectures that don't actually need coherency in the first place,
> but that wasn't what was discussed. What was claimed was that the P4 trace cache was "awesome".
> It really really wasn't.
> > Really...
> Really. Trust me. Compilers had to be changed because of it.

Strange that one has to change compilers to run better on newer processors; that never happened before, I say.

> Yes, you can argue that that was due to another bad implementation issue, but the oddity comes almost directly
> from the fact that coherence gets more complicated, so then you do odd/bad things to simplify the problems.
> So the coherency issues were pretty much caused by the trace cache. The
> fact is, trace caches need more care and complexity in this area.


> > Most branches _are_ very predictable, for those that aren't -> don't create a trace.
> >Fixed.
> Bullshit.
> You don't know which branches are predictable to begin with. Also, even the "very
> predictable" ones tend to be about 99%, which isn't actually that predictable
> after all - it causes problems when you end up having code overlap anyway.

Not knowing which branches are predictable isn't a problem: delaying trace creation until one does know is trivial in a good front-end design. The P4 couldn't do that given its extremely narrow legacy decoder. But I have never been talking about using the P4 as a model.
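To illustrate what delaying trace creation could look like, here is a minimal sketch. It assumes a per-branch saturating confidence counter that gates trace building; the class name, the counter width and the threshold are all hypothetical and do not describe the P4 or any other shipping design.

```python
from collections import defaultdict

# Hypothetical parameters, chosen only for illustration.
COUNTER_MAX = 15       # 4-bit saturating confidence counter
BUILD_THRESHOLD = 12   # allow traces through a branch at this confidence


class BranchConfidence:
    """Tracks, per branch address, how often a branch repeats its last outcome."""

    def __init__(self):
        self.last_outcome = {}
        self.confidence = defaultdict(int)

    def update(self, pc, taken):
        """Record one resolved branch outcome."""
        if self.last_outcome.get(pc) == taken:
            # Same direction as last time: gain confidence, saturating.
            self.confidence[pc] = min(COUNTER_MAX, self.confidence[pc] + 1)
        else:
            # First sighting or direction change: start over.
            self.confidence[pc] = 0
        self.last_outcome[pc] = taken

    def may_build_trace(self, pc):
        """Only build a trace across branches that have proven stable."""
        return self.confidence[pc] >= BUILD_THRESHOLD


tracker = BranchConfidence()
for i in range(20):
    tracker.update(0x400, True)        # loop-style branch, always taken
    tracker.update(0x480, i % 2 == 0)  # branch that alternates every time

stable_ok = tracker.may_build_trace(0x400)    # True
unstable_ok = tracker.may_build_trace(0x480)  # False
```

Until the counter crosses the threshold, the code simply keeps running from the ordinary decode path, which is why this approach needs a legacy decoder that isn't crippled.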

> And btw, those benchmarks that show how predictable branches are? Yeah, they
> aren't really all that indicative of real code that people actually run.

Like GCC? Do you realize it is trivial to use performance counters to see how many branch mispredicts real-world code has? Most branches are predicted well in real-world code.

> It all boils down to the fact that you basically need to have the non-trace-cache case execute pretty
> much as quickly as the trace case, and the whole trace cache ends up being a lot of complexity for
> very little advantage. You can't actually try to skimp on the "legacy" decoders after all.

Congratulations! Perhaps you will someday understand that saying trace caches aren't a bad concept isn't the same thing as saying the P4 had a good front-end. It didn't, and that isn't really relevant to this thread, which didn't start out as Pentium 4 worship!

> So you're much better off doing just a L0 I$ predecoded cache on an
> instruction boundary level, and forget entirely about the traces.

That solves a mostly different problem than the trace cache.

> > Perhaps you should look up what a trace cache is before stating things like that?
> Yeah, let's just imagine that I worked for a company that did very
> similar things and actually generated traces on real loads.

Similar things, sure. But doing trace scheduling isn't the same thing as doing trace caches.

Trace caches were created to increase fetch bandwidth for wide superscalar processors running real-world code, where branches are common. They are an alternative to things like multi-way branch predictors, collapsing instruction buffers, etc.
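A toy model of the fetch bandwidth argument, as a rough sketch only. It assumes every basic block ends in a taken branch, that a conventional fetch unit cannot fetch past a taken branch in the same cycle, and that the trace cache always hits with perfectly packed traces. The function name and the numbers are made up for illustration.

```python
def fetch_cycles(block_lengths, width, cross_taken_branches):
    """Count fetch cycles for a run of basic blocks that each end in a taken branch.

    block_lengths: instructions per basic block
    width: maximum instructions fetched per cycle
    cross_taken_branches: True models an idealized trace cache that can
        deliver instructions from several blocks in one cycle; False models
        a fetch unit that stops at every taken branch.
    """
    if cross_taken_branches:
        total = sum(block_lengths)
        return -(-total // width)  # ceiling division
    # Each block needs its own fetch cycle(s); leftover slots are wasted.
    return sum(-(-n // width) for n in block_lengths)


blocks = [5] * 8  # eight blocks of five instructions, 40 in total
conventional = fetch_cycles(blocks, width=8, cross_taken_branches=False)  # 8 cycles
trace_cache = fetch_cycles(blocks, width=8, cross_taken_branches=True)    # 5 cycles
```

With these assumed numbers the conventional front end averages 5 instructions per cycle on an 8-wide machine, while the idealized trace cache sustains 8. Real designs fall somewhere in between, since traces miss, overlap and get invalidated.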

> In other words, I haven't just masturbated over academic
> papers like you apparently do. I do know how they work.

It is obvious that you don't know how they work - you don't even understand why they were created in the first place!

If reading academic papers, verifying that they measure the correct things, and building one's understanding on that is masturbation (I interpret that as "fucking around without real results", as any other reading would be puerile), I wonder what we should call the act of not understanding a topic, incorrectly thinking one has experience in the area, and then, in an act of ego-stroking, loudly calling on the world to see this "expertise"?

Doing a Linus Torvalds perhaps?

> They suck.


I thought I had replied to this a long time ago but can't see it, so either it was a failed posting or it was aggressive enough to warrant moderation [unlikely].