By: Maynard Handley (name99.delete@this.name99.org), July 8, 2015 7:23 pm
Room: Moderated Discussions
Sylvain Collange (sylvain.collange.delete.delete@this.this.gmail.com) on July 8, 2015 10:32 am wrote:
> Maynard Handley (name99.delete@this.name99.org) on July 8, 2015 9:46 am wrote:
> > BTW, seeing Andre Seznec's name there, does any commercial
> > processor yet implement a PPM or TAGE-like predictor yet?
>
> I am not aware of any official statement about a commercial TAGE implementation.
>
> But comparing Haswell's performance counters with the output of a TAGE simulator, we observe
> comparable branch misprediction rates on average. (http://hal.inria.fr/hal-01100647/)
>
Inspired by this paper, I looked at the Geekbench 3 Lua single-core results (the Lua test is, I assume, basically an interpreter workload, and thus as good a proxy as one can hope for in figuring out this stuff). The results are very interesting.
A8 gets 1787/1.4 = 1276 (score / frequency in GHz)
Sandy Bridge gets 4269/3.4 = 1255
Haswell gets 4325/3.3 = 1310
Nehalem gets 2284/3.2 = 713
(64-bit for everything except the Nehalem result, where I could not find a 64-bit Windows result. For some strange reason there are also no Broadwell 64-bit results yet; out of interest the 32-bit Broadwell result is 2693/2.3 = 1171, which perhaps we can take as indicating a roughly 10% penalty for 32-bit mode, giving us some feel for what a 64-bit Nehalem result might be.)
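For anyone who wants to redo the arithmetic, here is a minimal Python sketch of the score-per-GHz normalization above. The scores and clocks are the ones quoted in this post; the names (`results`, `score_per_ghz`, `penalty_32bit`, `nehalem_64bit_estimate`) are just illustrative. The 32-bit penalty is estimated by comparing 32-bit Broadwell against 64-bit Haswell, which assumes the two cores are about equal per clock on this test. That is an assumption, not something measured here.

```python
# Geekbench 3 Lua single-core scores and clocks (GHz) as quoted in the post.
results = {
    "A8": (1787, 1.4),
    "Sandy Bridge": (4269, 3.4),
    "Haswell": (4325, 3.3),
    "Nehalem (32-bit)": (2284, 3.2),
    "Broadwell (32-bit)": (2693, 2.3),
}


def score_per_ghz(score, ghz):
    """Normalize a benchmark score by clock frequency in GHz."""
    return score / ghz


per_ghz = {name: score_per_ghz(s, f) for name, (s, f) in results.items()}

# Print from highest to lowest score per GHz.
for name, value in sorted(per_ghz.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {value:8.1f}")

# ASSUMPTION: Broadwell and Haswell are roughly equal per clock on this test,
# so the gap between them approximates the cost of running in 32-bit mode.
penalty_32bit = 1 - per_ghz["Broadwell (32-bit)"] / per_ghz["Haswell"]

# Rough guess at a 64-bit Nehalem figure under that same assumption.
nehalem_64bit_estimate = per_ghz["Nehalem (32-bit)"] / (1 - penalty_32bit)

print(f"estimated 32-bit penalty: {penalty_32bit:.1%}")
print(f"estimated 64-bit Nehalem score/GHz: {nehalem_64bit_estimate:.0f}")
```

Even with that adjustment the Nehalem figure stays well below Sandy Bridge, so the 32-bit caveat does not change the overall picture.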
It's merely a hint, not a proof, but it suggests that the intuition is correct (that is, it gives the expected big jump in interpreter performance from Nehalem to Sandy Bridge). It also suggests that whatever Apple is using for their branch predictor, it's pretty impressive. Perhaps not at the TAGE-ITTAGE level of Haswell (especially when we consider that they can make up for a few more branch mispredictions because they can run wider) but a very credible effort, at around the quality of the Sandy Bridge predictors; and presumably headed for something TAGE-like in the near future.
This also suggests (pace Linus' comment that for many purposes the only SPEC result worth paying attention to is the gcc score) that Lua should maybe play that same role for Geekbench?