By: Patrick Chase (patrickjchase.delete@this.gmail.com), July 9, 2015 12:44 pm
Room: Moderated Discussions
David Kanter (dkanter.delete@this.realworldtech.com) on July 9, 2015 7:20 am wrote:
> You realize that Apple's branch predictor has 2.5-3X longer to make predictions than
> Intel's, right? Since most table accesses tend to grow like log(size), that means their
> prediction tables can be vastly larger.
Ah yes, I'd forgotten about Cyclone's high mispredict penalty as measured in wall-clock time.
If Apple's LLVM commits are to be believed, then Cyclone's mispredict penalty is 16-19 clocks, or 11-13 nsec at its fastest shipping clock rate of 1.5 GHz.
Haswell's mispredict penalty is 15-20 clocks, or 4-5 nsec at its fastest shipping clock rate of 4 GHz.
In other words, Haswell has to produce predictions 2-3X as quickly.
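For anyone who wants to check the conversion, here is a quick sketch of the arithmetic. It only uses the clock counts and clock rates quoted above; the helper name `penalty_ns` is made up for illustration, and pairing the extremes of the two ranges is my own assumption about how to bound the ratio.

```python
# Rough check of the wall-clock arithmetic in this post.
# Clock counts and clock rates are the figures quoted above;
# the helper name penalty_ns is hypothetical.

def penalty_ns(clocks, ghz):
    """Convert a cycle count to nanoseconds at the given clock rate (GHz)."""
    return clocks / ghz

# (low, high) mispredict penalty in nanoseconds for each core.
cyclone_ns = (penalty_ns(16, 1.5), penalty_ns(19, 1.5))
haswell_ns = (penalty_ns(15, 4.0), penalty_ns(20, 4.0))

# Narrowest and widest ratios across the two ranges (an assumption:
# this pairs the extremes rather than low with low and high with high).
ratio_min = cyclone_ns[0] / haswell_ns[1]
ratio_max = cyclone_ns[1] / haswell_ns[0]

print("Cyclone: %.1f-%.1f ns" % cyclone_ns)
print("Haswell: %.1f-%.1f ns" % haswell_ns)
print("Ratio:   %.1fX-%.1fX" % (ratio_min, ratio_max))
```

Even taking the extremes of both ranges, the ratio stays in the neighborhood of the 2-3X figure.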