STIBP performance hit

By: Travis Downs, November 18, 2018 10:13 am
Room: Moderated Discussions
Looks like Intel hyperthreaded boxes that move to kernel 4.20 are about to take a big hit in performance (and here are some follow-up benchmarks).

This one is special since it's the first mitigation, that I'm aware of, that affects purely computational code that doesn't make any system calls or additional context switches, etc. The other mitigations largely all occurred at user-kernel boundaries. So you see a more or less across the board hit even on stuff that wasn't affected before.

As far as I know this mitigation only impacts indirect branch prediction, by "isolating" the predictors on sibling logical threads. A typical performance impact of 20%-40%, in code that isn't going to contain that many indirect branches, suggests to me that this mode simply disables the predictor, or at least cripples it to the point where it is effectively disabled, so you take a mispredict for most indirect branches. I don't see anything less aggressive than that having such a large impact.
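To put rough numbers on that argument, here is a back-of-envelope sketch. Everything in it is an assumption for illustration: a 20-cycle mispredict penalty, a baseline of 0.5 cycles per instruction, and every indirect branch mispredicting once the predictor is off. The function names are made up for this example.

```python
def slowdown(indirect_per_instr, penalty_cycles=20.0, base_cpi=0.5, mispredict_rate=1.0):
    """Fractional slowdown caused by extra indirect-branch mispredicts.

    All defaults are illustrative assumptions, not measured values.
    """
    extra_cpi = indirect_per_instr * mispredict_rate * penalty_cycles
    return extra_cpi / base_cpi


def density_for(target_slowdown, penalty_cycles=20.0, base_cpi=0.5, mispredict_rate=1.0):
    """Indirect branches per instruction needed to reach a given slowdown."""
    return target_slowdown * base_cpi / (penalty_cycles * mispredict_rate)


if __name__ == "__main__":
    for target in (0.20, 0.40):
        d = density_for(target)
        print(f"{target:.0%} slowdown needs about 1 indirect branch per {1 / d:.0f} instructions")
```

Under those assumed numbers, a 20%-40% hit only needs about one indirect branch per 100 to 200 instructions, provided nearly all of them mispredict. A predictor that still got most of them right would need a far higher branch density to produce the same slowdown.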

For example, if you just statically partitioned the IBTB between the two logical threads, the impact would be much less, probably in the low single-digit %s. It is not entirely surprising that you can't simply partition the resources using microcode and the only option is to effectively disable it. That's par for the course for most microcode fixes that affect low-level hardcoded stuff like this (see the disabling of the LSD [1] and TSX).
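A toy model of the partitioning point, under heavy assumptions: the target buffer is treated as a plain LRU cache, the 512-entry size is hypothetical, and the workload just walks a fixed set of indirect branch sites round-robin. The real predictor structure is not documented at this level, so this only illustrates the shape of the argument.

```python
from collections import OrderedDict


def miss_rate(capacity, sites, passes=50):
    """Miss rate of a toy LRU target buffer on a round-robin walk over `sites` branches."""
    btb = OrderedDict()
    misses = 0
    total = 0
    for _ in range(passes):
        for site in range(sites):
            total += 1
            if site in btb:
                btb.move_to_end(site)        # refresh LRU position on a hit
            else:
                misses += 1
                if capacity > 0:
                    btb[site] = True
                    if len(btb) > capacity:
                        btb.popitem(last=False)  # evict the least recently used entry
    return misses / total


if __name__ == "__main__":
    sites = 64  # assumed: a workload with few distinct indirect branch sites
    for label, capacity in (("shared", 512), ("partitioned", 256), ("disabled", 0)):
        print(f"{label:12s} capacity={capacity:4d} miss rate={miss_rate(capacity, sites):.2f}")
```

In this model, halving the capacity changes nothing as long as the branch sites still fit in one half, while a capacity of zero misses every time. That matches the intuition that a static split would be nearly free for code with few indirect branches, whereas disabling prediction hurts everything.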

Anyone have any insight on what STIBP actually does? I guess I can test it once I get the new kernel and firmware.
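Before benchmarking, it helps to confirm what the running kernel reports. A minimal sketch, assuming a Linux system that exposes the spectre_v2 status file under sysfs; the exact wording of that line varies between kernel versions, so the check below only looks for the string "STIBP" and does not try to interpret it. The helper names are made up for this example.

```python
from pathlib import Path

# sysfs file where Linux reports Spectre v2 mitigation status
SPECTRE_V2 = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")


def mentions_stibp(status_line):
    """True if the status text mentions STIBP at all (enabled or not)."""
    return "STIBP" in status_line.upper()


def read_status(path=SPECTRE_V2):
    """Return the status text, or None if the file is missing or unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    status = read_status()
    if status is None:
        print("no spectre_v2 status file found")
    else:
        print(status)
        print("STIBP mentioned:", mentions_stibp(status))
```

A mention is not proof that the mitigation is active for a given process, so read the full line the script prints rather than relying on the boolean alone.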

[1] This doesn't get discussed much, but disabling this feature has an ongoing drain of something like 10 to 100 MW worldwide, or the energy consumption of a small city. That will only increase as SKX servers start to take over from Broadwell and earlier in the cloud and on-premise.
