By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), July 7, 2015 9:23 am
Room: Moderated Discussions
Patrick Chase (patrickjchase.delete@this.gmai.com) on July 6, 2015 10:31 pm wrote:
>
> And yet you yourself have (very effectively, with real data) made the argument that cmov seldom pays on x86.
Absolutely. I think cmov is often a bad idea, because it leaves those data dependencies. And because it's often a bad idea, it's probably under-utilized in some cases (and also probably over-utilized in other cases).
And that's part of my point - I think it would be interesting if hardware turned it into a predicted move, exactly to remove the data dependencies when there is a strong reason to believe it's the right thing to do. That's exactly the kind of information the CPU branch predictor already has (well, most of them do - not just predicting which way a branch goes, but also how likely it is).
So hardware has the potential to offer the best of both worlds: keep the data dependency when it makes dynamic sense, and break it when it is likely the right thing to do.
That's the kind of choice you can make at a hardware level. Doing it at the software level is really really problematic, for all the reasons outlined.
See my argument?
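To make the data-dependency point concrete, here is a minimal sketch in C of the same lower-bound search written two ways. The function names are hypothetical, and whether a given compiler actually emits cmov for the ternary (or a branch for the if) is an assumption, not a guarantee. The point is only the shape of the dependency chain: in the select form every load address depends on the previous comparison, while in the branch form a correct prediction lets the CPU start the next load early.

```c
#include <stddef.h>

/* Branch form (hypothetical name): the comparison steers control flow.
 * When the branch predicts well, later loads can issue speculatively
 * without waiting for a[mid] to arrive. */
size_t lower_bound_branch(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Select form (hypothetical name): the ternary is the kind of code a
 * compiler may turn into cmov. No mispredicts, but each iteration's
 * load address depends on the previous iteration's compare result. */
size_t lower_bound_select(const int *a, size_t n, int key)
{
    const int *base = a;
    size_t len = n;
    while (len > 1) {
        size_t half = len / 2;
        base = (base[half - 1] < key) ? base + half : base;
        len -= half;
    }
    /* len is 0 (empty input) or 1 here; check the last candidate. */
    return (size_t)(base - a) + (len == 1 && base[0] < key);
}
```

Both return the index of the first element not less than key in a sorted array. Which one is faster depends on how predictable the comparisons are, which is exactly the information the software doesn't have and the hardware does.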
That said, I also have to say that
(a) I think cmov on x86 has improved. It used to have pretty bad latencies, but afaik they've gotten better. So you still do have the data dependencies, but for many cases it probably doesn't matter that much.
(b) there are clearly pretty big gotchas with using predictors too, and it may well be the case that it's not worth it. I wouldn't be surprised if this has been simulated, and real hw architects have come to the conclusion that the mispredicts just kill you.
(c) since people and compilers have been taught to try to avoid cmov for well-predicted stuff, and some of those judgments are probably quite correct, the existing use may well be skewed enough towards "unpredictable" that the upsides are even smaller.
So I'm certainly not claiming it's a no-brainer. I just think it would be interesting, and potentially something that hardware could do better (and it would allow software to maybe do better too, by making cmov more generically useful).
Linus