By: dmcq (dmcq.delete@this.fano.co.uk), July 7, 2015 2:38 pm
Room: Moderated Discussions
Linus Torvalds (torvalds.delete@this.linux-foundation.org) on July 6, 2015 3:59 pm wrote:
> Paul A. Clayton (paaronclayton.delete@this.gmail.com) on July 6, 2015 3:02 pm wrote:
> >
> > The compiler should choose a select instruction if the branch is unpredictable.
>
> That's the traditional rule, yes.
>
> I'm wondering if it will go away, though. I personally think it should.
>
> This is exactly the kind of thing that hardware is better at predicting dynamically than having
> compiler rules for. It's often really hard for a compiler to know whether something predicts well
> or not, because it can depend entirely on the load, and not so much on the code itself.
>
> And you could easily envision extending the branch prediction hardware to turn a strongly predicted
> conditional move into a true move in the front-end, and breaking the data dependency on the
> not-taken side (the same way predicted conditional branches break the dependencies). And for
> weakly predicted (or unpredicted) conditional moves, keep it as a conditional move.
>
> Of course, it can be pretty subtle. This is another area where both ARM and PowerPC screwed up their
> memory ordering model. Their ordering rules are actually different for conditional moves and for conditional
> branches. For both ARM and Power, a data dependency between two memory read operations implies an
> ordering (so you don't need to put a read barrier between loading a pointer and loading something
> off the pointer), but a control dependency does not (so you do need to put a memory barrier between
> loading a value and conditionally based on that value loading something else).
>
> So dynamically turning a conditional move into a predicated regular move can actually affect
> the memory ordering model and the hardware would have to be pretty careful about it.
>
> x86 doesn't have those insane memory ordering semantics. Loads are done in order (as far as software
> could tell - they do get re-ordered, but the semantics are guaranteed to be the same as if they were done
> in order), so it doesn't matter if the two accesses had a data or control dependency between them.
>
> Linus
I think ARM would have been better off without those rules at all, relying only on memory barrier instructions. I guess they were left in because of having to cope with systems like Linux which were written for x86 and don't properly describe to the hardware what is really required. It is a continuing overhead in the hardware that they will unfortunately have to support for a long time. If the proper memory barrier instructions were in the code it would be easier for the hardware to optimize things. Though I guess it'll be a while before they can stick in a control bit saying the only dependencies are those imposed by the explicit barriers. These overheads should be confined to where independent processors interact, which is in the operating system and restricted parts of user programs. They shouldn't be affecting everything else the majority of the time. This business about control dependencies and conditional moves is yet another example of why the operating system should just be fixed, rather than going on complaining and trying to get all other processors just as f%&$ked up as x86.
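
To make the data-versus-control dependency distinction above concrete, here is a minimal sketch in portable C++ atomics rather than ARM or Power assembly. All names (`Node`, `published`, `ready`, `payload`, `run_demo`) are made up for illustration. One reader follows a pointer, so the second load's address depends on the first. The other reader tests a flag and then loads an unrelated location, which is only a control dependency, so it needs an explicit acquire fence. Note that `memory_order_consume` is the C++ spelling of "ordered by data dependency", but as far as I know current compilers just treat it as acquire, and it is deprecated in C++26.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Node { int value; };

std::atomic<Node*> published{nullptr};  // hypothetical: pointer handed to readers
std::atomic<int> ready{0};              // hypothetical: flag handed to readers
int payload = 0;                        // plain data guarded by `ready`

void writer(Node* n) {
    n->value = 42;
    payload = 42;
    // Release stores: the plain writes above become visible before these.
    published.store(n, std::memory_order_release);
    ready.store(1, std::memory_order_release);
}

// Data dependency: the address of the second load comes from the first load.
// This is the case ARM and Power order without a read barrier.
int read_via_data_dependency() {
    Node* p = nullptr;
    while ((p = published.load(std::memory_order_consume)) == nullptr) {
        std::this_thread::yield();
    }
    return p->value;
}

// Control dependency: the second load is only conditional on the first.
// On its own that does not order the two loads, so an explicit barrier
// (here an acquire fence) goes between them.
int read_via_control_dependency() {
    while (ready.load(std::memory_order_relaxed) == 0) {
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;
}

struct Results { int via_data; int via_control; };

Results run_demo() {
    Node node{0};
    published.store(nullptr);
    ready.store(0);
    payload = 0;

    Results r{0, 0};
    std::thread t1([&] { r.via_data = read_via_data_dependency(); });
    std::thread t2([&] { r.via_control = read_via_control_dependency(); });
    std::thread t3(writer, &node);
    t1.join();
    t2.join();
    t3.join();

    // Both readers must observe the writer's data once they see the handoff.
    assert(r.via_data == r.via_control);
    return r;
}
```

Calling `run_demo()` from a `main` should give both readers the writer's value. On x86 the fence costs nothing at run time since loads already appear in order, which is the point Linus makes. Dropping the fence in the second reader is where ARM or Power would be allowed to return a stale `payload`.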