By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), April 7, 2021 10:35 am
Room: Moderated Discussions
sr (nobody.delete@this.nowhere.com) on April 7, 2021 9:42 am wrote:
>
> But there it is. Locking is nearly impossible to get right.
>
> Transactional memory isn't all about performance. It's locking done by hardware.
No.
Transactional memory doesn't make locking easier. Not at all.
Transactional memory makes locking much harder. It adds a lot of new cases, it adds all that fallback logic, and it adds a huge number of debugging problems.
Really. HTM is not the solution to "make locking easier".
The only thing HTM does that is at least not "more complex" is to take existing locking (which you still have to get right, and HTM did absolutely nothing to make that part easier), and then do automatic lock elision for it in order to make it possibly perform better.
Seriously.
You have been fooled by the promise of "it can make simple locking perform so well that you don't need to do anything else", and then the HTM proponents have used that as an argument that it makes "locking simpler". But that argument was fundamentally dishonest to begin with, and it depended purely on the underlying performance argument (and that performance argument hasn't ever been shown to be true in the first place).
See how that "it simplifies locking" argument is truly doubly false? It wasn't really true to begin with, and the thing it depended on hasn't been shown to be true either.
Honestly, good HTM is likely to work better with complex locking, because that's the case that is less likely to have those nasty conflicts that HTM has stumbled on so badly. But that's also the case where HTM has much less low-hanging fruit, and much less obvious wins. So that's the case where HTM needs to work really really well just to break even.
Again, I don't think anybody has ever shown that case to really work all that well.
Because absolutely nothing in HTM is about making locking itself simpler. Everything in HTM is about making things much more complex, but you can then try to hide that extra complexity by using HTM purely for simple lock elision situations, and praying that HTM then fixes the scalability issues that such simple locking will tend to result in.
So you are simply fundamentally wrong about transactional memory (and Paul Clayton was fundamentally wrong about it being like GC and "simplifying" anything for software). It really really really doesn't simplify anything at all for software, quite the reverse.
You have two options:
(a) use HTM with the transactional part exposed, and expose all the complexity of transactions, and handle the transaction failure cases explicitly, and aim to really use HTM fully for performance reasons.
(b) use HTM just for lock elision, without exposing new complexity at all, and hope it improves performance.
Those are your two major options. And notice how neither of them simplifies anything at all, and (a) in fact makes for a lot more complexity for SW (although that complexity may or may not be hidden in compilers and libraries and most people might not see it due to that).
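To make the extra fallback logic in option (a) concrete, here is a minimal sketch in C of the retry-then-lock shape that explicitly transactional code tends to take. It is an illustration under stated assumptions, not anything from the post: `tx_begin_stub`, `tx_end_stub`, `forced_aborts`, `MAX_TX_RETRIES` and `counter_increment` are hypothetical names, and the "transaction" is a stub that provides no isolation at all, so this only exercises the control flow on a single thread. On real x86 hardware the stubs would presumably be replaced by the RTM intrinsics (`_xbegin`, `_xend`, `_xabort`).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_TX_RETRIES 3   /* assumption: arbitrary retry budget */

/* Hypothetical stand-in for a hardware "begin transaction" primitive.
 * It reports an abort for the next `forced_aborts` attempts, so the
 * retry and fallback paths can be exercised on any machine.
 * It does NOT provide atomicity or isolation. */
static int forced_aborts;

static bool tx_begin_stub(void)
{
    if (forced_aborts > 0) {
        forced_aborts--;
        return false;       /* simulated abort */
    }
    return true;            /* simulated successful start */
}

static void tx_end_stub(void)
{
    /* a real commit would happen here */
}

/* The ordinary lock that still has to exist and still has to be correct. */
static atomic_bool fallback_lock = false;   /* false = free */

static void lock_acquire(void)
{
    bool expected = false;
    while (!atomic_compare_exchange_weak(&fallback_lock, &expected, true))
        expected = false;   /* CAS overwrote it on failure; reset and spin */
}

static void lock_release(void)
{
    assert(atomic_load(&fallback_lock));    /* must be held by us */
    atomic_store(&fallback_lock, false);
}

static long counter;        /* the shared data being protected */
static int tx_commits;      /* how often the transactional path won */
static int lock_fallbacks;  /* how often we gave up and took the lock */

void counter_increment(void)
{
    for (int attempt = 0; attempt < MAX_TX_RETRIES; attempt++) {
        if (!tx_begin_stub())
            continue;       /* aborted: try again */

        /* The transaction must observe the lock word, so that a thread
         * on the fallback path conflicts with it. If the lock is held,
         * real code would abort explicitly; the stub just retries. */
        if (atomic_load(&fallback_lock))
            continue;

        counter++;          /* the actual critical section */
        tx_end_stub();
        tx_commits++;
        return;
    }

    /* Retry budget exhausted: the same critical section, written again,
     * this time under the real lock. */
    lock_acquire();
    counter++;
    lock_release();
    lock_fallbacks++;
}
```

Even in this toy form there are two copies of the critical section, a retry policy to tune, and a lock that the transactional path has to check, all in addition to the locking that had to be right in the first place.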
And both cases are really purely about some theoretical performance advantage that hasn't ever really been shown to pan out except in very special cases.
And it's worth noting that even (b), which supposedly doesn't add any actual complexity to software outside of the low-level locking libraries themselves, is the one that failed absolutely miserably, because of actual hardware bugs. Those hardware bugs caused a huge amount of pain and complexity to debug.
So even that promise has failed absolutely spectacularly.
Stop arguing from ignorance, sr.
Linus