By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), April 7, 2021 5:12 pm
Room: Moderated Discussions
Andrey (andrey.semashev.delete@this.gmail.com) on April 7, 2021 2:32 pm wrote:
>
> LL/SC is a very restricted version of HTM. You do have to retry in software if SC fails, although you
> normally don't implement a fallback path hoping that it will eventually succeed. Note that forward progress
> is not guaranteed with LL/SC. That is unlike regular atomics, which makes them superior.
All quality implementations of LL/SC do guarantee forward progress.
That guarantee is conditional on the LL/SC region being sufficiently small (the same way I'd suggest a transaction size be limited), but that's obviously not an issue for the simple unconditional ALU operation cases that the atomics support.
At least alpha, arm and RISC-V have that guarantee architecturally. I'm pretty sure powerpc does too, but don't have the papers in front of me.
And the superiority of atomics is not about the forward guarantee - which ll/sc would have for that simple sequence anyway - but simply an efficiency issue. Atomics do have several advantages, not the least of which is that they are just simpler, and don't need all that pointless ll/sc complexity with conditionals and the whole hw/sw interaction for success.
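To make the "complexity with conditionals" point concrete, here is a minimal C11 sketch, not anything from a real code base. C has no portable LL/SC primitive, so a weak compare-and-swap loop stands in for it (on LL/SC machines compilers commonly lower such a loop to a load-linked/store-conditional pair). The function names `inc_retry_loop` and `inc_atomic` are made up for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

/* LL/SC-shaped increment (assumption: weak CAS used as a stand-in).
 * Load the value, compute the new one, then store conditionally;
 * software has to test for success and branch back on failure. */
static uint64_t inc_retry_loop(_Atomic uint64_t *p)
{
    uint64_t old = atomic_load_explicit(p, memory_order_relaxed);

    /* On failure, 'old' is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak_explicit(
               p, &old, old + 1,
               memory_order_relaxed, memory_order_relaxed))
        ;
    return old;
}

/* Single atomic read-modify-write: no retry loop or success check
 * visible to software. */
static uint64_t inc_atomic(_Atomic uint64_t *p)
{
    return atomic_fetch_add_explicit(p, 1, memory_order_relaxed);
}
```

Both functions return the previous value and leave the counter one higher. The difference is only in how much of the retry machinery is exposed to the instruction stream.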
Atomics can also much more naturally support more relaxed memory orderings, including actual remote accesses (although I'm not convinced anybody ever implemented that). The point being that they actually are better when their limitations work (i.e. for statistics, where you just want to increment a word and you really don't care about what the value was before/after).
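A rough C11 sketch of that statistics case, under the assumption of a single global counter. The names `packets_seen`, `count_packet` and `read_packets_seen` are hypothetical. The increment uses relaxed ordering and throws away the old value, which is exactly the situation where nothing about the result needs to come back to the CPU.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical statistics counter; zero-initialized as a static object. */
static _Atomic uint64_t packets_seen;

static void count_packet(void)
{
    /* Relaxed ordering, result deliberately ignored: we only care that
     * the increment eventually lands, not what the value was before. */
    (void)atomic_fetch_add_explicit(&packets_seen, 1, memory_order_relaxed);
}

static uint64_t read_packets_seen(void)
{
    /* A relaxed load is enough for a statistics snapshot. */
    return atomic_load_explicit(&packets_seen, memory_order_relaxed);
}
```

Whether hardware actually performs such an increment remotely is implementation-specific; the code only shows that the software side imposes no ordering and needs no return value.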
Linus