By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), September 8, 2012 12:58 pm
Room: Moderated Discussions
[ was off traveling, sorry for late answer ]
Paul A. Clayton (paaronclayton.delete@this.gmail.com) on August 28, 2012 10:28 am wrote:
>
> How can a compiler recognize a locked critical
> section and not be able to recognize at least most of the possible uses of
> ll/sc? Recognizing a sequence that only loads one value from memory, performs
> some computation, then replaces the value with a new value (and that being the
> only store operation and--at least for most ISAs--the only memory operations
> allowed are the ll/sc [IIRC, Alpha also failed sc on a taken branch.])
>
> ll/sc
> is such a limited form of transactional memory that recognizing most possible
> uses would seem not to be too difficult.
Umm. It's not just the "load-op-store" sequence under a lock.
It's the lock itself!
Doing a spinlock around a load-op-store sequence is *not* something that is equivalent to doing the load-op-store as a ll/sc sequence. Yes, both are "atomic". No, they are not the same thing at all despite that.
In particular, another much longer sequence somewhere else may run under the spinlock, and depend on the fact that nothing the lock protects is changing - over long periods. It may do multiple loads of the value, and the lock guarantees that the value has to stay the same.
So no, a compiler can absolutely not change a locked atomic load-op-store sequence into a ll/sc sequence. The two have absolutely zero in common - they are both "atomic", but they are atomic in totally different ways.
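To make the argument above concrete, here is a minimal single-threaded sketch in C11 (not from the original post). It assumes a toy spinlock built on `atomic_flag` and uses a compare-and-swap loop as a stand-in for an ll/sc sequence. All names (`writer_locked`, `writer_cas`, `reader_sees_stable_value`) are hypothetical, and the "other CPU" is simulated by a callback invoked in the middle of the long critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;  /* toy spinlock (assumption) */
static atomic_int value;                     /* the data the lock protects */

static bool try_lock(void) { return !atomic_flag_test_and_set(&lock); }
static void unlock(void)   { atomic_flag_clear(&lock); }

/* Short load-op-store that respects the lock. Returns false when the lock
 * is busy; a real spinlock would spin here until the holder releases it. */
bool writer_locked(void) {
    if (!try_lock())
        return false;
    atomic_store(&value, atomic_load(&value) + 1);
    unlock();
    return true;
}

/* The same load-op-store "optimized" into a CAS loop (standing in for
 * ll/sc). It is atomic, but it never looks at the lock at all. */
bool writer_cas(void) {
    int old = atomic_load(&value);
    while (!atomic_compare_exchange_weak(&value, &old, old + 1))
        ;  /* on failure 'old' is refreshed, so just retry */
    return true;
}

/* Long critical section: loads the value twice and relies on the lock to
 * keep it unchanged. 'other_cpu' simulates a concurrent writer. */
bool reader_sees_stable_value(bool (*other_cpu)(void)) {
    while (!try_lock())
        ;
    int first = atomic_load(&value);
    other_cpu();
    int second = atomic_load(&value);
    unlock();
    return first == second;
}

/* Usage: the same reader is safe with one writer and broken by the other. */
void demo(void) {
    assert(reader_sees_stable_value(writer_locked));  /* writer is held off */
    assert(!reader_sees_stable_value(writer_cas));    /* value changed under the lock */
}
```

Both writers are "atomic" with respect to each other, yet only the locked one preserves what the long reader depends on.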
So ll/sc is pretty much useless as anything other than a replacement for the CISC kind of "atomic increment/cmpxchg/whatever" instruction. It has absolutely nothing to do with transactional memory that can elide locks.
Don't confuse the two. They really have nothing in common, and "atomic" really means many different things.
Having a compiler recognize a lock, and turning it into a lock-elision sequence (that has the *semantics* of a lock) is trivial. In fact, the compiler doesn't even need to do it, you can just do the lock-eliding part as a function call or inline asm. Exactly because unlike ll/sc, it actually has the right semantics.
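As a rough illustration of "the lock-eliding part as a function call", the sketch below (not from the original post) wraps a lock in two ordinary functions. It assumes Intel's RTM intrinsics (`_xbegin`, `_xend`, `_xabort`, `_xtest` from `<immintrin.h>`) when the compiler defines `__RTM__` (for example with `-mrtm`), and otherwise simply takes the lock. The type `elided_lock` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#ifdef __RTM__
#include <immintrin.h>
#endif

typedef struct { atomic_int locked; } elided_lock;  /* hypothetical type */

void elided_lock_acquire(elided_lock *l) {
#ifdef __RTM__
    if (_xbegin() == _XBEGIN_STARTED) {
        /* Reading the lock word adds it to the transaction's read set, so
         * anyone who really takes the lock later aborts this transaction. */
        if (atomic_load_explicit(&l->locked, memory_order_relaxed) == 0)
            return;        /* run the critical section transactionally */
        _xabort(0xff);     /* lock is held: roll back to _xbegin() */
    }
#endif
    /* Fallback path: take the lock for real. */
    int expected = 0;
    while (!atomic_compare_exchange_weak(&l->locked, &expected, 1))
        expected = 0;
}

void elided_lock_release(elided_lock *l) {
#ifdef __RTM__
    if (_xtest()) {        /* still inside the transaction: commit it */
        _xend();
        return;
    }
#endif
    atomic_store(&l->locked, 0);
}

static elided_lock counter_lock;
static long shared_count;

/* Usage: callers write an ordinary critical section either way. */
long add_under_lock(long n) {
    elided_lock_acquire(&counter_lock);
    shared_count += n;
    long snapshot = shared_count;
    elided_lock_release(&counter_lock);
    return snapshot;
}
```

The caller's code is the same whether the lock was elided or actually taken, which is the sense in which the elided version keeps the semantics of a lock.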
Linus
Topic | Posted By | Date |
---|---|---|
Article: Haswell TM Alternatives | David Kanter | 2012/08/21 09:17 PM |
Article: Haswell TM Alternatives | Håkan Winbom | 2012/08/21 11:52 PM |
Article: Haswell TM Alternatives | David Kanter | 2012/08/22 01:06 AM |
Article: Haswell TM Alternatives | anon | 2012/08/22 08:46 AM |
Article: Haswell TM Alternatives | Linus Torvalds | 2012/08/22 09:16 AM |
Article: Haswell TM Alternatives | Doug S | 2012/08/24 08:34 AM |
AMD's ASF even more limited | Paul A. Clayton | 2012/08/22 09:20 AM |
AMD's ASF even more limited | Linus Torvalds | 2012/08/22 09:41 AM |
Compiler use of ll/sc? | Paul A. Clayton | 2012/08/28 09:28 AM |
Compiler use of ll/sc? | Linus Torvalds | 2012/09/08 12:58 PM |
Lock recognition? | Paul A. Clayton | 2012/09/10 01:17 PM |
Sorry, I was confused | Paul A. Clayton | 2012/09/13 10:56 AM |
Filter to detect store conflicts | Paul A. Clayton | 2012/08/22 09:19 AM |
Article: Haswell TM Alternatives | bakaneko | 2012/08/22 02:02 PM |
Article: Haswell TM Alternatives | David Kanter | 2012/08/22 02:45 PM |
Article: Haswell TM Alternatives | bakaneko | 2012/08/22 09:56 PM |
Cache line granularity? | Paul A. Clayton | 2012/08/28 09:28 AM |
Cache line granularity? | David Kanter | 2012/08/31 08:13 AM |
A looser definition might have advantages | Paul A. Clayton | 2012/09/01 06:29 AM |
Cache line granularity? | rwessel | 2012/08/31 07:54 PM |
Alpha load locked granularity | Paul A. Clayton | 2012/09/01 06:29 AM |
Alpha load locked granularity | anon | 2012/09/02 05:23 PM |
Alpha pages groups | Paul A. Clayton | 2012/09/03 04:16 AM |
An alternative implementation | Maynard Handley | 2012/11/20 09:52 PM |
An alternative implementation | bakaneko | 2012/11/21 05:52 AM |
Guarding unread values? | Paul A. Clayton | 2012/11/21 08:39 AM |
Guarding unread values? | bakaneko | 2012/11/21 11:25 AM |
TM granularity and versioning | Paul A. Clayton | 2012/11/21 08:27 AM |
TM granularity and versioning | Maynard Handley | 2012/11/21 10:52 AM |
Indeed, TM (and coherence) has devilish details (NT) | Paul A. Clayton | 2012/11/21 10:56 AM |