By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), August 22, 2012 9:16 am
Room: Moderated Discussions
anon (anon.delete@this.anon.com) on August 22, 2012 9:46 am wrote:
>
> If only using the MOB/ROB, then the size of the
> TM working sets will be rather limited, basically a slightly bigger version of
> the LL/SC instruction sequence in RISCs. Wouldn't be that useful.
Oh, it can be useful. Much more so than LL/SC, because it would work with arbitrary compiled code, it wouldn't be limited to a single reservation (which is what LL/SC tends to do), and it wouldn't be tied to a "smallest common denominator" transaction size the way LL/SC is.
So personally, I'd like to see the "small transactions only" model.
However, in order for that model to work well, you have to have really cheap failure modes, because you potentially fail much more often.
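A rough way to see why the failure cost dominates is a back-of-the-envelope expected-cost model. This is only a sketch: the function name, its parameters, and the cycle counts are all made up for illustration, and the only assumption is that a failed transaction pays the abort cost and then the fallback (locked) path.

```python
def expected_cost(p_success, commit_cost, abort_cost, fallback_cost):
    """Average cost of one attempt, in arbitrary cycle units (hypothetical model).

    A successful transaction pays commit_cost; a failed one pays abort_cost
    and then takes the fallback path (e.g. the real lock).
    """
    return p_success * commit_cost + (1.0 - p_success) * (abort_cost + fallback_cost)


# Illustrative numbers only: same 50% success rate, different abort costs.
expensive_abort = expected_cost(0.5, commit_cost=10, abort_cost=100, fallback_cost=50)
cheap_abort = expected_cost(0.5, commit_cost=10, abort_cost=5, fallback_cost=50)
print(expensive_abort, cheap_abort)  # 80.0 32.5
```

With a low success rate the abort term is most of the total, which is the case small, frequently failing transactions would be in.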
This is one of the reasons why I was arguing so strongly at one point (both here and to some Intel engineers) for the transaction model to use "success prediction" (possibly actually re-using the branch prediction hardware, but possibly with a separate predictor). Because part of "really cheap failure modes" is to not even bother trying if it's unlikely to work.
(Part of "really cheap failure modes" is also to not have to do prediction in software. All the hardware engineers seem to think that you can do prediction in software, but that's a total and utter disaster for so many reasons that it's not even funny).
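To make "success prediction" concrete, here is a minimal software model of what a branch-predictor-style success predictor might look like. It is a sketch under assumptions, not a description of any real hardware: the class name, the table size, the indexing by begin-transaction address, and the 2-bit saturating counter are all hypothetical choices borrowed from classic branch prediction.

```python
class SuccessPredictor:
    """Hypothetical table of 2-bit saturating counters, one per transaction site."""

    def __init__(self, table_size=64):
        self.table_size = table_size
        # Counters range 0..3; start at 2 ("weakly try") so new sites get a chance.
        self.counters = [2] * table_size

    def _index(self, begin_addr):
        # Assumption: index by the address of the begin-transaction instruction.
        return begin_addr % self.table_size

    def should_attempt(self, begin_addr):
        """Predict success (and so attempt the transaction) when the counter is 2 or 3."""
        return self.counters[self._index(begin_addr)] >= 2

    def update(self, begin_addr, committed):
        """Train on the outcome of an attempted transaction."""
        i = self._index(begin_addr)
        if committed:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


predictor = SuccessPredictor()
site = 0x4000                               # made-up address of a begin-transaction
print(predictor.should_attempt(site))       # True: fresh sites start at "weakly try"
predictor.update(site, committed=False)     # one abort drops the counter to 1
print(predictor.should_attempt(site))       # False: don't even bother trying
```

One gap in the sketch: a site that is predicted to fail is never attempted, so it never trains back up. A real design would need some way to retry occasionally.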
And in many ways, small transactions are much easier to predict. Some of them you can predict by just looking at the instruction stream, without any dynamic prediction table at all. The simplest prediction model would involve just the CPU asking itself "is the transaction end instruction in my instruction queue" when it starts the begin-transaction instruction, and saying "No, screw that" if not.
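The "is the transaction end in my instruction queue" check can be modeled in a few lines. This is a toy sketch: the opcode names `TX_BEGIN` and `TX_END`, the list-of-strings instruction stream, and the `window_size` parameter are all hypothetical, and it ignores branches inside the window.

```python
def end_in_window(instructions, begin_index, window_size):
    """Return True if a TX_END is among the window_size instructions after TX_BEGIN.

    instructions: list of opcode names (hypothetical encoding).
    begin_index:  position of the TX_BEGIN being considered.
    window_size:  how many following instructions the queue is assumed to hold.
    """
    window = instructions[begin_index + 1 : begin_index + 1 + window_size]
    return "TX_END" in window


stream = ["TX_BEGIN", "LOAD", "STORE", "TX_END", "ADD"]
print(end_in_window(stream, 0, 4))  # True: the end is visible, go ahead and try
print(end_in_window(stream, 0, 2))  # False: "No, screw that"
```

No prediction table is involved here: the decision depends only on what is already visible in the instruction stream at the point the transaction begins.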
And nice small transactions are where transactions really help the most. Lock ping-pong for some big expensive operation is not cheap, but ping-ponging some hot lock just because you do a list addition to a hot list - that's really expensive relative to the work being done.
Linus