By: David Kanter (dkanter.delete@this.realworldtech.com), September 24, 2010 12:05 am
Room: Moderated Discussions
Linus Torvalds (torvalds@linux-foundation.org) on 9/23/10 wrote:
---------------------------
>Paul A. Clayton (paaronclayton@gmail.com) on 9/23/10 wrote:
>>
>>I suspect in practice is rather difficult to evaluate at
>>this point. As far as I know, the only hardware TM
>>systems have been the Sun-internal (never sold--did anyone
>>outside of Sun even get to play with early prototypes?)
>>Rock and Azul Systems implementation--which has entry-price
>>and non-open development issues.
>
>Heh. And the x86 emulation processors from Transmeta, that
>I worked on for several years.
IIRC the "TM" in TMTA chips was the gated store buffer, which was limited to 32 stores. Is that what you are referring to?
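To make the question concrete, here is a rough sketch of what I mean by a gated store buffer: speculative stores are held back until commit, and running out of entries forces a rollback. The 32-entry limit is just my recollection from above; the class and method names are made up for illustration and are not Transmeta's actual design.

```python
class GatedStoreBuffer:
    """Toy model: stores stay behind the 'gate' until commit (hypothetical names)."""

    def __init__(self, memory, capacity=32):
        self.memory = memory        # dict mapping address -> value
        self.capacity = capacity    # 32 is the recalled limit, not a verified figure
        self.pending = []           # buffered (address, value) pairs, oldest first

    def store(self, addr, value):
        """Buffer a speculative store. Returns False (after rolling back) on overflow."""
        if len(self.pending) >= self.capacity:
            self.rollback()
            return False
        self.pending.append((addr, value))
        return True

    def load(self, addr):
        """Loads must see the newest buffered store to the same address."""
        for a, v in reversed(self.pending):
            if a == addr:
                return v
        return self.memory.get(addr, 0)

    def commit(self):
        """Open the gate: release all buffered stores to memory in order."""
        for a, v in self.pending:
            self.memory[a] = v
        self.pending.clear()

    def rollback(self):
        """Discard every buffered store; memory is left untouched."""
        self.pending.clear()
```

With something like this, a translation that issues a 33rd store before committing simply fails and has to be redone in smaller pieces.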
>For JITted code, transactions work fine. Why? Because if
>you overflow the transaction size (or other limits: there
>are always things like physical IO that cannot be done
>speculatively), you just retranslate.
>
>For static code, you need static recovery. And that is
>quite expensive. It's been tried, btw. Look at the IA64
>static alias analysis hardware. Exact same deal. Suddenly
>you cannot afford to take any risks, because the cost of
>a mispredict is too high.
>
>It was a damn stupid idea in alias analysis, it's an even
>worse idea in transactional memory. All worthwhile loads
>are simply too dynamic to be handled well by static choices
>made at compile time.
>
>So yes, I do have personal experience with transactional
>memory, and know something about the issues. It's good for
>some things, but it's really bad for others. And I can
>pretty much guarantee that you need hardware support
>for the "dynamic" aspects in order to make it palatable in
>a general-purpose CPU.
>
>For Azul, I bet that they depend on the JIT nature of the
>code they run, and avoid the issues that way. Again, it's
>a very specialized use where transactional memory works
>fine. Also, the Java world has all those specialized atomics
>with thread-safe hash tables etc - and you can make those
>kinds of trivial "extended atomics" using transactional
>memory. It's just a few memory operations, after all, you
>end up using your fancy transactional memory just as an
>extended "load locked + store conditional".
I think Azul had something much simpler than TM. I think they are now porting to Nehalem-EX, so they clearly don't need TM for some of their stuff.
>But in a general-purpose setup where you actually want to
>give good support to software and make the transactions
>easy to use in a generic way (rather than as a few
>library functions to do some trivial atomic sequence), you
>need more.
>
>I'm personally convinced that part of the "more" that you
>need is good hardware support for doing a transaction
>failure predictor. Basically identical to branch prediction,
>but predicting whether a transaction will fail, and not
>even doing the speculative arm. Exactly so that you don't
>get the horrible downside of static code that ends up
>always failing and sucking horribly for that case (again,
>think IA64 aliases, but think about how the failure case
>happens after hundreds of instructions when the transaction
>size ends up being too big - so the downside is huge)
Yes, that's quite likely, although it depends on TX size. The larger your TX, the more work is thrown away on a failure, so the more important a predictor will be.
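As a minimal sketch of the idea, assuming a table of 2-bit saturating counters indexed by the address where the transaction starts (the same scheme as a simple branch predictor): all names here are hypothetical, and real hardware would obviously not look like Python.

```python
class TxFailurePredictor:
    """2-bit saturating counters indexed by transaction start address (assumed scheme)."""

    def __init__(self, entries=256):
        self.entries = entries
        # Start at 1 ("weakly succeed") so unseen transactions are attempted.
        self.counters = [1] * entries

    def _index(self, tx_pc):
        return tx_pc % self.entries

    def predict_fail(self, tx_pc):
        """True means: do not even start the speculative arm."""
        return self.counters[self._index(tx_pc)] >= 2

    def update(self, tx_pc, failed):
        """Move the counter toward 3 on failure and toward 0 on success."""
        i = self._index(tx_pc)
        if failed:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def run_transaction(predictor, tx_pc, speculative, fallback):
    """Run `speculative` (returns (ok, result)) unless it is predicted to fail.

    On a predicted or actual failure, run `fallback` (e.g. a lock-based path).
    """
    if not predictor.predict_fail(tx_pc):
        ok, result = speculative()
        predictor.update(tx_pc, failed=not ok)
        if ok:
            return result
    return fallback()
```

Note that this sketch never retrains once it predicts failure, because it stops attempting the transaction; a real design would need some way to retry occasionally.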
David
Topic | Posted By | Date |
---|---|---|
T3 announced | Max | 2010/09/21 04:42 AM |
T3 announced | someone | 2010/09/21 05:53 AM |
T3 announced | anon | 2010/09/21 06:05 AM |
T3 announced | lurker | 2010/09/21 07:11 AM |
T3 announced | Jesper Frimann | 2010/09/21 07:21 AM |
T3 announced | Phil | 2010/09/22 12:59 AM |
T3 announced | Michael S | 2010/09/22 06:16 AM |
T3 announced | Linus Torvalds | 2010/09/21 07:15 AM |
T3 announced | anon | 2010/09/21 09:31 AM |
Transactional memory | Paul A. Clayton | 2010/09/21 10:52 AM |
Transactional memory | Linus Torvalds | 2010/09/21 12:21 PM |
Transactional memory | Paul A. Clayton | 2010/09/23 07:30 AM |
Transactional memory | Linus Torvalds | 2010/09/23 08:01 AM |
Transactional memory | David Kanter | 2010/09/24 12:05 AM |
Transactional memory | Linus Torvalds | 2010/09/24 07:59 AM |
Transactional memory | David Kanter | 2010/09/25 09:27 AM |
'dynamic fallback'? | Paul A. Clayton | 2010/09/25 11:28 AM |
'dynamic fallback'? | Linus Torvalds | 2010/09/25 01:23 PM |
'dynamic fallback'? | blaine | 2010/09/25 02:16 PM |
Cliff Click Jr. on Azul's HTM | Paul A. Clayton | 2010/09/24 02:19 PM |
Transactional memory | Foo_ | 2010/09/24 03:08 AM |
T3 announced | blaine | 2010/09/21 11:43 AM |
no news from Fujitsu | Max | 2010/09/21 10:37 PM |