Article: Haswell TM Alternatives

Article: Haswell Transactional Memory Alternatives
By: David Kanter, August 22, 2012 3:45 pm
Room: Moderated Discussions
bakaneko (nyan.delete@this.hyan.wan) on August 22, 2012 3:02 pm wrote:
> David Kanter on August 21, 2012 10:17 pm wrote:
> > We previously theorized that Intel’s TSX extensions in Haswell use the caches to provide transactional memory semantics. This article describes an alternative approach based on minimal changes to the CPU core (specifically in the ROB and MOB), contrasts the advantages of the two techniques and discusses the expected implementation in Haswell.
> >
> > I also muse a bit about when these two techniques (cache-based and MOB-based TM) will get implemented on the roadmap and how they can work together in a very complementary fashion.
> >
> > As always comments and discussion welcome.
> >
> > David
> Mhm, I don't get what is so important here. The question where to keep the old and new values (L2, L1D, MOB, other buffers) comes from the microarchitecture. So the old values into L2 of the core with the transaction and the new values into the L1D/MOB/local buffer, depending on the amount of expected data beyond what can be kept back in the MOB.

It's a microarchitectural detail, but it has significant implications. A TM that tracks transactional data in the L1 is heavily limited by the cache's associativity, whereas the MOB is more or less fully associative. Programmers really don't want to worry about associativity, and most of them implicitly assume that any sort of cache is fully associative.

> I don't understand how the MOB would have to do more in such a scenario, and I don't see how important pushing everything into the MOB is in the first place. That's at least my naive technical opinion forgetting all the little details.

I think you're talking about a hybrid TM that uses both caches and MOB. I was primarily discussing how a MOB-only system would work.

> But there are other problems: I can't measure transactions.

You can measure TX failure using the fallback path.
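As a rough illustration of that idea, the sketch below counts how often execution ends up in the fallback path. It is a software model only: Python cannot issue hardware transactions, so `try_tx` is a hypothetical stand-in for the transactional attempt (returning True on commit and False on abort), and `TxStats` and `run_transaction` are names invented for this example.

```python
import threading


class TxStats:
    """Software-side counters; hypothetical helper for this sketch."""

    def __init__(self):
        self.attempts = 0
        self.fallbacks = 0

    def failure_rate(self):
        # Fraction of attempts that ended up in the fallback path.
        return self.fallbacks / self.attempts if self.attempts else 0.0


def run_transaction(try_tx, fallback, lock, stats):
    """Attempt the transactional path once; on abort, count it and
    redo the work under a conventional lock.

    try_tx   -- stand-in for the hardware transaction (assumption)
    fallback -- the non-transactional version of the same work
    """
    stats.attempts += 1
    if try_tx():
        return True          # committed, nothing to record
    stats.fallbacks += 1     # the only place an abort becomes visible
    with lock:
        fallback()
    return False
```

The counters here are not themselves thread-safe, so a real measurement would more likely keep one `TxStats` per thread and sum them afterwards.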

> How will changes in the microarchitecture change the behaviour of programs which use them outside the transaction size?

Suppose you had a TX that required simultaneously accessing 5 variables that map to the same set. In a scheme that just used a 4-way L1, it would always fail. OTOH, the same TX would be able to succeed on a MOB-based implementation.
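The five-variable case can be checked with a small model. This is only a sketch under assumed parameters: 64-byte lines, 64 sets and 4 ways for the cache, and a 64-entry fully associative buffer standing in for the MOB. None of these numbers are claims about Haswell, and the function names are made up for the example.

```python
from collections import Counter


def set_index(addr, line_bytes=64, num_sets=64):
    """Cache set that a byte address maps to (assumed geometry)."""
    return (addr // line_bytes) % num_sets


def fits_set_associative(addrs, ways=4, line_bytes=64, num_sets=64):
    """True if no set needs to hold more distinct lines than it has ways."""
    lines = {a // line_bytes for a in addrs}
    per_set = Counter(line % num_sets for line in lines)
    return all(count <= ways for count in per_set.values())


def fits_fully_associative(addrs, entries=64):
    """True if the accesses fit in a buffer with one entry per access,
    regardless of which addresses they touch."""
    return len(addrs) <= entries


# Five variables spaced 4 KB apart: with 64 sets of 64-byte lines,
# every one of them lands in set 0.
tx_addrs = [0x1000 * i for i in range(5)]

cache_ok = fits_set_associative(tx_addrs)      # False: 5 lines, 4 ways
buffer_ok = fits_fully_associative(tx_addrs)   # True: 5 accesses, 64 entries
```

In this model the same transaction fails deterministically on the set-associative structure and fits easily in the fully associative one, even though it touches only five locations.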

> And how long-lived are transactions in sight of more cooperative mechanisms? Threads which work on the same memory always cooperate, so models which support better cooperation are necessary anyway. (Not that I know any.)

I think ~70 memory accesses is about right for things like nice data structures.
