Article: Haswell TM Alternatives

Article: Haswell Transactional Memory Alternatives
By: David Kanter, August 22, 2012 2:45 pm
Room: Moderated Discussions
bakaneko (nyan.delete@this.hyan.wan) on August 22, 2012 3:02 pm wrote:
> David Kanter on August 21, 2012 10:17 pm wrote:
> > We previously theorized that Intel’s TSX extensions in Haswell use the caches
> > to provide transactional memory semantics. This article describes an alternative
> > approach based on minimal changes to the CPU core (specifically in the ROB and
> > MOB), contrasts the advantages of the two techniques and discusses the expected
> > implementation in Haswell.
> >
> > I also muse a bit about when these two techniques (cache-based and MOB-based TM)
> > will get implemented on the roadmap and how they can work together in a very
> > complementary fashion.
> >
> > As always comments and discussion welcome.
> >
> > David
>
> Mhm, I don't get what is so important here. The question
> where to keep the old and new values (L2, L1D, MOB, other buffers)
> comes from the microarchitecture. So the old values into L2 of the core with
> the transaction and the new values into the L1D/MOB/local buffer, depending on
> the amount of expected data beyond what can be kept back in the MOB.

It's a microarchitectural detail, but it has significant implications. The L1 is heavily limited by associativity, whereas the MOB is more or less fully associative. Programmers really don't want to worry about associativity, because they almost always believe that any sort of cache is fully associative.

> I don't understand how the MOB would have to do more in such
> a scenario, and I don't see how important pushing everything
> into the MOB is in the first place. That's at least my naive
> technical opinion forgetting all the little details.

I think you're talking about a hybrid TM that uses both caches and MOB. I was primarily discussing how a MOB-only system would work.

> But there are other problems: I can't measure transactions.

You can measure TX failure using the fallback path.
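
As a rough illustration of what measuring through the fallback path could look like, here is a minimal Python sketch. Everything in it is an assumption for illustration: `try_transaction` is a stand-in callable for the hardware transaction attempt (it returns True on commit and False on abort), and `TxStats` and `run_atomic` are hypothetical names, not part of TSX or any library.

```python
class TxStats:
    """Hypothetical counters for transaction attempts and fallbacks."""

    def __init__(self):
        self.attempts = 0
        self.fallbacks = 0

    @property
    def failure_rate(self):
        # Fraction of attempts that ended up on the fallback path.
        return self.fallbacks / self.attempts if self.attempts else 0.0


def run_atomic(try_transaction, fallback, stats):
    """Attempt the transaction once; on abort, count it and run the fallback.

    try_transaction: stand-in for the hardware TX, returns True on commit.
    fallback: the non-transactional path (e.g. taking a lock), assumed here.
    """
    stats.attempts += 1
    if try_transaction():
        return "committed"
    # Every abort lands here, so this counter is the failure measurement.
    stats.fallbacks += 1
    fallback()
    return "fallback"
```

Since every aborted TX has to pass through the fallback, a counter placed there gives you the failure rate without needing visibility into the transaction itself.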

> How will changes in the microarchitecture change the
> behaviour of programs which use them outside the transaction
> size?

Suppose you had a TX that required simultaneously accessing 5 variables, each on a different cache line, that all map to the same set. In a scheme that just used a 4-way L1, it would always fail. OTOH, the same TX would be able to succeed on a MOB-based implementation.
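
To make the 5-variables case concrete, here is a small Python model of the capacity check. It is only a sketch under assumed parameters: the 64-byte line size, 64 sets and 64-entry buffer are illustrative numbers rather than Haswell's real configuration, and `fits_in_cache` and `fits_in_buffer` are hypothetical helper names.

```python
from collections import Counter

LINE_SIZE = 64    # bytes per cache line (assumed)
NUM_SETS = 64     # sets in the hypothetical L1 (assumed)
WAYS = 4          # associativity, matching the 4-way example
MOB_ENTRIES = 64  # size of a hypothetical fully associative buffer (assumed)


def set_index(addr, line_size=LINE_SIZE, num_sets=NUM_SETS):
    """Set that an address maps to in a simple modulo-indexed cache."""
    return (addr // line_size) % num_sets


def fits_in_cache(addrs, ways=WAYS, line_size=LINE_SIZE, num_sets=NUM_SETS):
    """True if no set needs to hold more distinct lines than it has ways."""
    lines = {a // line_size for a in addrs}
    per_set = Counter(line % num_sets for line in lines)
    return all(count <= ways for count in per_set.values())


def fits_in_buffer(addrs, entries=MOB_ENTRIES):
    """True if a fully associative buffer with one entry per access suffices."""
    return len(addrs) <= entries


# Five variables spaced so that each lands on a different line of the same set.
conflicting = [0x10000 + i * LINE_SIZE * NUM_SETS for i in range(5)]
```

With these numbers, `fits_in_cache(conflicting)` is False while `fits_in_buffer(conflicting)` is True. Dropping one of the five addresses makes the cache-based check pass as well, which is exactly the sort of cliff a programmer would rather not have to think about.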

> And how long-lived are transactions in sight of more
> cooperative mechanisms? Threads which work on the same
> memory always cooperate, so models which support better
> cooperation are necessary anyway.
> (Not that I know any.)

I think ~70 memory accesses is about right for things like nice data structures.
