Nehalem Architecture: Improvements Detailed

By: Linus Torvalds (, March 19, 2008 7:42 pm
Room: Moderated Discussions
Philip Honermann ( on 3/19/08 wrote:
>There's a real difference when you work with lock-free
>algorithms. Then you don't have any deallocation-problems
>you'd have to solve with ugly garbage-collection.

Examples of anything where LL/SC is fundamentally any
different from load+cmpxchg?

If you are talking about taking advantage of (often
not necessarily architected) things like the fact that the
LL/SC is also safe when somebody else writes to words in
the same cacheline as the LL happened, yes, LL/SC can be

It's why x86 has cmpxchg8b/16b, after all, ie you often
want to effectively update and/or check two words at a

But I'd like to repeat: at least from the standpoint of
somebody who has to be portable across multiple different
architectures, neither x86 nor the LL/SC ones are really
ever the problem. There are much more problematic
approaches out there.

(And yes, my portability issues means that I'm probably
biased. Exactly because we have to support any random crap
architecture, we can't do any fancy things anyway
without abstracting them out at a higher level, and so
LL/SC never really is useful for anything fancier than
what you can trivially do with regular simple x86 atomics).

>> LL/SC is clean, but it really isn't any better than
>> having the x86 kind of locked instructions, and in
>> many ways is much worse (ie the x86 model gives you
>> atomicity wrt interrupts without the lock, and a
>> wide variety of simple operations with the lock).
>The LL-SC-Pair is usually so tight toghether in the
>instruction-stream that interrupts are extremely unlikely
>to happen in the meantime.

LL/SC is interrupt-safe (if an interrupt happens in between,
the SC will just fail), but usually at a higher cost
than just doing an unconditional increment (or whatever).
Sometimes to the degree that it's actually cheaper to
disable and re-enable interrupts than it is to do LL/SC.

But again, my least favorite architectures are the ones
that give you neither!

My point being, this is not an area where x86 falls down.
Others do. The x86 model doesn't seem to be practically
weaker than LL/SC.

The real reason people seem to dislike x86 seems to be not
about practical issues, but just the fact that it has so
many different atomic instructions - with just about
any regular arithmetic-logical op being able to be atomic,
plus a few extra ones just for atomic cmpxchg-like things.

Practically speaking, almost nobody actually uses LL/SC
directly (because you can't anyway, from HLLs), so it's
seldom a big issue. But people seem to think it's superior
because it's such a simple model.

(I used to think sw-filled TLB's were a great idea for
the same reason. Never mind that they obviously suck
horribly. "simple model" is not really always superior).

< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
Nehalem Architecture: Improvements Detailed Blut Aus Nord2008/03/17 02:52 PM
  Nehalem Architecture: Improvements Detailed bah2008/03/17 04:45 PM
    Nehalem Architecture: Improvements Detailed Linus Torvalds2008/03/17 06:14 PM
      Nehalem Architecture: Improvements Detailed Gabriele Svelto2008/03/18 01:11 AM
        Nehalem Architecture: Improvements Detailed Henrik S2008/03/18 04:23 AM
        Nehalem Architecture: Improvements Detailed Doug Siebert2008/03/18 09:48 PM
          Nehalem Architecture: Improvements Detailed anon2008/03/18 10:37 PM
            Nehalem Architecture: Improvements Detailed Doug Siebert2008/03/19 05:23 PM
          Nehalem Architecture: Improvements Detailed Ian Ollmann2008/03/19 08:15 AM
            SSE 4.2 Michael S2008/03/19 04:13 PM
              SSE 4.2 Ian Ollmann2008/03/20 09:56 AM
              SSE 4.2 anonymous2008/03/20 12:29 PM
                SSE 4.2 David W. Hess2008/03/21 07:24 AM
                  SSE 4.2 anonymous2008/03/22 07:27 AM
      CMPXCHG latencyDavid Kanter2008/03/28 05:59 PM
        CMPXCHG latencyanonymous coward2008/03/28 10:24 PM
          CMPXCHG latencyDavid Kanter2008/03/28 10:26 PM
            CMPXCHG latencyLinus Torvalds2008/03/29 11:43 AM
              CMPXCHG latencyDavid W. Hess2008/03/29 11:56 AM
              CMPXCHG latencyLinus Torvalds2008/03/29 02:17 PM
                CMPXCHG latencyGabriele Svelto2008/03/31 12:25 AM
                  CMPXCHG latencyMichael S2008/03/31 12:38 AM
                    CMPXCHG latencynick2008/03/31 12:52 AM
                      CMPXCHG latencyMichael S2008/03/31 01:51 AM
                        CMPXCHG latencyGabriele Svelto2008/03/31 02:08 AM
                        CMPXCHG latencynick2008/03/31 07:20 PM
                          CMPXCHG latencyMichael S2008/04/01 01:14 AM
                            CMPXCHG latencynick2008/04/01 02:34 AM
                    CMPXCHG latencyLinus Torvalds2008/03/31 10:16 AM
                      CMPXCHG latencyAaron Spink2008/03/31 07:15 PM
                        CMPXCHG latencynick2008/03/31 07:34 PM
                        CMPXCHG latencyLinus Torvalds2008/04/01 08:25 AM
                          CMPXCHG latencyZan2008/04/01 09:54 PM
                          CMPXCHG latencyZan2008/04/02 12:11 AM
                            CMPXCHG latencyLinus Torvalds2008/04/02 08:04 AM
                              CMPXCHG latencyZan2008/04/02 11:02 AM
                                CMPXCHG latencyLinus Torvalds2008/04/02 12:02 PM
                                  CMPXCHG latencyZan2008/04/02 04:15 PM
                      CMPXCHG latencyMichael S2008/04/01 01:26 AM
                        CMPXCHG latencyLinus Torvalds2008/04/01 07:08 AM
                CMPXCHG latency - Intel sourceWouter Tinus2008/04/02 12:36 PM
                  CMPXCHG latency - Intel sourceLinus Torvalds2008/04/02 02:21 PM
                    CMPXCHG latency - Intel sourceDavid Kanter2008/04/02 02:39 PM
    Nehalem Architecture: Improvements Detailed Philip Honermann2008/03/19 01:11 PM
      Nehalem Architecture: Improvements Detailed Linus Torvalds2008/03/19 01:43 PM
        CMPXCHG - all or nothingMichael S2008/03/19 03:49 PM
          multithreading - all or nothingno@thanks.com2008/03/19 05:17 PM
          CMPXCHG - all or nothingLinus Torvalds2008/03/19 05:21 PM
            CMPXCHG - all or nothingMichael S2008/03/20 06:38 AM
              CMPXCHG - all or nothingLinus Torvalds2008/03/20 08:45 AM
                CMPXCHG - all or nothingMichael S2008/03/21 07:08 AM
                  CMPXCHG - all or nothingLinus Torvalds2008/03/21 08:47 AM
            CMPXCHG - all or nothingHenrik S2008/03/20 10:09 AM
              CMPXCHG - all or nothingLinus Torvalds2008/03/20 10:53 AM
                CMPXCHG - all or nothingHenrik S2008/03/20 12:03 PM
                  CMPXCHG - all or nothingLinus Torvalds2008/03/20 01:12 PM
                    CMPXCHG - all or nothingHenrik S2008/03/21 12:13 AM
                      CMPXCHG - all or nothingGabriele Svelto2008/03/21 01:22 AM
        Nehalem Architecture: Improvements Detailed Philip Honermann2008/03/19 06:28 PM
          Nehalem Architecture: Improvements Detailed Linus Torvalds2008/03/19 07:42 PM
            Nehalem Architecture: Improvements Detailed Philip Honermann2008/03/20 06:03 PM
              Nehalem Architecture: Improvements Detailed Linus Torvalds2008/03/20 06:33 PM
                Nehalem Architecture: Improvements Detailed Philip Honermann2008/03/25 06:37 AM
                  Nehalem Architecture: Improvements Detailed Linus Torvalds2008/03/25 08:52 AM
                    What is DCAS? (NT)David Kanter2008/03/25 10:13 AM
                      Double compare-and-exchangeHenrik S2008/03/25 10:57 AM
                        Double compare-and-exchangeLinus Torvalds2008/03/25 11:38 AM
                          Double compare-and-exchangesavantu2008/03/25 01:54 PM
                            Double compare-and-exchangeLinus Torvalds2008/03/25 04:09 PM
                              Double compare-and-exchangeJamie Lucier2008/03/25 08:55 PM
                                Double compare-and-exchangesavantu2008/03/25 09:15 PM
                                  Double compare-and-exchangeHenrik S2008/03/26 08:40 AM
                                    Double compare-and-exchangeArun Ramakrishnan2008/03/27 02:07 AM
                                      Double compare-and-exchangeHenrik S2008/03/27 04:45 AM
                                  Surely GPL applies ?Richard Cownie2008/03/26 10:05 AM
                                    Surely GPL applies ?anon2008/03/26 02:58 PM
                                    Surely GPL applies ?Paul2008/03/26 05:01 PM
                                Double compare-and-exchangesomeone2008/03/25 09:18 PM
                                  Double compare-and-exchangeArun Ramakrishnan2008/03/27 02:03 AM
                                    Double compare-and-exchangesavantu2008/03/27 03:01 AM
                                      Double compare-and-exchangeArun Ramakrishnan2008/03/30 09:09 AM
                                        Double compare-and-exchangesavantu2008/03/30 09:59 AM
                                Double compare-and-exchangeLinus Torvalds2008/03/26 10:50 AM
                                  Double compare-and-exchangeanon2008/03/26 04:47 PM
                                  Double compare-and-exchangePaul2008/03/26 05:07 PM
                          Double compare-and-exchangeHoward Chu2008/03/25 05:18 PM
  Nehalem Architecture: Improvements Detailed Mr. Camel2008/03/17 08:50 PM
    Nehalem Architecture: Improvements Detailed anonymous2008/03/17 09:20 PM
  TFP will finally come :-)Paul A. Clayton2008/03/18 12:56 PM
  Nehalem Architecture: Improvements Detailed IntelUser20002008/03/27 07:46 PM
    Nehalem Architecture: Improvements Detailed David Kanter2008/03/27 10:21 PM
      Nehalem Architecture: Improvements Detailed nick2008/03/27 11:06 PM
        Nehalem Architecture: Improvements Detailed David Kanter2008/03/28 02:45 PM
          Nehalem Architecture: Improvements Detailed nick2008/03/28 07:52 PM
  L1 I-cachepuzzled2008/04/01 07:53 AM
    L1 I-cacheS. Rao2008/04/01 09:47 AM
    L1 I-cacherwessel2008/04/01 12:23 PM
    L1 I-cacheGabriele Svelto2008/04/03 12:30 AM
Reply to this Topic
Body: No Text
How do you spell tangerine? 🍊