Nehalem Architecture: Improvements Detailed

By: Henrik S (hstahl.delete@this.nospam.bea.nothanks.com), March 18, 2008 4:23 am
Room: Moderated Discussions
>>I don't think there are any - I think they just meant that
>>they made the old legacy instructions run faster, instead
>>of trying to introduce anything new.
>>
>>Which I really look forward to testing. The serialization
>>overhead of Core 2 is better than many other processors,
>>but everything else is so good that it still stands out
>>like a sore thumb. We have lots of kernel loads where one
>>of the biggest costs is just locking (even without any
>>nasty contention and cacheline ping-ping), because of how
>>it serializes the pipeline.
>
>That's something I'm also looking forward too. Even without contention acquiring
>locks is *painful*. Unless the data/code you're protecting takes a significant amount
>of time to process/execute you'll be bitten by the sheer cost of the lock/unlock
>couples so there is room for *lots* of improvement there.
>
>Gabriele

+1

It's not uncommon for Java workloads to waste 10% or more of the time processing uncontended locks, and I've seen up to ~30% in real-world apps(1).

The underlying reason is that many critical parts of the core Java library are synchronized (StringBuffer, HashTable, many I/O functions). While there are new APIs that avoid this (StringBuilder, HashMap, etc) there is lots of legacy code that uses the old APIs directly or indirectly.

JVMs usesoptimization tricks to avoid this (lock removal, lock elison, lazy unlocking etc) but that only serves to alleviate the problem, and doesn't entirely resolve it.

-- Henrik


(1) Measured as the increase in throughput when locks are forcefully disabled in JRockit (using -XXlazyunlocking or just hacking the JVM to not issue CAS instructions). The 30% number comes from a JSP-heavy app I ran into some time back. SPECjbb2005 gains ~10% by the use of -XXlazyunlocking.
< Previous Post in ThreadNext Post in Thread >
TopicPosted ByDate
Nehalem Architecture: Improvements Detailed Blut Aus Nord2008/03/17 02:52 PM
  Nehalem Architecture: Improvements Detailed bah2008/03/17 04:45 PM
    Nehalem Architecture: Improvements Detailed Linus Torvalds2008/03/17 06:14 PM
      Nehalem Architecture: Improvements Detailed Gabriele Svelto2008/03/18 01:11 AM
        Nehalem Architecture: Improvements Detailed Henrik S2008/03/18 04:23 AM
        Nehalem Architecture: Improvements Detailed Doug Siebert2008/03/18 09:48 PM
          Nehalem Architecture: Improvements Detailed anon2008/03/18 10:37 PM
            Nehalem Architecture: Improvements Detailed Doug Siebert2008/03/19 05:23 PM
          Nehalem Architecture: Improvements Detailed Ian Ollmann2008/03/19 08:15 AM
            SSE 4.2 Michael S2008/03/19 04:13 PM
              SSE 4.2 Ian Ollmann2008/03/20 09:56 AM
              SSE 4.2 anonymous2008/03/20 12:29 PM
                SSE 4.2 David W. Hess2008/03/21 07:24 AM
                  SSE 4.2 anonymous2008/03/22 07:27 AM
      CMPXCHG latencyDavid Kanter2008/03/28 05:59 PM
        CMPXCHG latencyanonymous coward2008/03/28 10:24 PM
          CMPXCHG latencyDavid Kanter2008/03/28 10:26 PM
            CMPXCHG latencyLinus Torvalds2008/03/29 11:43 AM
              CMPXCHG latencyDavid W. Hess2008/03/29 11:56 AM
              CMPXCHG latencyLinus Torvalds2008/03/29 02:17 PM
                CMPXCHG latencyGabriele Svelto2008/03/31 12:25 AM
                  CMPXCHG latencyMichael S2008/03/31 12:38 AM
                    CMPXCHG latencynick2008/03/31 12:52 AM
                      CMPXCHG latencyMichael S2008/03/31 01:51 AM
                        CMPXCHG latencyGabriele Svelto2008/03/31 02:08 AM
                        CMPXCHG latencynick2008/03/31 07:20 PM
                          CMPXCHG latencyMichael S2008/04/01 01:14 AM
                            CMPXCHG latencynick2008/04/01 02:34 AM
                    CMPXCHG latencyLinus Torvalds2008/03/31 10:16 AM
                      CMPXCHG latencyAaron Spink2008/03/31 07:15 PM
                        CMPXCHG latencynick2008/03/31 07:34 PM
                        CMPXCHG latencyLinus Torvalds2008/04/01 08:25 AM
                          CMPXCHG latencyZan2008/04/01 09:54 PM
                          CMPXCHG latencyZan2008/04/02 12:11 AM
                            CMPXCHG latencyLinus Torvalds2008/04/02 08:04 AM
                              CMPXCHG latencyZan2008/04/02 11:02 AM
                                CMPXCHG latencyLinus Torvalds2008/04/02 12:02 PM
                                  CMPXCHG latencyZan2008/04/02 04:15 PM
                      CMPXCHG latencyMichael S2008/04/01 01:26 AM
                        CMPXCHG latencyLinus Torvalds2008/04/01 07:08 AM
                CMPXCHG latency - Intel sourceWouter Tinus2008/04/02 12:36 PM
                  CMPXCHG latency - Intel sourceLinus Torvalds2008/04/02 02:21 PM
                    CMPXCHG latency - Intel sourceDavid Kanter2008/04/02 02:39 PM
    Nehalem Architecture: Improvements Detailed Philip Honermann2008/03/19 01:11 PM
      Nehalem Architecture: Improvements Detailed Linus Torvalds2008/03/19 01:43 PM
        CMPXCHG - all or nothingMichael S2008/03/19 03:49 PM
          multithreading - all or nothingno@thanks.com2008/03/19 05:17 PM
          CMPXCHG - all or nothingLinus Torvalds2008/03/19 05:21 PM
            CMPXCHG - all or nothingMichael S2008/03/20 06:38 AM
              CMPXCHG - all or nothingLinus Torvalds2008/03/20 08:45 AM
                CMPXCHG - all or nothingMichael S2008/03/21 07:08 AM
                  CMPXCHG - all or nothingLinus Torvalds2008/03/21 08:47 AM
            CMPXCHG - all or nothingHenrik S2008/03/20 10:09 AM
              CMPXCHG - all or nothingLinus Torvalds2008/03/20 10:53 AM
                CMPXCHG - all or nothingHenrik S2008/03/20 12:03 PM
                  CMPXCHG - all or nothingLinus Torvalds2008/03/20 01:12 PM
                    CMPXCHG - all or nothingHenrik S2008/03/21 12:13 AM
                      CMPXCHG - all or nothingGabriele Svelto2008/03/21 01:22 AM
        Nehalem Architecture: Improvements Detailed Philip Honermann2008/03/19 06:28 PM
          Nehalem Architecture: Improvements Detailed Linus Torvalds2008/03/19 07:42 PM
            Nehalem Architecture: Improvements Detailed Philip Honermann2008/03/20 06:03 PM
              Nehalem Architecture: Improvements Detailed Linus Torvalds2008/03/20 06:33 PM
                Nehalem Architecture: Improvements Detailed Philip Honermann2008/03/25 06:37 AM
                  Nehalem Architecture: Improvements Detailed Linus Torvalds2008/03/25 08:52 AM
                    What is DCAS? (NT)David Kanter2008/03/25 10:13 AM
                      Double compare-and-exchangeHenrik S2008/03/25 10:57 AM
                        Double compare-and-exchangeLinus Torvalds2008/03/25 11:38 AM
                          Double compare-and-exchangesavantu2008/03/25 01:54 PM
                            Double compare-and-exchangeLinus Torvalds2008/03/25 04:09 PM
                              Double compare-and-exchangeJamie Lucier2008/03/25 08:55 PM
                                Double compare-and-exchangesavantu2008/03/25 09:15 PM
                                  Double compare-and-exchangeHenrik S2008/03/26 08:40 AM
                                    Double compare-and-exchangeArun Ramakrishnan2008/03/27 02:07 AM
                                      Double compare-and-exchangeHenrik S2008/03/27 04:45 AM
                                  Surely GPL applies ?Richard Cownie2008/03/26 10:05 AM
                                    Surely GPL applies ?anon2008/03/26 02:58 PM
                                    Surely GPL applies ?Paul2008/03/26 05:01 PM
                                Double compare-and-exchangesomeone2008/03/25 09:18 PM
                                  Double compare-and-exchangeArun Ramakrishnan2008/03/27 02:03 AM
                                    Double compare-and-exchangesavantu2008/03/27 03:01 AM
                                      Double compare-and-exchangeArun Ramakrishnan2008/03/30 09:09 AM
                                        Double compare-and-exchangesavantu2008/03/30 09:59 AM
                                Double compare-and-exchangeLinus Torvalds2008/03/26 10:50 AM
                                  Double compare-and-exchangeanon2008/03/26 04:47 PM
                                  Double compare-and-exchangePaul2008/03/26 05:07 PM
                          Double compare-and-exchangeHoward Chu2008/03/25 05:18 PM
  Nehalem Architecture: Improvements Detailed Mr. Camel2008/03/17 08:50 PM
    Nehalem Architecture: Improvements Detailed anonymous2008/03/17 09:20 PM
  TFP will finally come :-)Paul A. Clayton2008/03/18 12:56 PM
  Nehalem Architecture: Improvements Detailed IntelUser20002008/03/27 07:46 PM
    Nehalem Architecture: Improvements Detailed David Kanter2008/03/27 10:21 PM
      Nehalem Architecture: Improvements Detailed nick2008/03/27 11:06 PM
        Nehalem Architecture: Improvements Detailed David Kanter2008/03/28 02:45 PM
          Nehalem Architecture: Improvements Detailed nick2008/03/28 07:52 PM
  L1 I-cachepuzzled2008/04/01 07:53 AM
    L1 I-cacheS. Rao2008/04/01 09:47 AM
    L1 I-cacherwessel2008/04/01 12:23 PM
    L1 I-cacheGabriele Svelto2008/04/03 12:30 AM
Reply to this Topic
Name:
Email:
Topic:
Body: No Text
How do you spell tangerine? 🍊