And a hardware predictor needs access to fallback timing

By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), April 3, 2021 2:22 pm
Room: Moderated Discussions
sr (nobody.delete@this.nowhere.com) on April 3, 2021 12:39 pm wrote:
>
> It's not about dirtying cachelines, it's about sharing critical section data.

sr, I understand transactional memory. Really. I do.

You're ignoring the big elephant in the room that I've pointed to over and over and over again: your theoretical arguments are pure and utter garbage in reality.

It's not "99.999% unmodified and only 0.001% data changes". That's your made-up dream world.

It's "real loads tend to actually change the same data".

I realize that you may have been fooled by various made-up benchmarks where you use some random distribution and "prove" that transactional memory works.

In real life, the access distribution isn't random. In real life, people hit on that same shared data all the time, and so you get a lot of conflicts. In real life, if you have that one big lock situation, then the locked section is often also so large that it won't fit in the transaction.

Just to make things really concrete, think about some "perfect" load for transactional memory, where you have hash lists, and you elide a lock around all the lookups and the modifications (because, let's face it, fine-grained locking is hard, particularly if you occasionally move elements around or have other ordering constraints).

Because the data elements are hashed, you expect that there will be very few actual data conflicts, and generally modifications are much less common than lookups anyway, so even if you end up touching the same data, it will mostly be a read-read that won't cause a conflict and an abort.

So this is basically the "wet dream" situation for transactional memory, where you expect to be able to do almost everything with nary a conflict in sight, and hopefully the operations you do on it are also small enough that you're not going to be anywhere near the capacity limits either.

And guess what? In real life, a lot of those loads also want to have basic statistics, and do things like success/fail ratios for how often the hash queue lookups actually hit in that hash, or just keep track of things like the queue length in order to dynamically resize the hash when required - etc etc.

And suddenly, those will all be globals that the lock you tried to elide protected, and the whole "a hash queue is distributed and we only touch 0.0001% of the data" is just a dream, because it turns out that every single lookup incremented that same single "success" counter.
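To make that concrete, here is a minimal sketch in C of the kind of table I mean. The names (`struct table`, `table_lookup`, `hits`, `misses`, `NBUCKETS`) are made up for illustration and not taken from any real code base. The point is only that a "read-only" lookup still writes one global word every time:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 1024   /* hypothetical table size */

struct node {
    struct node *next;
    const char *key;
    int value;
};

/* Everything in here is protected by the one big lock we hoped to elide. */
struct table {
    struct node *bucket[NBUCKETS];
    unsigned long hits;     /* global statistics: written by every lookup */
    unsigned long misses;
};

static unsigned hash_key(const char *key)
{
    unsigned h = 5381;
    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return h % NBUCKETS;
}

/* Called with the table lock held (or inside the elided transaction). */
static void table_insert(struct table *t, struct node *n)
{
    unsigned b;

    assert(t != NULL && n != NULL && n->key != NULL);
    b = hash_key(n->key);
    n->next = t->bucket[b];
    t->bucket[b] = n;
}

/* Called with the table lock held (or inside the elided transaction). */
static struct node *table_lookup(struct table *t, const char *key)
{
    struct node *n;

    for (n = t->bucket[hash_key(key)]; n; n = n->next) {
        if (strcmp(n->key, key) == 0) {
            t->hits++;      /* every successful lookup writes this one word */
            return n;
        }
    }
    t->misses++;            /* and every failed one writes this one */
    return NULL;
}
```

Two lookups of completely unrelated keys in completely unrelated buckets both end in a store to `hits`, so if they run as concurrent transactions they conflict on that word no matter how well the hash spreads the rest of the data.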

Yeah, technically, that single counter is much much less than 0.001% of all the data. It's an absolutely minuscule part of your gigabytes of actual data. That doesn't help now, does it?

And then you realize that hey, not only did you have those globals, you also have a lot of false sharing, because any insertion onto a hash queue will change just one word, but it turns out that word was right next to the other seven hash queue pointer buckets that were in that same cacheline, and that first cacheline of the queue is also exactly the cacheline that all the lookups start out with.
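The arithmetic behind that is trivial, and can be sketched in a few lines of C. This assumes a 64-byte cacheline and a bucket array that starts on a cacheline boundary; `CACHELINE_BYTES` and the helper names are hypothetical, chosen just for this example:

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE_BYTES 64   /* assumption: common line size, not universal */

struct node;                 /* the chained elements; layout irrelevant here */

/* How many bucket head pointers share one cacheline. */
static size_t buckets_per_cacheline(void)
{
    return CACHELINE_BYTES / sizeof(struct node *);
}

/* Which cacheline (counted from the start of the array) holds bucket i. */
static size_t bucket_cacheline(size_t i)
{
    return (i * sizeof(struct node *)) / CACHELINE_BYTES;
}

/*
 * Nonzero if inserting at the head of bucket 'written' dirties the
 * cacheline that a lookup in the different bucket 'read' has to load first.
 */
static int falsely_shared(size_t written, size_t read)
{
    assert(written != read);  /* same bucket would be a real conflict */
    return bucket_cacheline(written) == bucket_cacheline(read);
}
```

With 8-byte pointers, `buckets_per_cacheline()` is 8, so an insert into bucket 0 changes the line that lookups in buckets 1 through 7 all start from, even though none of them touch the same list.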

So now that 0.001% - which was probably an optimistic and unrealistic number to begin with - was actually off by an order of magnitude just due to false sharing issues. But hey, hash queue modifications are rare, right? Yeah, sometimes they are. Not always.

And then you start actually doing real loads, and it really turns out that the access patterns really weren't even remotely random, and that everybody really is looking at the same hash bucket over and over again, because it turns out that certain data values are just much hotter and more common than others.

And it easily turns out that the hash lists themselves were embedded in data structures that were modified independently (and not part of the lookup logic), so now you have another source of false sharing right there - not with the lookups, but with the other threads that already looked things up and are now updating access counters for that entry, or whatever.

And you never saw those as cachelines bouncing back and forth before, because the single lock serialized them, so sure, the cacheline was moving around, but the concurrency was limited and HW prefetching actually worked and mostly hid it, and so it didn't actually show up as a huge problem. The lock showed up as a big issue in the profiles, but those kinds of incidentals didn't.

But suddenly, now that you're doing concurrent lookups, the reference counts or flag updates or whatever that you did as you successfully looked up an element end up bouncing that same cacheline which contained the hash list you used for looking things up.
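As a rough sketch of what such an element tends to look like (the `struct entry` layout and the `entry_get` helper are invented for this example, and the 64-byte line size and line-aligned allocation are assumptions):

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE_BYTES 64   /* assumption: common line size */

/* A typical element written with one big lock in mind. */
struct entry {
    struct entry *hash_next;    /* read by every lookup walking this chain */
    unsigned long key;
    unsigned int refcount;      /* written by whoever just found the entry */
    unsigned int flags;
    unsigned long last_access;
};

/* Do two field offsets fall in the same line of a line-aligned entry? */
static int same_cacheline(size_t off_a, size_t off_b)
{
    return off_a / CACHELINE_BYTES == off_b / CACHELINE_BYTES;
}

/* What a successful lookup usually does next: take a reference. */
static void entry_get(struct entry *e)
{
    assert(e != NULL);
    e->refcount++;   /* a store to the line that also holds hash_next */
}
```

Here `same_cacheline(offsetof(struct entry, hash_next), offsetof(struct entry, refcount))` is true, so one thread taking a reference invalidates the line another thread needs just to walk past this entry on its way down the chain. Under the single lock nobody noticed, because nothing ran concurrently.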

Because - after all - the whole point of lock elision was to take that code that wasn't designed for fine-grained locking, and just make it magically scale well. But the key here is really that "it wasn't designed for fine-grained locking". The data structures weren't designed to not have false sharing, or designed to not have those conflicts.

And yes, despite all these potential issues, under certain loads you never see any of them, and it all runs beautifully and like a bat out of hell.

And under other loads, you get these really hard to figure out aborts due to some odd issue that is really hard to debug, because you've gotten rid of all the statistics fields that caused you problems, and maybe you as a developer don't even have access to the load and the data that your customer is having performance problems with.

So stop with the theoretical "99.999% vs 0.001%".

If that was reality, then TSX would have taken the world by storm.

Hint: it didn't.

Linus