rseq: holy grail rwlock?

By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), August 21, 2018 4:10 pm
Room: Moderated Discussions
Travis (travis.downs.delete@this.gmail.com) on August 21, 2018 4:27 pm wrote:
>
> The unfairness that was problematic was a constant stream of readers
> starving out writers, or between various writers or what?

We've had that, yes (and in fact, we've had that even for the regular rwlocks).

It's really nice when it works, but it can get really outrageously bad when there is contention. And the unfairness can become visible in the cache coherency protocol itself, where a CPU that just owned a lock has a much easier time grabbing it again immediately, because it might still be exclusive in the caches.

There's no guarantee that the cache coherency is fair, after all.

So we've almost invariably had to build in fairness in the queuing itself. Our spinlocks, for example, are not just an "owner" lock. No, they are ticket locks, so that people that get blocked on a spinlock get a particular ordering, and you don't get in the situation that some CPUs have an easier time re-taking the lock than others.

And yes, that ended up slowing the spinlocks down and making them more complex, but not really noticeably so - and the unfairness case was really noticeable on some big machines. To the point where people had watchdogs fire because some CPU wouldn't make progress for tens of seconds at a time, just because other CPUs could re-take the lock so quickly.
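
For illustration, the ticket idea can be sketched in user-space C11 roughly like this. It is a minimal sketch, not the kernel's actual spinlock code: the struct and function names are made up, and it simply spins without any backoff or CPU-relax hint.

```c
#include <stdatomic.h>

/* Hypothetical ticket lock; names are illustrative, not the kernel's. */
struct ticket_lock {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket that currently holds the lock */
};

static void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Taking a ticket fixes this waiter's place in line. */
    unsigned int me = atomic_fetch_add_explicit(&l->next, 1,
                                                memory_order_relaxed);

    /* Spin until our ticket comes up. A CPU that just released the
     * lock has to take a new, later ticket, so it cannot jump the queue. */
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
        ;
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Only the holder writes 'owner', so a plain load plus store is enough. */
    unsigned int cur = atomic_load_explicit(&l->owner, memory_order_relaxed);
    atomic_store_explicit(&l->owner, cur + 1, memory_order_release);
}
```

The point of the design is that ordering is decided by the single atomic increment on `next`, not by whoever wins the cache line when the lock is released.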

So I used to detest fairness. It makes locking harder and slower. But almost every time, we've found that if something can get contended, fairness isn't just a good idea, it's pretty much required.

> A reasonable strategy would seem to be to make such locks favor writers as far as fairness goes: once
> a writer expresses interest, no new readers enter the critical section.

That ends up being potentially even worse, because quite often the reader is the critical case, and the writer has to wait for other readers to finish anyway, so a writer-favoring lock can be really bad and then cause hiccups for the readers, in case there is some way an untrusted user can schedule writers.

So what we do (and I might mis-remember) is

- if a reader sees the "no writer" flag, it just increments the percpu reader count, and is done

which basically makes the "default" reader case optimal. This is all done non-preemptibly because of that percpu sequence, of course.

But whenever a writer shows up, the percpu fast case simply goes away, and we fall back to a fair rwlock. Obviously with some logic on the part of the writer to wait for the readers that came in before it marked itself (by adding up those percpu counts).

So the percpu rwlock basically ends up just being an almost perfectly normal rwlock when writers are around, but has a special fast-case for when there are no writers and it can use pure percpu accounting of readers. And there's some percpu and RCU logic for some of the serialization issues between these two states.

I may have oversimplified and misstated it a bit, but it's close to something like that.

Linus