Fitting time slices to execution phases

By: Paul A. Clayton (paaronclayton.delete@this.gmail.com), August 21, 2018 9:09 am
Room: Moderated Discussions
Linus Torvalds (torvalds.delete@this.linux-foundation.org) on August 20, 2018 4:29 pm wrote:
> matthew (nobody.delete@this.example.com) on August 19, 2018 7:58 pm wrote:
> >
> > Solaris had the ability for a thread to tell the kernel it was holding a mutex and that it shouldn't
> > be preempted until it had dropped the lock. There have been several attempts to add something
> > like that to Linux, but none have succeeded yet. Nothing with hardware assists either.
>
> Linux now (merged into the latest released kernel version, 4.18) actually has
> what could be seen as the reverse of that: "rseq" aka restartable sequences.
>
> It doesn't disable preemption (which is crazy and all kinds of stupid), but it does
> allow user space to see if it has been preempted, and mark certain sequences to
> be done atomically. And if preemption happens, the sequence gets aborted.

Since threads have phases that benefit from not being significantly interrupted, I think there would be value in allowing a thread to express that a phase will extend beyond the normally allotted time slice (and in allowing it to end a time slice early while still informing the scheduler that more work is available, i.e., that the point is a convenient stopping place).

Critical sections guarded by locks are especially important phases because they hinder forward progress by other threads. Furthermore, such sections tend to be short, so requesting an additional sub-millisecond of execution time would not seem to be especially disruptive to responsiveness.

(Rather than skipping a preemption, a thread might query how many cycles remain in its execution allocation before entering the critical section and, if the allocation is insufficient for the desired likelihood of completing the phase, yield with a request for continuation, possibly giving a length in the request. Hardware support seems likely to be helpful in this, at least some means to cheaply determine approximately how much execution time a thread has remaining; hardware support might also make such a query less useful as a timing side channel.)
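A minimal sketch of the query-then-yield idea, under loud assumptions: `SliceBudget`, `enter_or_yield`, and the `safety_factor` parameter are hypothetical names invented for illustration, and no existing kernel or hardware interface exposes the remaining slice this way. The function only models the decision, not the actual yield.

```python
from dataclasses import dataclass


@dataclass
class SliceBudget:
    """Hypothetical view of what is left of the current time slice."""
    remaining_cycles: int


def enter_or_yield(budget, expected_cycles, safety_factor=2.0):
    """Decide whether to start a phase now or yield first.

    Returns ("enter", 0) if the remaining allocation covers the expected
    phase length scaled by safety_factor, otherwise ("yield", needed) so
    that the yield can carry a requested length for the next slice.
    """
    needed = int(expected_cycles * safety_factor)
    if budget.remaining_cycles >= needed:
        return ("enter", 0)
    return ("yield", needed)


# A short critical section fits in a mostly unused slice ...
print(enter_or_yield(SliceBudget(10_000), 2_000))
# ... but not in a nearly exhausted one, so the thread yields and asks for more.
print(enter_or_yield(SliceBudget(3_000), 2_000))
```

The safety factor stands in for the "desired likelihood of completion": a larger factor trades more early yields for fewer preemptions inside the phase.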

Phases involving cache (or branch predictor) warm-up would typically be longer and have less impact on performance, but phase information might still be useful.

This has obvious similarities to core/cache/memory node affinity. The request is more a hint than a directive, but some workloads might value certain kinds of affinity so highly that they are not worth running without it. (This also relates to real time requirements.)

(One could also argue for communicating the degree of tolerance of descheduling. L3 warm-up would be more tolerant of moderate duration interference from other threads than L1 warm-up. Some locks would tolerate more delay in release than others; some delay tolerance is dynamic and some static, so communication of intent/expectation may be a better hint than a request for a specific behavior. A market for resources and warranties (a warranty being like an inverse bid, failure to get the resource as contracted earns credit potentially much greater than the purchase price) could handle a variety of resources and requirements.)

> So you can think of rseq as kind of like the OS equivalent of transactional memory, but instead
> of the transactional sequence being aborted on a cache conflict, it gets aborted on preemption.

Except that the abort happens only after the preemption completes. If a thread gets a lock and starts work in a critical section, it cannot release the lock (and undo work that is no longer appropriate) when another thread is waiting on the lock.
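The timing can be illustrated with a single-threaded simulation. This is a sketch only: `SimulatedCpu`, `preempt_count`, and `restartable_increment` are hypothetical names, the preemption is injected by hand, and nothing here uses the real rseq ABI (where the kernel redirects the resumed thread to an abort handler rather than having it compare a counter).

```python
class SimulatedCpu:
    """Stand-in for one CPU's private data; not the real rseq interface."""

    def __init__(self):
        self.counter = 0
        self.preempt_count = 0  # bumped each time the running thread is preempted


def restartable_increment(cpu, preempt_at_attempts=()):
    """Increment cpu.counter, restarting if a preemption hit mid-sequence.

    preempt_at_attempts lists the 1-based attempts during which a simulated
    preemption occurs between preparing the value and committing it.
    Returns the number of attempts that were needed.
    """
    attempt = 0
    while True:
        attempt += 1
        snapshot = cpu.preempt_count       # start of the sequence
        value = cpu.counter + 1            # prepare, nothing published yet
        if attempt in preempt_at_attempts:
            cpu.preempt_count += 1         # simulated preemption ...
            cpu.counter += 10              # ... during which another thread used this CPU
        if cpu.preempt_count == snapshot:  # checked only once this thread runs again
            cpu.counter = value            # commit
            return attempt
        # Preempted: the prepared value is stale, so drop it and restart.


cpu = SimulatedCpu()
attempts = restartable_increment(cpu, preempt_at_attempts={1})
print(attempts, cpu.counter)
```

The restart is harmless here because nothing is visible to other threads before the commit. A held lock is different: it stays visible, and held, for the whole time the owner is descheduled.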

(Some of the transactional memory proposals suggest NAKing conflicting remote requests or using versioned memory to give a transaction a larger window in which to complete, which is similar to extending the time slice.)

> That allows you to do certain per-cpu things in user space (as opposed to per-thread).
>
> And that, in turn, can be a big deal when you have 4 cores, but 4 million threads.
> You don't want to have the memory overhead of per-thread allocations, when
> all you really wanted was the cache advantages of per-cpu counters.

Per-cpu or per "concurrent" thread? (Hardware multithreading does not have to be presented as virtual processors.) I am assuming the latter (for atomicity guarantees). For in-program local atomicity, atomicity failure on interruption may be excessively conservative. Tracking by "atomicity thread group" might not be excessively complex (though it might not have much advantage since interrupts are relatively rare).

If hardware supported faster local atomics (which is problematic for x86 since the LOCK prefix is global in memory scope [and stronger than normal consistency]), cache affinity might have further application. I doubt there would be much use for a non-concurrency check at L2 cache level (e.g., if eight software thread groups only interfere within a group and as long as no two hardware threads within a group are scheduled concurrently, "local" accesses are "interrupt atomic"), but such might be a path worth some thought to discover unexpected opportunities in more ordinary uses.

> It's a pretty limited use-case, and I don't expect normal users to really ever
> see it. But it is designed to allow for things like per-cpu malloc libraries etc,
> and a few other very specific situations where you can take advantage of it.

(Normal application programmers presumably do not really ever see system calls but rather higher level abstractions.)

> We'll see if people end up taking advantage of it. The downside with a lot of clever interfaces is that because
> they are non-standard, you really don't see people using them unless there is a big win or unless you can transparently
> hide them in a library with absolutely zero downside from the portable standard approach.
>
> Which is basically what seems to have killed transactional memory. The library approach (HLE) ends up performing
> horribly in many real-life situations, so it's not possible to use as a direct transparent replacement,
> and the full transactional model is so non-portable that it's not worth spending effort on.

I think transactional memory can be presented in a way that is useful enough and transparent/portable enough to have sufficiently significant (worthwhile) and broad (debugging/optimizing) adoption, but I have not given such the extended consideration it requires to work out a reasonable interface. Some changes would probably have to be made to programming interfaces (though some intention can sometimes be discovered by the compiler from "ordinary" source code); ideally such changes would make programming easier and less dangerous.

For HLE to provide behavior similar to general transactional memory, it seems that hardware would have to lie about the returned state of a lock. I.e., a lock that is not actually held (which other threads are "holding" transactionally) would present a locked value, so the thread could choose to spin, pause, or do other work. Intel's HLE makes this more difficult; the hardware can look at the post-lock-acquire test and cause it to fail, but if the lock contains other information, hardware could not easily fake that information.

Even if HLE never had significantly worse performance than a lock and generally better performance, adoption and optimization would be artificially constrained. Part of the point of Intel's HLE is to encourage portability (code will work on any x86) and low-effort adoption. This is a good initial goal, but one also wants an interface and implementation/expectation that encourages optimization. There is not even a platform-level "guarantee" of behavior, so optimization is constrained.

(This may be somewhat similar to the indexing (and associativity and replacement policies) of caches. Modulo a power of two indexing encourages certain coding practices which may be unnecessary or even harmful under hardware that has different conflict behavior. If skewed associativity, quasi-prime modulo indexing, or another conflict-changing cache implementation were provided only sometimes, without even a firm indication of persistence within a platform, changing software would be justifiable only for short-term gains.)
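A small sketch of why the indexing function shapes coding practice. The parameters are assumptions chosen for illustration (64-byte lines, 64 sets for the power-of-two case, 61 sets for the prime case), and a plain prime modulus is used as a simplification of "quasi-prime" indexing; no particular processor is being modeled.

```python
def set_index_pow2(addr, line_bytes=64, num_sets=64):
    """Conventional indexing: line number modulo a power of two."""
    return (addr // line_bytes) % num_sets


def set_index_prime(addr, line_bytes=64, prime_sets=61):
    """Illustrative alternative: line number modulo a prime set count."""
    return (addr // line_bytes) % prime_sets


def distinct_sets(index_fn, stride, count=16):
    """Number of different cache sets touched by `count` accesses `stride` bytes apart."""
    return len({index_fn(i * stride) for i in range(count)})


# Sixteen accesses 4 KiB apart, e.g. walking a column of a page-aligned matrix.
print(distinct_sets(set_index_pow2, 4096))
print(distinct_sets(set_index_prime, 4096))
```

With power-of-two indexing, all sixteen accesses land in one set, which is why software pads arrays to avoid such strides; with the prime modulus they spread over sixteen sets and the padding buys nothing.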

> We'll see if rseq can do better.

It seems it already has significant adoption for memory allocation.