dcbz -> dcbzl (was: Instructions for zeroing)

By: hobold (hobold.delete@this.vectorizer.org), August 31, 2018 1:50 am
Room: Moderated Discussions
Adrian (a.delete@this.acm.org) on August 30, 2018 11:37 am wrote:
> Konrad Schwarz (no.spam.delete@this.no.spam) on August 30, 2018 6:37 am wrote:
> > Maynard Handley (name99.delete@this.name99.org) on August 18, 2018 4:22 pm wrote:
> > > - you offer an atomic remote zero (or remote fill) instruction that is given a line address
> > > plus count. The first thing it does is flush the line from all other caches, then it marks
> > > the set of lines locally as in some sort of intermediate state, then, fast as possible, it
> > > starts zeroing each line and, once zeroed, it goes back to a standard "valid" state.
> >
> > Power(PC) Data Cache Block Allocate (dcba) and Data Cache Block
> > Zero (dcbz) instructions do this, albeit on a single cache line.
>
>
> Now also AMD Zen has introduced CLZERO (i.e. the equivalent of dcbz).
>

For Apple, dcbz turned into a liability when they moved to 64 bits. The 64-bit ISA itself was not at fault; the problem was that the cache line size of the 64-bit-capable CPU model(s) differed. All prior 32-bit PPC processors had used 32-byte cache blocks, but the new one used 128 bytes.

The dcbz instruction was architected to clear a cache block at whatever native size the machine had, but programmers everywhere simply assumed that it clears 32 bytes.

Thus the new 64-bit machine redefined the instruction to mean "clear 32 bytes", and in that form it was rather slow and inefficient. Additionally, a new instruction, dcbzl, was introduced to _really_ mean "clear a cache line of native size".
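To make the failure mode concrete, here is a minimal simulation in C. It is only a sketch: `sim_dcbz`, `legacy_clear` and the `mem` array are hypothetical names invented for illustration, and the "instruction" is emulated with memset rather than real PowerPC code. It shows what a loop that hard-codes a 32-byte stride would do if the hardware cleared a native 128-byte block instead.

```c
#include <stddef.h>
#include <string.h>

#define MEM_SIZE 512
/* Simulated memory; offset 0 is treated as block-aligned. */
static unsigned char mem[MEM_SIZE];

/* Emulated block-zero: clears the naturally aligned block of
   block_size bytes that contains offset off. */
static void sim_dcbz(size_t off, size_t block_size) {
    size_t base = off - (off % block_size);
    memset(mem + base, 0, block_size);
}

/* Legacy-style loop: assumes every block is 32 bytes, regardless of
   what the hardware (hw_block_size) actually clears. */
static void legacy_clear(size_t off, size_t len, size_t hw_block_size) {
    for (size_t o = off; o < off + len; o += 32)
        sim_dcbz(o, hw_block_size);
}

/* Counts bytes outside [off, off+len) that no longer hold 0xFF,
   i.e. neighbouring data destroyed by the clear. */
static size_t clobbered_outside(size_t off, size_t len) {
    size_t n = 0;
    for (size_t i = 0; i < MEM_SIZE; i++)
        if ((i < off || i >= off + len) && mem[i] != 0xFF)
            n++;
    return n;
}
```

With a 32-byte hardware block, clearing a 32-byte-aligned region touches nothing else. With a 128-byte hardware block, the same loop wipes out whatever shares the enclosing 128-byte block with the region.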

I guess a future extension would have had to redefine the meaning of dcbzl to "clear 128 bytes".


Lesson learned: to safely deploy a user-land instruction that manipulates a cache block, you have to pair it with a user-land instruction that reports the actual current block size. And you need hardware models with differing block sizes out in the field at the same time, widely available for testing.
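As a rough sketch of what that pairing looks like from the programmer's side, the C code below zeroes an arbitrary byte range without hard-coding a block size. Everything here is an assumption for illustration: `query_block_size` stands in for the hypothetical "what is the block size?" instruction, `zero_block` stands in for a dcbz-like instruction and is emulated with memset, and `simulated_block_size` exists only so that different machines can be simulated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: settable so 32- and 128-byte machines can be simulated. */
static size_t simulated_block_size = 128;

/* Hypothetical stand-in for a user-land block size query. */
static size_t query_block_size(void) { return simulated_block_size; }

/* Hypothetical stand-in for a dcbz-like instruction: zeroes the whole
   naturally aligned block containing addr, whatever its size is. */
static void zero_block(void *addr) {
    size_t bs = query_block_size();
    uintptr_t base = (uintptr_t)addr & ~(uintptr_t)(bs - 1);
    memset((void *)base, 0, bs);
}

/* Zeroes exactly [p, p+len), using block clears only for blocks that
   lie entirely inside the range. */
static void zero_range(void *p, size_t len) {
    size_t bs = query_block_size();
    assert(bs != 0 && (bs & (bs - 1)) == 0); /* must be a power of two */
    unsigned char *cur = p;
    unsigned char *end = cur + len;

    /* Head: ordinary stores up to the first block boundary. */
    size_t misalign = (uintptr_t)cur & (bs - 1);
    if (misalign != 0) {
        size_t head = bs - misalign;
        if (head > len)
            head = len;
        memset(cur, 0, head);
        cur += head;
    }
    /* Body: whole blocks, stepping by the queried size. */
    while ((size_t)(end - cur) >= bs) {
        zero_block(cur);
        cur += bs;
    }
    /* Tail: ordinary stores for the final partial block. */
    memset(cur, 0, (size_t)(end - cur));
}
```

The point of the design is that the stride comes from the query at run time, so the same binary behaves correctly whether the machine clears 32 or 128 bytes per block.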

Otherwise programmers will use the instruction not for what you intended it to do, but for whatever they want.

Chances are that CLZERO will trigger another wave of painful learning.