Enough with the idiocy ... let's have proper ECC again.

By: Maynard Handley (name99.delete@this.name99.org), December 17, 2020 5:55 pm
Room: Moderated Discussions
David Kanter (dkanter.delete@this.realworldtech.com) on December 17, 2020 12:04 pm wrote:
> Maynard Handley (name99.delete@this.name99.org) on December 17, 2020 9:42 am wrote:
> > Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr) on December 17, 2020 9:26 am wrote:
> > > Maynard Handley (name99.delete@this.name99.org) on December 17, 2020 9:09 am wrote:
> > > > Björn Ragnar Björnsson (bjorn.ragnar.delete@this.gmail.com) on December 16, 2020 10:09 pm wrote:
> > > > > Adrian (a.delete@this.acm.org) on December 16, 2020 7:31 am wrote:
> > > > > > Gabriele Svelto (gabriele.svelto.delete@this.gmail.com) on December 15, 2020 3:24 pm wrote:
> > > > > > > It seems like Intel has added support for what they call in-band ECC to their recent Atom SoCs, see the
> > > > > > > mention here as well as here. There isn't much in the way of details on Intel pages apart from the
> > > > > > > fact that the mechanism can correct single-bit errors in
> > > > > > > non-ECC memory (presumably by reducing its effective
> > > > > > > size). However a Google search turned out this patent. All-in-all a very welcome development.
> > > > > >
> > > > > >
> > > > > > The support for In-Band ECC also exists in Tiger Lake U, but it is disabled in almost all SKUs,
> > > > > > including in most of the "Embedded" SKUs, where I would have expected it to be enabled.
> > > > > >
> > > > > > It is enabled only in the Tiger Lake U "Embedded" SKUs for the Extended Temperature Range.
> > > > > >
> > > > > > In-Band ECC allows the use of ECC with the LPDDR4x memories, but this advantage is paid
> > > > > > by a slight reduction in memory capacity and by a reduction in speed that is difficult to
> > > > > > quantify, because in most cases the extra accesses for ECC can be cached (the worst but
> > > > > > very seldom case is to do twice as many memory accesses, both for data and for ECC).
> > > > > >
> > > > > >
> > > > > > Intel has a patent application for it
> > > > >
> > > > > The padawans here may not know that back in the day, even after they were born, it was
> > > > > as simple as pie to order simms and dimms with parity/ecc at little additional cost.
> > > > >
> > > > > Sure, if you didn't care about ECC you could shave off a few percentage points on
> > > > > the price of memory. Big deal, but not for everybody, I'd always go for the slightly
> > > > > more expensive option (at least I could use it in system that supported it).
> > > > >
> > > > > Then something happened which changed the DRAM landscape forever (hopefully not forever though).
> > > > > What happened? Intel started producing processors that had no, absolutely no, nada, ability to implement
> > > > > ECC. These CPUs have for the last two decades been by far the bulk of processors sold. Could you
> > > > > have undetected memory errors? Yes, it's a near certainty. Could these have had serious consequences?
> > > > > Hard to say, are you working on anything that could have serious consequences?
> > > > >
> > > > > At a guess, an ECC capable DIMM costs 5-9% more to get to an end user than one that is
> > > > > without ECC. I'm willing to pay more than that for such a "fancy" feature, but oh-no, it's
> > > > > nearly impossible to source un-buffered ECC ram at reasonable speeds and/or prices.
> > > > >
> > > > > This in-line ECC appears to be a colossal kludge, presented as
> > > > > a feature, to solve a problem that never should have existed.
> > > > >
> > > > > > https://www.freepatentsonline.com/y2019/0332469.html
> > > > > >
> > > > > > which should have been rejected, because it describes good methods to implement
> > > > > > In-Band ECC, but which are completely obvious and are exactly like anyone would
> > > > > > implement it if given the task, without any other prior knowledge.
> > > > >
> > > > >
> > > >
> > > > From the startup log on an M1 boot...
> > > > 10.281418 AppleFireStormErrorHandler AppleARM64ErrorHandler: will not panic on correctible ECC errors
> > > >
> > >
> > > That is nice not to panic on correctible ECC error (instead
> > > of panic-ing, just correct the error and log the address).
> > > Obviously a correctible ECC error on a protected memory area (i.e. even the OS
> > > cannot read) would need to panic if the correction is not done in hardware.
> > > Now the question is also, do you only correct the read value (risking un-correctible error
> > > if another bit error appears on that address), or do you write-back the correction?
> > > Next step is not to panic on uncorrectible ECC error, just reload if possible,
> > > or kill only the task affected (or only the virtual machine affected).
> >
> > It all remains unclear.
> > One possibility is that this refers PURELY to ECC in caches (so not especially interesting).
> >
> > Another is that there's the possibility for in-line ECC but this is not yet hooked up?
> >
> > Another is that there is genuine in-line ECC, working exactly as you would
> > hope (and presumably with RAS functionality being added over time, eg even
> > a non-correctable error in a block of memory that also exists on disk).
> >
> > That's all even apart from the issue of how the OS intervenes. Hopefully
> > over the next year or so people will figure out more details.
>
> LPDDR doesn't have ECC, so I am skeptical that Apple uses ECC memory.
>
> I think your interpretation that it's ECC on caches is more likely to be correct.
>
> I don't think Apple has the volume to develop non-standard LPDDR memory interfaces
> and modules. But I could be wrong, and it would be awesome if they did.
>
> David

As I keep trying to remind you, with plentiful transistors many clever possibilities become available...
Even with traditional 8-bit DRAM, at least two options present themselves:

(a) Store the ECC in a reserved section of the DRAM. Sure, this creates bandwidth conflicts, but an ECC cache in the memory controller should limit the damage.
This is the obvious solution, and likely what Intel is doing.
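
To make (a) concrete, here is a rough sketch of how a controller might map a data line to its ECC bytes in a reserved region at the top of DRAM. The line size, the two ECC bytes per line, and all the function names are my own assumptions for illustration, not anything Intel has documented.

```python
LINE_BYTES = 64          # assumed data line size
ECC_BYTES_PER_LINE = 2   # assumed ECC overhead per line (hypothetical)

def layout(dram_bytes):
    """Split DRAM into a data region and a reserved ECC region above it.

    Returns (usable_data_bytes, ecc_base_address).
    """
    slot = LINE_BYTES + ECC_BYTES_PER_LINE
    n_lines = dram_bytes // slot
    data_bytes = n_lines * LINE_BYTES
    return data_bytes, data_bytes

def ecc_address(data_addr, ecc_base):
    """Address of the ECC bytes that protect the line containing data_addr."""
    line_index = data_addr // LINE_BYTES
    return ecc_base + line_index * ECC_BYTES_PER_LINE

def ecc_line(data_addr, ecc_base):
    """DRAM line holding that ECC; this is what an ECC cache would tag."""
    return ecc_address(data_addr, ecc_base) // LINE_BYTES * LINE_BYTES
```

With these numbers, 32 consecutive data lines share a single ECC line, which is why a small ECC cache in the controller can absorb most of the extra traffic for sequential accesses.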

(b) Use memory compression. For example, compress each 128-byte line down to ~63 bytes + an ECC byte, and store that in RAM. Qualcomm implemented this exact scenario on Falkor. Obviously, exactly as I've described it, this gives you probabilistic ECC: some lines covered, others not. If you're willing to also implement a less aggressive compression, you can probably fit most of the remaining lines into 127 (or 126) bytes + an ECC byte or two.
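
A toy sketch of the decision in (b), to show why the coverage is only probabilistic. zlib is purely a stand-in here (real hardware would use a much simpler, fixed-latency compressor), and the tier names and the `classify`/`coverage` helpers are hypothetical.

```python
import zlib

LINE = 128  # bytes per memory line, as in the example above

def classify(line: bytes) -> str:
    """Decide which storage tier a 128-byte line would fit into."""
    if len(line) != LINE:
        raise ValueError("expected a 128-byte line")
    packed = len(zlib.compress(line, 9))  # stand-in compressor (assumption)
    if packed <= 63:
        return "half"  # 63 data bytes + 1 ECC byte in a 64-byte slot
    if packed <= 126:
        return "full"  # 126 data bytes + 2 ECC bytes in the 128-byte slot
    return "raw"       # incompressible: stored as-is, no ECC coverage

def coverage(lines) -> float:
    """Fraction of lines that end up with some ECC protection."""
    lines = list(lines)
    covered = sum(1 for l in lines if classify(l) != "raw")
    return covered / len(lines) if lines else 0.0
```

An all-zero line lands in the "half" tier, while a line of random bytes will generally be stored raw and unprotected. How much of memory ends up covered depends entirely on the workload's data.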

At which point, sure, it's not perfect, only probabilistic, and you're not going to sell it as a z/ replacement. But it does give your home system a nice little RAS boost, plus, hopefully some early warning that a DRAM chip is going bad.
(Admittedly, with M1, your options as to what to do about this are limited.
Certainly you'll be able to get a replacement if you're within warranty, which is nice; maybe Apple will also provide a rough chipkill that has the OS simply avoid the bad chip in future? If the machine is 5 years old and retired to non-frontline duty, that is, what the heck, probably good enough for many purposes, though I fully expect plenty of people to complain...)

I honestly don't know where Apple stands on the range of possibilities from "ECC is purely a feature of our on-SoC caches" to "yeah, it's on our radar, one day we'll hook it up to the SoC hardware" to "we have working memory compression+ECC right now, bitches; you just haven't noticed yet".