By: Michael S (already5chosen.delete@this.yahoo.com), March 4, 2021 12:36 pm
Room: Moderated Discussions
Doug S (foo.delete@this.bar.bar) on March 4, 2021 10:48 am wrote:
> Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr) on March 4, 2021 6:26 am wrote:
> > dmcq (dmcq.delete@this.fano.co.uk) on March 4, 2021 5:16 am wrote:
> > > ...
> > >
> > > I think the important thing is error detection - not recovery. Error recovery at a low level is nice to
> > > have but if the whole business can be fixed at a higher level and the error rate is low enough it is not
> > > really necessary. Intel leaving out ECC was dreadful, the
> > > thing that I think was really criminal and cretinous
> > > though was cutting out even parity checking. I see it as a cheap trick to obscure errors so people just
> > > blamed gremlins and pressed ctrl-alt-delete rather than fixing underlying problems. Of course some memory
> > > problems would escape that but it would catch memory that is failing and it would give an indication of
> > > how reliable it is overall.
> >
> > Historically, I think ECC error detection was removed approximately at the time it took too much time to
> > initialise the memory. At power-up, the parity bit is not
> > initialised: if you dump the DDR before initialisation
> > you get mostly zero bits but you will also get bits set (I do not know why the capacitor is still charged
> > at power-up). If you do a quick power-cycle, it is obvious you will still have bits set.
> > When the memory of the PC increased to few tens of megabytes, the CPU (at that time) was
> > not able to clear that DRAM (so initialise the ECC) in less than 10 seconds, and the PC never
> > had a powerful DMA to do such work. To cut boot time, they removed the parity bit.
>
>
> That would have been easy enough to fix in the next generation of memory (i.e. when they went
> from FBDIMM to SDRAM or whatever) by adding a reset pin that would be triggered on power on
> that would cause the entire chip to be zeroed without the CPU having to take part. All the
> chips could do it in parallel, so whatever the worst case time (which could be defined in the
> standard as a requirement) was would apply regardless of how much memory was installed.
It's not that simple.
There are trade-offs between speed of zeroing and consumed current, and they are quite different in a system with 2 memory ranks vs. one with, for example, 16 ranks.
An individual memory chip does not see the whole picture, so it has no chance to make an optimal decision.
A host-controlled "zeroize row" command looks like a better idea.
Isn't something like that already available in GDDR SDRAM?
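To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a hypothetical host-issued "zeroize row" command, and every number in it (row count, time per row, current per row clear, current budget) is made up for illustration, not taken from any datasheet or standard. The only point is that a host that knows the rank count and the power budget can choose how many ranks clear in parallel, which a single chip cannot do.

```python
from math import ceil

def zeroize_time_s(ranks, rows_per_rank, t_row_ns, i_row_ma, i_budget_ma):
    """Estimate the time to clear all rows in all ranks.

    Simplified model (an assumption, not real DRAM behaviour): each rank
    clears one row at a time, each row clear draws i_row_ma, and the host
    never lets the total draw exceed i_budget_ma.
    """
    # How many ranks may clear a row at the same time within the budget.
    parallel = max(1, min(ranks, i_budget_ma // i_row_ma))
    # Ranks are processed in waves of `parallel` ranks each.
    waves = ceil(ranks / parallel)
    return waves * rows_per_rank * t_row_ns * 1e-9

# Hypothetical parameters, chosen only to show the scaling.
ROWS = 1_048_576      # rows per rank
T_ROW_NS = 50         # time to clear one row
I_ROW_MA = 200        # extra current per concurrent row clear
I_BUDGET_MA = 800     # current the system can spare for zeroing

for ranks in (2, 16):
    t = zeroize_time_s(ranks, ROWS, T_ROW_NS, I_ROW_MA, I_BUDGET_MA)
    print(f"{ranks} ranks: {t:.4f} s")
```

With these made-up numbers the 2-rank system clears both ranks at once, while the 16-rank system has to work in four waves to stay within the same current budget. A fixed per-chip policy triggered by a reset pin would either be too slow for the small system or draw too much current in the large one.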