By: Terry Gray (cuyahogan.delete@this.aol.com), January 7, 2021 9:47 am
Room: Moderated Discussions
Jörn Engel (joern.delete@this.purestorage.com) on January 7, 2021 9:05 am wrote:
> Emanuel Rylke (ema.delete@this.mailbox.org) on January 7, 2021 12:49 am wrote:
> >
> > What about doing it not as a workaround for broken hardware
> > but to make it more easy to show that the hardware
> > is broken? In theory I know that I'm probably getting bit
> > errors and that's bad(TM) but if a cat /proc/bit_errors
> > showed me that I got at least 5 since boot I would be much more motivated to do something about it.
>
> Unrealistic for writable pages. Doable for read-only. You need a bit of shadow memory to store
> the checksums and some fast hash function. Assuming you cannot use vector instructions, performance
> would be 16 bytes per cycle or 256 cycles per page. You should calculate hashes when pages turn
> read-only, again before they become writable and maybe periodically in between.
>
> Do you care enough to write a patch?
Back in the 1960s, Oregon State University had a CDC 3300 (a 24-bit computer).
Some of the other students I shared an office with wrote an operating system for it called OS3
(Oregon State Open Shop Operating System).
It had parity memory, and to recover from errors in program code each sector carried the exclusive OR
of its contents as the last word of the sector. When a parity error occurred, the word in error was known, so they could calculate what that word should have been. So this idea is not new. It is interesting, though, that I have never heard of it being used anywhere else (although it may have been).
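For anyone who wants to see the arithmetic, here is a rough sketch of the recovery step as I understand it. It is not the OS3 code; the function names, the sector contents, and the use of Python are all made up for illustration. The only things taken from the description above are the 24-bit word size, the XOR of the contents stored as the last word, and the fact that parity tells you which word is bad.

```python
from functools import reduce
from operator import xor

WORD_MASK = 0xFFFFFF  # 24-bit words, as on the CDC 3300


def seal_sector(data_words):
    """Return the sector with the XOR of its contents appended as the last word.

    Hypothetical helper: the real sector layout is not described in the post.
    """
    checksum = reduce(xor, data_words, 0) & WORD_MASK
    return list(data_words) + [checksum]


def recover_word(sector, bad_index):
    """Rebuild the word at bad_index by XOR-ing every other word in the sector.

    Parity only says which word is bad; the XOR word supplies what it was.
    """
    others = (w for i, w in enumerate(sector) if i != bad_index)
    return reduce(xor, others, 0) & WORD_MASK


# Illustrative data only: three program words plus the XOR word.
sector = seal_sector([0o12345670, 0o00000017, 0o77777777])
original = sector[1]
sector[1] ^= 0o00004000            # simulate a single flipped bit
assert recover_word(sector, 1) == original
```

This only works because the hardware parity identifies the failing word. With two bad words in the same sector, or with no indication of which word failed, a single XOR word can detect a problem but cannot repair it.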
Terry