memory errors

By: dmcq (dmcq.delete@this.fano.co.uk), March 5, 2021 12:51 pm
Room: Moderated Discussions
rwessel (rwessel.delete@this.yahoo.com) on March 5, 2021 12:23 pm wrote:
> dmcq (dmcq.delete@this.fano.co.uk) on March 5, 2021 12:01 pm wrote:
> > Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr) on March 5, 2021 6:23 am wrote:
> > > Carson (carson.delete@this.example.edu) on March 5, 2021 2:31 am wrote:
> > > > Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr) on March 4, 2021 6:58 am wrote:
> > > > > You cannot be "only initialising a page when it was allocated" because the BIOS does not allocate
> > > > > memory, it just tells you how much memory is available, and the concept of a page is not known
> > > > > (no virtual memory at that point). Moreover the page size is decided by the OS, and processors support
> > > > > different sizes. Moreover memory over 1 Megabyte was special, either EMS or XMS or HMA.
> > > > > And you could not modify DOS, proprietary software with no source.
> > > > > The real missing thing was (and still partly is) a proper DMA, able to send more than 64 Kbytes
> > > > > and able to access more than 1 Megabyte / 16 Mbytes with chipset recognition and NDA docs.
> > > >
> > > > These all seem like non-issues.
> > > >
> > > > For OSes which cannot handle uninitialized ECC, have a forward-compatibility flag somewhere
> > > > in the boot loader which means "I can handle uninitialized ECC". If it's not present, BIOS
> > > > clears memory before transferring to the boot loader. A chaining boot loader (LILO or whatever)
> > > > is required to do the same check on OSes it chain-loads. (By calling back into the BIOS which
> > > > has all the necessary code, so the chain loader doesn't suffer much code bloat.)
> > > >
> > > > For loaders which do support uninitialized ECC, a simple BIOS data structure (like the 0xE820 memory
> > > > map) describes the initialized parts, and there are BIOS calls to extend the initialized parts.
> > > >
> > >
> > > You make me feel old, that is bad on Friday -:)
> > > BIOS was developed at a time where Linux was a sin, you would be made immediately redundant
> > > if you installed it on a PC at work (because there was no anti-virus on Linux).
> > > Because DOS did not manage ECC at all, and Windows 3.1/95 started from
> > > DOS, the BIOS had to initialise all the memory it knew about.
> > > DOS did not know about EMS or XMS memory, nothing more than 1 Mbyte.
> > > EMS could be provided by an ISA card (i.e. something like PCI), and video/network card
> > > BIOS would be on the ISA/PCI card itself, their BIOS did not manage ECC either.
> > >
> > > > The loader asks the BIOS to initialize enough to hold whatever it's chain-loading,
> > > > and that loader does the same for its own bss and initial stack.
> > > >
> > > > Then the OS's early boot probes the hardware to find out
> > > > if it's capable of talking to the ECC hardware without
> > > > BIOS support. If not, before the BIOS is fully disabled, fall back to asking it to initialize everything.
> > > >
> > > > With all of those fallbacks implemented, the hopefully common case is that the OS does know how to drive
> > > > the ECC hardware and it initializes its own memory allocator with all unallocated memory flagged "ECC bad".
> > > > The first time that memory is allocated, it is initialized (which may be as simple as CLZERO).
> > > >
> > > > Since the OS is doing all of this, it understands the page size and all the necessary rules. The important
> > > > point is that the initialization is done lazily, after applications have started running.
> > > >
> > > > A more sophisticated OS might extend the idea of "potentially
> > > > uninitialized page" beyond the memory allocator
> > > > proper, and things like disk DMA which are going to overwrite
> > > > the whole page could elide the initialization.
> > > > Since this overlaps heavily with the zeroing required for security, it's not actually a major project.
> > > >
> > > > This all seems like a pretty straightforward SMOP.
> > >
> > > Nowadays things have changed, the EFI system is a lot more complicated, its whole behaviour/specification
> > > is under NDA (Non Disclosure Agreement), and secure boot will stop you implementing anything
> > > which is not approved by the manufacturer - or not supported by Windows 10.
> > > So basically none of your sophisticated ideas can be implemented, if ECC has to be supported
> > > it has to be done by the EFI BIOS (Windows 10 does not want to hear about ECC), and you get
> > > what Linus was complaining about: a machine-check exception with no way to know even the
> > > processor which triggered the ECC error or the address which caused the problem.
> > > That is what people call progress: a nice background image telling you how intelligent you
> > > are by having bought this top of the range brand of PC, during the whole Windows boot time.
> >
> > Linux only started being a thing at the same time as Windows came out and anti-virus wasn't at all common
> > before then either. There was absolutely no problem about having a BIOS that supported ECC or parity checks
> > for the first megabyte and a himem controller setting a standard for anything more. It sounds to me like
> > you think Microsoft DOS and Windows were partly to blame by not providing support even for parity checks.
> > That seems unlikely to me since the early IBM PCs and lots of others did support parity checks.
>
>
> At least early versions of DOS did nothing with the parity check. The BIOS did
> set up an NMI handler to display a message ("PARITY CHECK x") and end with a CLI/HLT.
> It wouldn't surprise me if later versions of DOS did their own handler.

Far better, I think, than proceeding with errors. One could then clean all the contacts, make sure everything was seated properly, run a memory check, and possibly replace the memory. Yes, a lot of the time the error would have been invisible, but I'd prefer having work saved away periodically and redoing a bit of it rather than accumulating strange errors.
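
To make the parity-check-then-halt behaviour concrete, here is a rough Python sketch. It is not the BIOS's actual code. It assumes one odd-parity bit stored alongside each byte, and the names `ParityMemory`, `flip_bit` and `ParityError` are made up for illustration; the exception stands in for the NMI handler printing its message and stopping the machine, and the address in the message is illustrative only.

```python
class ParityError(Exception):
    """Stands in for the NMI handler's message followed by CLI/HLT."""


def parity_bit(byte: int) -> int:
    # Odd parity (assumed): choose the extra bit so that the nine bits
    # (eight data bits plus parity) contain an odd number of ones.
    return (bin(byte & 0xFF).count("1") + 1) % 2


class ParityMemory:
    """Toy memory with one parity bit per byte (hypothetical model)."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.parity = [parity_bit(0)] * size

    def write(self, addr: int, value: int) -> None:
        # Every write stores the data byte and its freshly computed parity.
        self.data[addr] = value & 0xFF
        self.parity[addr] = parity_bit(value)

    def read(self, addr: int) -> int:
        # Every read recomputes parity and refuses to return bad data.
        value = self.data[addr]
        if parity_bit(value) != self.parity[addr]:
            raise ParityError(f"PARITY CHECK at address {addr:#x}")
        return value

    def flip_bit(self, addr: int, bit: int) -> None:
        # Simulate a hardware fault: corrupt the data without updating parity.
        self.data[addr] ^= 1 << bit


if __name__ == "__main__":
    mem = ParityMemory(16)
    mem.write(3, 0x5A)
    mem.flip_bit(3, 0)
    try:
        mem.read(3)
    except ParityError as err:
        print(err, "- halting rather than carrying on with bad data")
```

Note that a single parity bit only catches an odd number of flipped bits in a byte; two flips in the same byte pass the check, which is one reason ECC is the stronger protection.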