memory errors

By: rwessel (rwessel.delete@this.yahoo.com), March 5, 2021 12:23 pm
Room: Moderated Discussions
dmcq (dmcq.delete@this.fano.co.uk) on March 5, 2021 12:01 pm wrote:
> Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr) on March 5, 2021 6:23 am wrote:
> > Carson (carson.delete@this.example.edu) on March 5, 2021 2:31 am wrote:
> > > Etienne Lorrain (etienne_lorrain.delete@this.yahoo.fr) on March 4, 2021 6:58 am wrote:
> > > > You cannot "only initialising a page when it was allocated" because the BIOS does not allocate
> > > > memory, it just tells you how much memory is available, and the concept of a page is not known
> > > > (no virtual memory at that point). Moreover the page size is decided by the OS; processors support
> > > > different sizes. Moreover memory over 1 Megabyte was special, either EMS or XMS or HMA.
> > > > And you could not modify DOS, proprietary software with no source.
> > > > The real missing thing was (and still partly is) a proper DMA, able to send more than 64 Kbytes
> > > > and able to access more than 1 Megabyte / 16 Mbytes with chipset recognition and NDA docs.
> > >
> > > These all seem like non-issues.
> > >
> > > For OSes which cannot handle uninitialized ECC, have a forward-compatibility flag somewhere
> > > in the boot loader which means "I can handle uninitialized ECC". If it's not present, BIOS
> > > clears memory before transferring to the boot loader. A chaining boot loader (LILO or whatever)
> > > is required to do the same check on OSes it chain-loads. (By calling back into the BIOS which
> > > has all the necessary code, so the chain loader doesn't suffer much code bloat.)
> > >
> > > For loaders which do support uninitialized ECC, a simple BIOS data structure (like the 0xE820 memory
> > > map) describes the initialized parts, and there are BIOS calls to extend the initialized parts.
> > >
> >
> > You make me feel old, that is bad on Friday -:)
> > BIOS was developed at a time where Linux was a sin, you would be made immediately redundant
> > if you installed it on a PC at work (because there was no anti-virus on Linux).
> > Because DOS did not manage ECC at all, and Windows 3.1/95 started from
> > DOS, the BIOS had to initialise all the memory it knew about.
> > DOS did not know about EMS or XMS memory, nothing more than 1 Mbyte.
> > EMS could be provided by an ISA card (i.e. something like PCI), and video/network card
> > BIOS would be on the ISA/PCI card itself, their BIOS did not manage ECC either.
> >
> > > The loader asks the BIOS to initialize enough to hold whatever it's chain-loading,
> > > and that loader does the same for its own bss and initial stack.
> > >
> > > Then the OS's early boot probes the hardware to find out
> > > if it's capable of talking to the ECC hardware without
> > > BIOS support. If not, before the BIOS is fully disabled, fall back to asking it to initialize everything.
> > >
> > > With all of those fallbacks implemented, the hopefully common case is that the OS does know how to drive
> > > the ECC hardware and it initializes its own memory allocator with all unallocated memory flagged "ECC bad".
> > > The first time that memory is allocated, it is initialized (which may be as simple as CLZERO).
> > >
> > > Since the OS is doing all of this, it understands the page size and all the necessary rules. The important
> > > point is that the initialization is done lazily, after applications have started running.
> > >
> > > A more sophisticated OS might extend the idea of "potentially
> > > uninitialized page" beyond the memory allocator
> > > proper, and things like disk DMA which are going to overwrite
> > > the whole page could elide the initialization.
> > > Since this overlaps heavily with the zeroing required for security, it's not actually a major project.
> > >
> > > This all seems like a pretty straightforward SMOP.
> >
> > Nowadays things have changed, the EFI system is a lot more complicated, its whole behaviour/specification
> > is under NDA (Non Disclosure Agreement), and secure boot will stop you implementing anything
> > which is not approved by the manufacturer - or not supported by Windows 10.
> > So basically none of your sophisticated ideas can be implemented; if ECC has to be supported
> > it has to be done by the EFI BIOS (Windows 10 does not want to hear about ECC), and you get
> > what Linus was complaining about: a machine-check exception with no way to know even the
> > processor which triggered the ECC error or the address which caused the problem.
> > That is what people call progress: a nice background image telling you how intelligent you
> > are by having bought this top of the range brand of PC, during the whole Windows boot time.
>
> Linux only started being a thing at the same time as Windows came out and anti-virus wasn't at all common
> before then either. There was absolutely no problem about having a BIOS that supported ECC or parity checks
> for the first megabyte and a himem controller setting a standard for anything more. It sounds to me like
> you think Microsoft DOS and Windows were partly to blame by not providing support even for parity checks.
> That seems unlikely to me since the early IBM PCs and lots of others did support parity checks.


At least early versions of DOS did nothing with the parity check. The BIOS did set up an NMI handler to display a message ("PARITY CHECK x") and end with a CLI/HLT. It wouldn't surprise me if later versions of DOS installed their own handler.
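
For what it's worth, the lazy scheme Carson describes in the quoted text (unallocated memory flagged "ECC bad", initialized on first allocation, with the zeroing skipped when a whole-page DMA is about to overwrite the page anyway) can be modelled in a few lines. This is only a toy sketch in Python, not anything a real BIOS or kernel does: the names `LazyEccAllocator`, `UninitializedEccError`, `will_overwrite` and `dma_fill` are all made up for illustration, and a page whose ECC bits were never written is simply represented as `None`.

```python
class UninitializedEccError(Exception):
    """Stands in for the machine check a read of never-written ECC memory could raise."""


class LazyEccAllocator:
    """Toy model: every page starts 'ECC bad' and is zeroed on first allocation."""

    def __init__(self, num_pages, page_size=4096):
        self.page_size = page_size
        # None models a page whose ECC check bits were never initialized.
        self.pages = [None] * num_pages
        self.free = list(range(num_pages))
        self.zeroed_count = 0  # how many pages we actually had to clear

    def alloc(self, will_overwrite=False):
        """Hand out a page, zeroing it unless the caller promises a full overwrite."""
        page = self.free.pop()
        if self.pages[page] is None and not will_overwrite:
            self.pages[page] = bytearray(self.page_size)
            self.zeroed_count += 1
        return page

    def dma_fill(self, page, data):
        """Model a whole-page DMA write, which also makes the ECC bits valid."""
        if len(data) != self.page_size:
            raise ValueError("DMA must cover the whole page")
        self.pages[page] = bytearray(data)

    def read(self, page):
        """Reading a page that was never written is the error case."""
        if self.pages[page] is None:
            raise UninitializedEccError(f"page {page} has no valid ECC")
        return bytes(self.pages[page])
```

An ordinary `alloc()` pays for the zeroing once, at allocation time rather than at boot, while `alloc(will_overwrite=True)` followed by `dma_fill()` never zeroes at all. The model leaves out freeing pages and the security zeroing a real OS would still need on reuse.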