By: Gabriele Svelto (gabriele.svelto.delete@this.gmail.com), January 4, 2021 1:29 am
Room: Moderated Discussions
Dan Strother (dan.strother.delete@this.gmail.com) on January 3, 2021 7:00 pm wrote:
> Note that even if the CPU supports ECC error injection, the
> BIOS may disable it. See this post for some context:
> https://hardwarecanucks.com/forum/threads/ecc-memory-amds-ryzen-a-deep-dive-comment-thread.75041/page-6#post-902700
>
> In that post, the poster is trying to use MemTest86 Pro (the paid PassMark version, not the free one)
> to inject errors on their Ryzen 3000 system with an ASRock Rack motherboard. They had to change the "Disable
> Memory Error Injection" option in the motherboard's BIOS to enable injection. Unfortunately, they weren't
> able to confirm that it was actually working - errors appeared to be injected, but then went unreported
> (even worse, ASRock Rack support then claimed that ECC reporting wasn't supported at all!).
That's an interesting post! They claim that AMD's official response is that "AM4 does not support ECC error reporting function", which is exactly the opposite of what I saw on my machine with both Ryzen 2xxx and 3xxx processors. Errors are reported correctly via machine check exceptions, so neither the motherboard nor the BIOS should be able to interfere with that once early memory setup is finished and ECC functionality is enabled.
Maybe that means that AM4 motherboards don't support reporting errors in the BIOS or over IPMI?
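For anyone who wants to check whether errors reach the OS on their own box, here is a rough sketch. It assumes Linux with an EDAC driver loaded (not something discussed in the linked thread) and reads the per-memory-controller corrected/uncorrected counters that EDAC exposes in sysfs. The function name `read_edac_counts` is just made up for illustration, and the default path is the usual EDAC location but may not exist on every system.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    Assumption: Linux with an EDAC driver loaded. If the directory is
    missing (no driver, or ECC not enabled), the result is an empty dict.
    """
    counts = {}
    # Each memory controller shows up as mc0, mc1, ... under the EDAC root.
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                # ce_count = corrected errors, ue_count = uncorrected errors
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                # File missing or unreadable: record it as unknown.
                entry[name] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    result = read_edac_counts()
    if not result:
        print("No EDAC memory controllers found")
    for mc, c in result.items():
        print(mc, c)
```

If the counters stay at zero after an injection attempt, that alone doesn't tell you whether the injection or the reporting path failed, so treat it only as a first check.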