By: Adrian (a.delete@this.acm.org), January 1, 2021 1:28 pm
Room: Moderated Discussions
Jukka Larja (roskakori2006.delete@this.gmail.com) on January 1, 2021 10:43 am wrote:
> Gabriele Svelto (gabriele.svelto.delete@this.gmail.com) on January 1, 2021 7:10 am wrote:
> > me (me.delete@this.me.com) on December 31, 2020 4:56 pm wrote:
> > > > AMD has their actual server CPU line too, and you do pay more for that privilege, but at least
> > > > AMD doesn't try to screw you over and limit their non-server parts. So you do get ECC for Threadripper
> > > > (and plain Ryzen) too, even if it's not necessarily "officially verified".
> > > >
> > >
> > > You would think that for people who want/need ECC, they
> > > are going to want CPUs that are officially verified.
> >
> > What does "officially" mean in this context? All non-APU
> > Ryzen CPUs support ECC if the motherboards have the
> > necessary traces and UEFI support. Motherboard vendors advertise this support quite clearly in the specs.
>
> Trying to google about how well the unofficial support works, I get lot of hits about people saying that
> yes, it works, without any proof. I don't see people with a test DIMMs known to produce single bit errors
> making sure the unofficial support works, or making sure it works in every CPU or at least gives some easy
> to see error somewhere if it doesn't (I'm sure someone somewhere has tested something, but it gets lost
> in the noise. Anecdotes are only useful if there's enough of them to be statistically significant).
>
> I really like what AMD is doing with CPUs, but unofficial ECC support just
> annoys me. It's supposed to give me peace of mind and eliminate one source
> of random problems. "Unofficial" really doesn't work great with that goal.
>
> -JLarja
I am also annoyed by the unofficial ECC support, but Intel CPUs have become so weak in comparison that using them instead is an even worse alternative.
Most people have tested this by overclocking the memory, because that is the easiest method in the absence of special hardware and of better documentation from AMD about the test features that probably exist in their memory controller. You raise the memory speed in the BIOS settings (if the MB provides this option, which most MBs for AMD do) and verify that the resulting memory errors are reported. Obviously, this is better done while booting from a test USB stick, without mounting any valuable storage that might be corrupted if the errors are excessive.
I have also done this overclocking test once in the past, for a Ryzen 7 3700X on an ASUS Pro WS X570-ACE workstation motherboard, and it worked as expected.
Now I have just replaced the 3700X with a 5900X, and ECC seems to work OK starting with Linux kernel 5.10.
However, I have not yet repeated the memory overclocking test with the new CPU to see whether the errors are really reported, but I intend to do so when I have some spare time.
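For the "verify that the errors are reported" step on Linux, one way to look at the counters is through the kernel's EDAC sysfs files. The following is only a minimal sketch, assuming the EDAC driver for the platform is loaded and exposes its controllers under /sys/devices/system/edac/mc; the function names read_edac_counts and summarize are made up for this example. If no controller shows up at all, that is itself a hint that ECC reporting is probably not active.

```python
from pathlib import Path

# Usual location of the Linux EDAC memory-controller counters (assumed here).
EDAC_MC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_MC_ROOT):
    """Return {controller: (corrected, uncorrected)} read from EDAC sysfs."""
    counts = {}
    root = Path(root)
    if not root.is_dir():
        return counts  # no EDAC directory: driver not loaded or not Linux
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = (ce, ue)
    return counts


def summarize(counts):
    """Format the counters as one line of text per memory controller."""
    if not counts:
        return "no EDAC memory controllers found"
    return "\n".join(
        f"{name}: corrected={ce} uncorrected={ue}"
        for name, (ce, ue) in counts.items()
    )


if __name__ == "__main__":
    print(summarize(read_edac_counts()))
```

Run it once before the overclocking test and again afterwards; a corrected-error count that increases while the memory is pushed past its stable speed suggests that the errors are being detected and reported. Counters that stay at zero prove nothing by themselves, since the memory may simply not have produced any errors.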