By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), August 10, 2019 3:15 pm
Room: Moderated Discussions
anonymou5 (no.delete@this.spam.com) on August 10, 2019 2:52 pm wrote:
> > Zen 2 with the rdrand bug
>
> Is Family 17h actually affected?
Yes. It might be (slightly) different from an earlier AMD rdrand issue, but yes, people definitely had problems on Zen 2. There's a microcode fix - it was apparently something stupid (again) where the random number generation wasn't properly initialized, and presumably the fix was simple.
See for example
heise.de Ryzen 3000 rdrand article
which is in German, but google translate makes good sense of it.
Basically, the broken AMD 'rdrand' returns a value of all-ones (which is not really all that random), but still claims that it's all working fine (by setting CF to 1). It's a serious functionality bug, and the fact that something like it has now happened at least twice is pretty sad.
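To make that failure mode concrete, here is a minimal sketch of the kind of sanity check a userspace tool could run before trusting a hardware RNG. Everything named here (`rng_source_fn`, `rng_source_looks_broken`, and the two stand-in sources) is hypothetical and made up for illustration; it is not what the kernel, openssl or systemd actually do. A real source would wrap something like the `_rdrand64_step` intrinsic from `<immintrin.h>`, but the stand-ins keep the sketch runnable on any machine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical source type: stores one 64-bit value in *out and returns
 * true when it claims success (the equivalent of rdrand setting CF=1). */
typedef bool (*rng_source_fn)(uint64_t *out);

/* Crude check: the source is treated as broken if it reports failure, or
 * if every one of `samples` draws returns the same value. Fewer than two
 * samples cannot be judged, so that case is also reported as broken. */
static bool rng_source_looks_broken(rng_source_fn source, size_t samples)
{
    uint64_t first = 0;

    if (samples < 2)
        return true;

    for (size_t i = 0; i < samples; i++) {
        uint64_t v;

        if (!source(&v))
            return true;        /* source admitted failure */
        if (i == 0)
            first = v;
        else if (v != first)
            return false;       /* two different values seen */
    }
    return true;                /* every draw was identical */
}

/* Stand-in for the buggy behaviour described above: always all-ones,
 * always claims success. */
static bool stuck_source(uint64_t *out)
{
    *out = UINT64_MAX;
    return true;
}

/* Stand-in for a source that at least varies (a counter, so NOT random). */
static bool counter_source(uint64_t *out)
{
    static uint64_t next = 1;

    *out = next++;
    return true;
}
```

Calling `rng_source_looks_broken(stuck_source, 8)` flags the all-ones source, while the counter passes. That also shows the limit of this check: it catches a stuck generator, not a merely predictable one.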
Note that afaik there's no kernel bugzilla entry for the Zen 2 rdrand issue - it didn't affect the kernel. We do use rdrand, but not generally in a long loop, and we don't trust the result implicitly. It gets used to initialize entropy data, and a completely broken rdrand does possibly weaken random data generation, but since we also have other sources of entropy, you're generally going to have a hard time really noticing a broken rdrand for the kernel (famous last words).
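As a toy illustration of why mixing with other entropy sources limits the damage, consider XOR-ing the hardware value into an existing pool word. This is an assumption-laden sketch, not the kernel's code: the real mixing is considerably more elaborate, and `mix_into_pool` is a hypothetical name.

```c
#include <stdint.h>

/* Toy mixer, NOT the kernel's implementation: fold a hardware value into
 * a pool word with XOR. For any fixed hw_value this maps distinct pool
 * values to distinct results, so a constant input cannot erase whatever
 * unpredictability the pool already had. */
static uint64_t mix_into_pool(uint64_t pool, uint64_t hw_value)
{
    return pool ^ hw_value;
}
```

With a stuck all-ones rdrand the result is just the bitwise complement of the pool: nothing is gained from the hardware, but nothing already collected from the other sources is lost either.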
Don't use broken CPUs and trust the resulting keys regardless. The kernel is being pretty careful, but not all tools necessarily are. The old AMD rdrand issue hit openssl, and the Zen 2 one hit at least systemd.
It really doesn't seem that bad in general - just very embarrassing, and showing that there was a decided lack of test coverage for rdrand inside of AMD.
If that's the only problem Zen 2 has, it would be a very good sign for AMD. But it's all new enough that we obviously don't have a ton of independent test coverage yet.
Linus