By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), August 11, 2019 9:43 am
Room: Moderated Discussions
Gionatan Danti (g.danti.delete@this.assyoma.it) on August 11, 2019 1:35 am wrote:
>
> I was under the impression that kernel can be configured to trust RDRAND: https://lwn.net/Articles/760584/
>
> Am I missing something?
Yes and no.
Even when we "trust" the rdrand instruction, that just means that we trust it to help fill the entropy pool sufficiently that we don't need any other sources of entropy.
It doesn't mean that we take the output of rdrand at face value, and it doesn't mean that we stop using other sources of entropy. We'll still mix in things like interrupt timing data, and we'll still mix in other system-specific sources of data into the initial entropy pool.
So on the affected AMD machines, the kernel will basically over-estimate the amount of "true randomness" entropy we have, but the numbers you get from /dev/random will still be practically random on a PC.
Note that "on a PC" part. There are problem cases where it's actually hard to find practically any sources of entropy at boot time. We've had situations where you have embedded devices that don't even have a real-time clock, and that don't have cycle counters, and where the initial memory is all identical over millions of identical devices.
What happens then? Because of the lack of even a cycle counter and because the machines are identical (embedded market is like that), you don't even really have any source of timing jitter. So every single entropy pool starts out identical (or nearly so).
And then some of those embedded vendors noticed the problem and said "Ok, we can't use /dev/random, because that one will block until we get enough entropy, so we'll use /dev/urandom instead". Which is the one that is explicitly not for generating safe keys, exactly because it won't wait for enough system noise to happen.
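To make the "wait for enough entropy" versus "don't wait" distinction concrete, here is a minimal sketch in Python. It assumes a Linux system where the `getrandom()` system call is available through `os.getrandom`, which is not something the post itself mentions. The function name `generate_host_key` and the 32-byte size are made up for illustration. The idea is to fail loudly instead of silently producing key material from a pool that has not been initialized yet.

```python
import os


def generate_host_key(nbytes: int = 32) -> bytes:
    """Return key material, refusing to proceed before the pool is initialized.

    Hypothetical helper for illustration only; not taken from any real tool.
    """
    if hasattr(os, "getrandom"):
        try:
            # GRND_NONBLOCK makes the call fail with BlockingIOError instead
            # of waiting when the kernel pool has not been initialized yet.
            return os.getrandom(nbytes, os.GRND_NONBLOCK)
        except BlockingIOError:
            raise RuntimeError("entropy pool not initialized yet; retry later")
    # Assumption: on platforms without getrandom() we fall back to
    # os.urandom(), whose early-boot behaviour is platform-dependent.
    return os.urandom(nbytes)


key = generate_host_key()
print(len(key), "bytes of key material")
```

A first-boot script on an embedded device could catch the `RuntimeError` and delay host key generation rather than falling back to a source that does not wait.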
On a PC, even if you don't have rdrand at all, the kernel ends up doing 'rdtsc' and mixing in cycle counter noise at interrupt time etc, and when the entropy pool gets initialized with things like hardware details etc, you will have an initial entropy pool that is effectively completely random anyway.
And notice how we don't actually expose the entropy pool itself. That's just the seed of the randomness that gets exposed. The actual random numbers that get exposed are from cryptographic hashes from the pool, so even if you control the pool partially, that doesn't really help you. You need to control the entropy pool pretty much completely - and see above how even when we "trust" rdrand, that isn't the case.
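A toy model of that argument, as a sketch only: this is not the kernel's actual algorithm, and `ToyPool`, `boot` and the constant `BROKEN_RDRAND` are hypothetical names invented for the example. It mixes a fixed "rdrand" value plus some interrupt timings into a hashed pool and only ever exposes hashes of the pool state.

```python
import hashlib


class ToyPool:
    """Toy entropy pool; illustrative only, not the kernel implementation."""

    def __init__(self):
        self._state = bytes(32)  # every device starts from the same state
        self._counter = 0

    def mix(self, data: bytes) -> None:
        # Fold new input into the pool by hashing old state plus input.
        self._state = hashlib.sha256(self._state + data).digest()

    def read(self, n: int) -> bytes:
        # Callers never see the pool itself, only hashes derived from it.
        out = b""
        while len(out) < n:
            self._counter += 1
            block = self._state + self._counter.to_bytes(8, "little")
            out += hashlib.sha256(b"out" + block).digest()
        return out[:n]


# Assumption for the example: a broken rdrand that always returns this value.
BROKEN_RDRAND = b"\xff" * 8


def boot(interrupt_timings):
    pool = ToyPool()
    pool.mix(BROKEN_RDRAND)  # attacker-known, constant input
    for t in interrupt_timings:
        pool.mix(t.to_bytes(8, "little"))  # cycle-counter style noise
    return pool.read(16)


a = boot([1000123, 2000456])
b = boot([1000124, 2000456])  # one timing differs by a single cycle
c = boot([1000123, 2000456])  # identical inputs, as on identical devices

print(a.hex())
print(b.hex())
print(a == c)
```

Knowing the constant input tells you nothing useful about `a` versus `b`, because one differing timing sample changes the whole output. The `a == c` case is the embedded problem described earlier: with no differing input at all, every device produces the same "random" bytes.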
Also note that most distributions will further add their own sources of entropy from previous boots (again, the kernel doesn't really trust that, but it gets mixed in with the pool). So even on the embedded devices, the randomness tends to be a problem for "first boot" kind of situations, but sadly, that is exactly when some embedded things then want to create UUIDs and host keys etc.
So can this be a problem? Yes. It has been a problem on non-PC platforms. There's a lot of cheap and simply not very good hardware out there, and security and randomness are hard. And so there have been various embedded routers etc that used bad entropy to then generate keys that weren't really sufficiently random (maybe they weren't all actually identical but you could see patterns in them).
But in practice, the Zen 2 issue isn't really a big deal (for the kernel) because of where it is used. But do you want to have the fixed firmware? Oh yes you do. I'm not trying to downplay the bug of the RDRAND instruction itself - it's a huge CPU bug, and it's very embarrassing, and AMD should make damn sure that they add proper tests for this so that it never ever happens to them again.
And yes, you should also perhaps ask yourself why this happened at all. Did somebody ask AMD to have an insecure mode? Or was it just pure, complete incompetence? How random are the rdrand numbers really, even when they don't just show up as obviously the same value?
Hope that clarified things.
Linus