By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), March 20, 2021 1:20 pm
Room: Moderated Discussions
Hugo Décharnes (hdecharn.delete@this.outlook.fr) on March 20, 2021 7:34 am wrote:
> Having programs delivered in annotated, intermediate representation (IR) would be great.
In theory.
Not necessarily in practice.
Not only do you often have a big optimization problem (look at everybody who has tried it: they pretty much always ended up having a "native mode" fallback for games), but you also have a very nasty testing problem, because the IR is often designed by compiler people, and those people are more than happy to talk about "undefined behavior" etc.
So the end result will generally be rather under-defined, and then different hardware and different recompilers will produce different behavior.
That happens even with real hardware, but it at least tends to happen less, partly because the HW people have actually mostly learnt from their mistakes, while in many areas the compiler people have been even more open to "undefined behavior" in the name of performance.
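To make that concrete (a minimal sketch of my own, not tied to any particular IR): signed integer overflow is undefined in C, so a conforming compiler or recompiler is free to assume it never happens, and two perfectly valid toolchains can disagree on the same source:

    #include <limits.h>
    #include <stdio.h>

    /* Signed overflow is undefined behavior in C. An optimizing
     * backend may assume "x + 1 > x" always holds and fold this to
     * a constant 1; a naive translation wraps and returns 0 when
     * x == INT_MAX. Both are "correct" per the language spec. */
    int always_greater(int x)
    {
        return x + 1 > x;
    }

    int main(void)
    {
        printf("%d\n", always_greater(INT_MAX));
        return 0;
    }

Run that through different compilers at different optimization levels and you can get either answer. Now imagine that kind of variability baked into your distribution format.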
Java (and wasm) is actually doing pretty damn well. It's unusually well-specified for an IR (despite issues), it works, it's used, and it's fine.
But the more fundamental problem is that real applications don't want a "CPU architecture". They want a whole machine. They want high-performance display drivers, they want a windowing system (or they want the windowing system to stay out of the way, as in the case of most games), and they want access to random odd peripherals that you might never even have heard of, but that people use.
So what you propose is already reality on the web. And guess what? Lots of people use web apps. But they don't use them for high-performance stuff (yeah, you can game in web apps, and people do, but you know what I'm talking about), and they don't use them for a lot of specialized stuff that needs drivers etc.
People often forget how small a part the instruction set of a CPU really plays. The reason x86-64 is dominant on PCs isn't the instruction set, it's everything around it. The reason Apple has been successful in transitions before (and looks to be again) is that they control exactly that "everything around it".
The instruction set just isn't as important as people claim. It isn't even that important within the CPU, and it matters even less when you start talking about the bigger picture. Yes, a lot of that infrastructure then does tend to be tied to the instruction set, because modern technology is all so interconnected (i.e. all the drivers were built for an OS that was built for an instruction set).
But that's a general "everything is interconnected" issue, and the instruction set is not a particularly central player there.
So my point is that you might as well state it the other way around: make the apps people use the central point, not the instruction set, and look at how those apps depend on driver and library infrastructure, which in turn depends on operating system infrastructure, etc.
To get back to the subject line: I think the whole "radically different CPU ISA" question is completely pointless, exactly because it's the least interesting problem in the whole application stack. If you can't give advantages to existing applications that people actually use, you're already holding the wrong end of the stick.
I see too many people using a simple "look, I can optimize this matrix multiplication loop" demo as the argument for a radical new CPU instruction set. To a very high approximation, absolutely nobody cares, because it's such a pointless and insignificantly small part of the thing that actually matters.
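For reference, this is the kind of kernel I mean (a minimal sketch of the usual demo, nothing more):

    /* The canonical demo: a naive NxN matrix multiply. Every
     * "radical" ISA pitch shows how beautifully it vectorizes this.
     * It's a tiny, perfectly regular kernel -- which is exactly why
     * it proves almost nothing about a real application stack. */
    void matmul(int n, const double *a, const double *b, double *c)
    {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < n; k++)
                    sum += a[i * n + k] * b[k * n + j];
                c[i * n + j] = sum;
            }
    }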
When designing that radical new instruction set, explain how it would help your average Mac or Windows user. (Linux is the honey badger of operating systems: Linux don't care. You can recompile most things there, but please realize that that's the exception, not the rule.) And even then you need a damn good story about why regular C code will work fine with your instruction set, which will already be a hurdle for some of the more exotic ideas I've seen.
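To show what I mean by "regular C code" (my own illustration; the helper names are made up): completely ordinary code round-trips pointers through integers and pokes at objects byte by byte, all of which quietly assumes a flat, byte-addressable address space:

    #include <stdint.h>
    #include <string.h>

    /* Rounds a pointer down to a power-of-two alignment boundary by
     * doing integer arithmetic on its address. Code like this is
     * everywhere, and it assumes pointers round-trip losslessly
     * through uintptr_t -- a real hurdle for tagged-pointer or
     * capability-style machines. */
    void *align_down(void *p, size_t align)
    {
        uintptr_t addr = (uintptr_t)p;
        return (void *)(addr & ~(uintptr_t)(align - 1));
    }

    /* Reads a double from an unaligned buffer via byte-wise copy,
     * assuming plain byte-granularity access to raw memory. */
    double read_unaligned_double(const unsigned char *buf)
    {
        double d;
        memcpy(&d, buf, sizeof d);
        return d;
    }

An exotic architecture either has to make all of this just work, or explain why the enormous body of existing code that does it doesn't matter.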
If your new instruction set starts off needing a new OS, new libraries, new drivers, and new applications, you'd better also have a billion "new users" to go with it. Because otherwise you're kind of stuck.
Linus