By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), May 24, 2022 4:33 pm
Room: Moderated Discussions
⚛ (0xe2.0x9a.0x9b.delete@this.gmail.com) on May 24, 2022 3:12 pm wrote:
>
> "binary translation being a replacement for static code generation" - I suppose I never wrote,
> nor implied, such a statement. Binary translation is (in simplified terms) from static-form-1
> to static-form-2.
I'm not really seeing what your argument is.
If your argument is that the system can use different code sequences for something depending on CPU features, and can just dynamically rewrite them, then you don't seem to realize that that is a very old thing for Linux. We've done it for ages, rewriting our text segment depending on various CPU features (including "is this SMP", which lets us drop 'lock' prefixes when it isn't, but also things like "use the most efficient sequence for memcpy() and friends").
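To make the "resolve the best sequence once, globally, based on CPU features" idea concrete, here is a user-space sketch. It is only an analogy (the kernel patches its text segment directly rather than going through a resolved symbol), it assumes GCC on an x86 ELF target, and the function names are made up:

/* User-space analogue of picking one memcpy variant per boot: GCC's ifunc
 * attribute resolves the symbol once, at load time, based on CPU features,
 * globally rather than per-core. Names here are illustrative only. */
#include <stddef.h>
#include <stdio.h>
#include <string.h>

static void *memcpy_generic(void *dst, const void *src, size_t n)
{
        return memcpy(dst, src, n);     /* baseline path */
}

static void *memcpy_avx2(void *dst, const void *src, size_t n)
{
        /* a real variant would use 256-bit loads and stores */
        return memcpy(dst, src, n);
}

/* The dynamic linker calls this resolver exactly once, at load time,
 * and binds fast_memcpy() to whichever pointer it returns. */
static void *(*resolve_fast_memcpy(void))(void *, const void *, size_t)
{
        __builtin_cpu_init();
        return __builtin_cpu_supports("avx2") ? memcpy_avx2 : memcpy_generic;
}

void *fast_memcpy(void *dst, const void *src, size_t n)
        __attribute__((ifunc("resolve_fast_memcpy")));

int main(void)
{
        char src[16] = "hello, world";
        char dst[16];

        fast_memcpy(dst, src, sizeof(src));
        puts(dst);
        return 0;
}

The point is that the choice is made exactly once for the whole machine; nothing in the hot path ever asks which core it happens to be running on.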
You definitely talked about binary translation, and we have both the eBPF form and the "static form-1 to form-2" kind.
But we do it globally, because doing it per-cpu (i.e. different cores would use different sequences) would be entirely insane and stupid and generate absolutely horrible code.
So no, using generated code (whether static or JIT) to solve heterogeneous systems does not fix anything at all. It only exposes how broken said heterogeneous systems are. Doing per-cpu code would be an exercise in either code duplication (insane) or in nasty indirection (also insane).
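Just to spell out the "nasty indirection" part: per-cpu dispatch has to ask "which core am I on?" before every ISA-dependent region, and the answer can change out from under you. A made-up sketch (the big/little test is obviously fake; sched_getcpu() is the glibc call):

/* Why per-core dispatch is nasty: the core you query is not necessarily
 * the core you execute the next instruction on. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

static void do_work(void)
{
        int cpu = sched_getcpu();       /* which core are we on right now? */

        /* The scheduler is free to migrate this thread between the query
         * above and the branch below, so picking an ISA-dependent path here
         * is unsound unless every such region pins itself to a core first. */
        if (cpu >= 0 && (cpu & 1))
                puts("would take the hypothetical 'big core' path");
        else
                puts("would take the hypothetical 'little core' path");
}

int main(void)
{
        do_work();
        return 0;
}

So you either pin everything (and throw away the scheduler's freedom), duplicate every such path, or eat an indirection plus a migration hazard on every call. None of those are acceptable.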
So what you seem to propose is just a bad idea. And it's not a bad idea because of runtime code generation (which we do), but because it's simply expensive and stupid.
And the source of said stupidity: heterogeneous hardware is bad. Don't do it.
(Now, heterogeneous hardware in the sense of having completely different execution environments for different compute units - that's fine. That's the "Use a GPU for very parallel loads" model, or the "Use an AI accelerator for your AI loads". That's fine. That's what you should do. But multi-core systems with ostensibly the same architecture but with small ISA differences? Bad, bad, bad).
Linus