By: ⚛ (0xe2.0x9a.0x9b.delete@this.gmail.com), May 22, 2022 10:51 am
Room: Moderated Discussions
Linus Torvalds (torvalds.delete@this.linux-foundation.org) on May 21, 2022 4:58 pm wrote:
> That second case requires a working and reliable CPUID bit that doesn't cause the
> code to either go ridiculously slowly (emulation) or get relegated to just a subset
> of the cores in the system (trap-and-migrate or explicit affinities).
I don't want to write a reaction to the whole heterogeneous-x86-cores discussion, because it is obvious that you and I are in deep disagreement. Instead, I would like to briefly make the following argument:
Binary translation isn't "ridiculously slow". People who claim that emulation is slow are most likely thinking of a basic interpreter without any translation cache, one that re-decodes every guest instruction on every execution.
Considering that you worked at Transmeta, I fail to understand why you claim that emulation is "ridiculously slow". It isn't. (I presume "ridiculously slow" means something like "2 times slower or worse".)
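To make the caching point concrete, here is a toy sketch in C (all names hypothetical, not taken from any real translator): each guest basic block is decoded once, and every later execution of that block reuses the cached form. A real translator would emit host machine code rather than reuse the decoded guest instructions, but the amortization argument is the same: the translation cost is paid once per block, not once per executed instruction.

    #include <stdio.h>

    enum { OP_DEC, OP_JNZ, OP_HALT };

    /* arg is a register index for OP_DEC, a branch target pc for OP_JNZ. */
    typedef struct { int op; int arg; } GuestInsn;

    /* Guest program: count r0 down from 3 to 0, then halt. */
    static const GuestInsn guest[] = {
        { OP_DEC,  0 },   /* pc 0: r0--                  */
        { OP_JNZ,  0 },   /* pc 1: if (r0 != 0) goto 0   */
        { OP_HALT, 0 },   /* pc 2                        */
    };

    typedef struct {
        int entry_pc;
        int len;
        const GuestInsn *code;  /* "translated" form: here, just the decoded block */
    } Block;

    /* Translation cache keyed by guest entry pc. A real translator
     * would hash the pc and store generated host code here. */
    static Block cache[16];
    static int   cached[16];
    static long  translations;

    static Block *lookup_or_translate(int pc)
    {
        if (!cached[pc]) {
            int end = pc;
            while (guest[end].op != OP_JNZ && guest[end].op != OP_HALT)
                end++;
            cache[pc].entry_pc = pc;
            cache[pc].len = end - pc + 1;
            cache[pc].code = &guest[pc];
            cached[pc] = 1;
            translations++;   /* translation cost paid once per block */
        }
        return &cache[pc];
    }

    int main(void)
    {
        int regs[1] = { 3 };
        int pc = 0;
        for (;;) {
            Block *b = lookup_or_translate(pc);
            int next = b->entry_pc + b->len;  /* fall-through successor */
            for (int i = 0; i < b->len; i++) {
                const GuestInsn *ins = &b->code[i];
                switch (ins->op) {
                case OP_DEC: regs[ins->arg]--; break;
                case OP_JNZ: if (regs[0]) next = ins->arg; break;
                case OP_HALT:
                    printf("halted, r0=%d, blocks translated: %ld\n",
                           regs[0], translations);
                    return 0;
                }
            }
            pc = next;
        }
    }

The loop body executes four times, but only two blocks are ever translated; everything after the first iteration runs out of the cache. That is why a translating emulator's steady-state cost is nothing like an interpreter's.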
The Linux kernel is what it is: it has no native support for binary translation. If it did, running heterogeneous x86 apps on Linux would be easier: an AVX-512 app could run on Alder Lake's E-cores with a reasonable performance penalty, and if the kernel measured that penalty to be unreasonable, it would pin the app to Alder Lake's P-cores instead. (If idle P-cores are available and the CPU performance governor isn't set to powersave, there is little need to run a process on an E-core in the first place.)
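The pin-to-P-cores fallback already exists as a user-visible mechanism today; the kernel would just be invoking the same logic internally from the scheduler rather than via a syscall. A minimal userspace sketch, assuming the P-cores are CPUs 0-7 (an assumption for illustration; real code would read the hybrid topology from sysfs, e.g. /sys/devices/cpu_core/cpus on hybrid Intel parts, instead of hardcoding it):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t pcores;
        CPU_ZERO(&pcores);

        /* Assumed P-core CPU ids 0-7; see the sysfs note above. */
        for (int cpu = 0; cpu <= 7; cpu++)
            CPU_SET(cpu, &pcores);

        /* pid 0 means "the calling process". */
        if (sched_setaffinity(0, sizeof(pcores), &pcores) != 0) {
            perror("sched_setaffinity");
            return 1;
        }
        printf("restricted to the assumed P-cores 0-7\n");
        return 0;
    }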
You are overly protective of what the Linux kernel currently is. There is no vision of a future of heterogeneous CPUs in your posts. If heterogeneous desktop/notebook CPUs are inevitable, then you should have a plan for them or make one. (An example reason why heterogeneous CPUs are inevitable in those markets: endowing _all_ cores in a future desktop machine with the ability to predict 4 branches per cycle would be problematic.)
-atom