By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), May 23, 2022 4:54 pm
Room: Moderated Discussions
Brendan (btrotter.delete@this.gmail.com) on May 23, 2022 1:13 pm wrote:
>
> Mental masturbation is things like circular logic - e.g. "I don't want to support anything except the common
> case, because the common case is useless, because I didn't want to support anything except the common case".
It's not about me supporting it.
The kernel side is fairly trivial. It ranges from "no changes at all" (ie users just do their own CPU affinity to deal with it) to "minimal changes" (some ELF flag to say "start with this affinity") to fairly straightforward bigger support (eg "fault-on-use and auto-affine the thread").
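A minimal sketch of the "users just do their own CPU affinity" end of that range, assuming a Linux system and assuming the caller already knows which CPU ids have AVX512 (the helper name `pin_to_cpus` and any CPU list passed to it are hypothetical; nothing here detects AVX512 support):

```python
import os


def pin_to_cpus(cpus):
    """Restrict the calling thread to the given CPU ids and return the new mask.

    `cpus` is an iterable of CPU ids that the caller *assumes* support the
    wanted instruction set; this function does not verify that assumption.
    """
    allowed = os.sched_getaffinity(0)   # CPUs we are currently allowed to run on
    target = set(cpus) & allowed        # never request CPUs outside that set
    if not target:
        raise ValueError("none of the requested CPUs are available")
    os.sched_setaffinity(0, target)     # 0 means "the calling thread"
    return os.sched_getaffinity(0)
```

A program would call this once before entering its AVX512 code path, for example `pin_to_cpus(range(8))` on a machine where the first eight CPU ids are assumed to be the capable cores. The kernel does nothing special here, which is the "no changes at all" case.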
In fact, when I first heard of Intel's heterogeneous model in Alder Lake, I was like "we can support that easily".
Because on the kernel side, it really is mostly a non-issue. Any kernel use of AVX512 is already very limited (I think we have a couple of optimized crypto library functions), and the kernel already obviously supports CPU affinities. It's stupid special-case code, but it's not necessarily complicated stupid special-case code.
(Of course, anything to do with the x86 extended FP state is actually fairly complicated to begin with, because of how it's all oddly lumped together in "xstate" and has about a billion different variations, so adding more special cases to that code is never a good thing).
So no. My argument is not at all "I don't want to support it", and you haven't heard that argument here in this thread.
My argument is "it's stupid and doesn't work in user space, and any silicon that implements that heterogeneous model is just wasted space by hardware designers who couldn't do it right".
Because in practice that heterogeneous model means that 99% of users will never use that AVX512 hardware, since 99% of users are all in libraries, and I hope I have explained why they would not use it.
And that "99% of users wouldn't use it at all" is for a feature that already doesn't have very many users to begin with, because it's already fairly specialized. Compiler people think auto-vectorization is common and a big deal. Outside of very special cases it's neither. So a questionably useful feature thus becomes completely useless because you realistically can't use it in the one situation where it's most useful.
I'd much rather have Intel give people more cache, more cores, or higher frequencies than give me a terminally broken heterogeneous AVX512 system.
Linus