By: Andrey (andrey.semashev.delete@this.gmail.com), May 22, 2022 4:18 pm
Room: Moderated Discussions
Brendan (btrotter.delete@this.gmail.com) on May 22, 2022 11:18 am wrote:
> Hi,
>
> Linus Torvalds (torvalds.delete@this.linux-foundation.org) on May 21, 2022 4:58 pm wrote:
> > Brendan (btrotter.delete@this.gmail.com) on May 21, 2022 12:58 pm wrote:
> > >
> > > No. New software (designed to use the new CPUID leaves) would be aware of that problem
> > > and would avoid it - e.g. maybe using something like "sched_setaffinity()" to lock the
> > > thread to a specific CPU type before using CPUID (and maybe using "sched_setaffinity()"
> > > again later to restore the original CPU affinity and allow migration again).
> >
> > That's the "it works in an embedded environment where you control everything" model.
>
> Erm, no?
>
> It's the run-time dispatch (e.g. choosing which version of functions to use based on CPUID results)
> that some compilers (ICC) have been doing for ages; but slightly modified to work with dissimilar
> cores by making it more fine grained (not once at program startup, but "anytime where needed")
> and preventing scheduler from migrating to a different CPU type at the wrong time.
That's not how libraries work. A typical library tests CPU features once (on load, at initialization, or on the first call) and saves the pointer(s) to the selected implementation. After that the library can be called any number of times, from any thread running on any core, and it will use the saved pointers. So (a) locking the affinity while running CPU detection doesn't help, because the library will later be used on any core, and (b) locking the affinity permanently (not just for the duration of CPU detection) is usually not expected by the caller and is not acceptable behavior for a library. Doing CPU detection on every use is not acceptable either, because it is slow, especially if adjusting thread affinity is involved.
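To make the "detect once, save the pointer" pattern concrete, here is a minimal sketch. All names (`sum_bytes`, `sum_bytes_generic`, `sum_bytes_avx2_standin`, `detect_runs`) are made up for illustration, the "AVX2" variant is plain C++ standing in for a real SIMD version, and the feature test assumes the GCC/Clang `__builtin_cpu_supports` builtin on x86 (anything else falls back to the generic path). It is not taken from any particular library.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

namespace {

// Portable baseline implementation.
std::uint32_t sum_bytes_generic(const std::uint8_t* p, std::size_t n) {
    std::uint32_t s = 0;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

// Stand-in for an AVX2-specific version (assumption: a real library would
// use intrinsics here; plain C++ keeps the sketch buildable anywhere).
std::uint32_t sum_bytes_avx2_standin(const std::uint8_t* p, std::size_t n) {
    std::uint32_t s = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) s += p[i] + p[i + 1] + p[i + 2] + p[i + 3];
    for (; i < n; ++i) s += p[i];
    return s;
}

using sum_bytes_fn = std::uint32_t (*)(const std::uint8_t*, std::size_t);

// The saved pointer: written on the first call, read on every later call.
std::atomic<sum_bytes_fn> selected_impl{nullptr};
// Counts detection runs, only so the behavior can be observed.
std::atomic<int> detect_runs{0};

// Reports the features of whichever core the calling thread is on right now.
bool cpu_has_avx2() {
    ++detect_runs;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

} // namespace

// Public entry point: callable from any thread, on any core.
std::uint32_t sum_bytes(const std::uint8_t* p, std::size_t n) {
    sum_bytes_fn fn = selected_impl.load(std::memory_order_acquire);
    if (fn == nullptr) {
        // First call only. The result is cached for the whole process,
        // no matter which core later callers end up running on.
        fn = cpu_has_avx2() ? sum_bytes_avx2_standin : sum_bytes_generic;
        selected_impl.store(fn, std::memory_order_release);
    }
    return fn(p, n);
}
```

Once `selected_impl` is set, nothing in `sum_bytes` knows or cares which core the caller is on, so pinning the thread during that first call changes nothing for the calls that follow.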
> It's the "developer controls nothing, generic app designed for any/all 80x86
> CPUs adapts to whatever it happens to find itself running on" model (the opposite
> of the "embedded environment where you control everything" model).
Per the above, this approach cannot work in general-purpose libraries. It may work in an application that is tightly coupled with the libraries it uses and is able to confine threads and libraries to specific cores. Most applications don't do that, and cannot reasonably do that, because the performance implications of such work distribution are unclear and unpredictable unless yours is the only application running on the system.
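For reference, the pin / detect / restore sequence from the quoted proposal would look roughly like this in an application that owns its threads. This is a Linux-only sketch using glibc's `sched_getaffinity()` / `sched_setaffinity()`; the helper name `detect_while_pinned` and the choice of "first CPU in the current mask" are my assumptions, not anything from the proposal.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // needed for the CPU_* macros when compiling as C
#endif
#include <sched.h>

// Hypothetical helper: pins the calling thread to one CPU taken from its
// current affinity mask, runs detect(), then restores the original mask.
// Returns false if the mask could not be read, changed, or restored.
template <typename Detect>
bool detect_while_pinned(Detect detect) {
    cpu_set_t original;
    CPU_ZERO(&original);
    if (sched_getaffinity(0, sizeof(original), &original) != 0) return false;

    // Pick the first CPU this thread is already allowed to run on.
    int cpu = -1;
    for (int i = 0; i < CPU_SETSIZE; ++i) {
        if (CPU_ISSET(i, &original)) { cpu = i; break; }
    }
    if (cpu < 0) return false;

    cpu_set_t pinned;
    CPU_ZERO(&pinned);
    CPU_SET(cpu, &pinned);
    if (sched_setaffinity(0, sizeof(pinned), &pinned) != 0) return false;

    detect();  // whatever is detected here describes this one CPU only

    // Allow migration again; from here on the thread may run on any core type.
    return sched_setaffinity(0, sizeof(original), &original) == 0;
}
```

That is three system calls and possibly a forced migration around every detection, and the result is only valid until the mask is restored. An application can decide to pay that and keep the thread where it is; a library called from arbitrary threads cannot.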