By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), January 30, 2017 11:01 am
Room: Moderated Discussions
rwessel (robertwessel.delete@this.yahoo.com) on January 29, 2017 8:56 pm wrote:
>
> That's at least partially untrue for zArch. Context switches are very common in MVS*, as many things happen
> via calls to other address spaces (most often mediated by the zArch equivalent of a call gate**). zArch's
> TLBs (and specifically the two level TLBs) are there to avoid needing TLB reloads precisely because address
> space context switches have been increasing significantly in frequency for the last three decades.
Ahh. But afaik s390 at least has address space identifiers, so it's not as deadly for the TLB contents as the old win32 gdi behavior was.
> Much of GDI in Windows was moved back into the kernel, back in the NT4 days
Oh yes, the insane "TLB switches _all_ the time" behavior was purely a win32 issue, I don't think NT ever had it at least under normal load.
So it's long gone, but the legacy lives on in microarchitectural details.
Also, x86 these days actually has ASIDs (although they are called PCID, which I personally find a particularly annoying NIH acronym, because every single time I see it I read it as "PCI"-something). I think they ended up evolving out of the virtualization support (a single-bit tag in the TLB to avoid TLB misses on each VM exit/entry).
We don't actually use it in Linux because it wasn't a noticeable win, probably exactly because the TLB is aggressively prefetched, and outside of virtualization very frequent TLB switches are rare.
(Even the classic "ping pong scheduling benchmark" still runs a lot of code in between TLB flushes, and is not very common behavior anyway. It really takes crazy "function calls between address spaces" to get very high TLB switch behavior).
> As to how the page tables are organized, it's very little like POWER, and quite similar to x86 in gross
> terms (every detail is different, of course).
Yeah, I'm aware of the high-level details, the oddities are in the low-level issues (very strange TLB flushing, iirc, due to some odd dirty bit handling and other rules; I only see the patches flow by, I've never used it or looked all that closely).
> While fast TLB reloads are certainly a good thing, they are still an expense, avoiding unnecessary discards
> of TLB entries, especially on short switches to different address spaces, complements that.
No disagreement. Although I personally find that to be more of a SW design issue: people who design their calling conventions to be about cross-process boundaries are crazy.
Linus