By: --- (---.delete@this.redheron.com), September 11, 2021 8:53 am
Room: Moderated Discussions
sr (nobody.delete@this.nowhere.com) on September 11, 2021 1:36 am wrote:
> --- (---.delete@this.redheron.com) on September 8, 2021 2:28 pm wrote:
>
> > Finally I cannot figure out how Apple bypasses D- way prediction, but they don't seem to need
> > it -- at least even the most random stream of addresses (in terms of hitting different ways)
> > that I could generate never seemed to cause even a hiccup in terms of invalid way prediction.
> > I have a few hypotheses, but no strong leads, and no patents suggest an explanation.
> > On the I- side it's easier -- way info is stored in various appropriate places,
> > like, for example, the Next Fetch Predictor -- and this is described in patents.
>
> That' way L1-cache tags need to be in program address space. With physical tags it's impossible to do any
> kind of way prediction with tag bits but with linear address space you could use tag bits itself to do "way
> prediction". AMD is documented how they do it. It's a "way prediction" because it isn't traditional way prediction
> where first is compared predicted way and then rest - what they do is to make tags so short that they can
> compare all ways effectively. So way prediction newer missed, it's either L1 hit or miss.
>
I understand what you are saying. There are multiple ways in which this sort of thing can be done, ranging from placing a way table in the TLB (D2D cache) to using partial tag comparisons (which Apple definitely does for the L2, as in this patent: https://patents.google.com/patent/US10157137B1
to mapping packing of lines into the rows of a subarray.
What I have not established is any particular claim as to how Apple does it in the L1D, just that whatever they do, they manage to hide it from visible code. Standard partial tag comparison is a probabilistic thing so it should occasionally fail and generate a timing glitch. I never saw such -- but Apple have a very nice, lightweight, Replay+Speculative Scheduling mechanism, so with rare enough glitches, they can probably hide such a glitch.
Compare this to throughput based on address patterns, where you can, with carefully crafted patterns, very definitely see the falloff in throughput with different TLB access requirements, requests that all route to one half of the cache, requests separated by varying numbers of cache lines, and so on.
> --- (---.delete@this.redheron.com) on September 8, 2021 2:28 pm wrote:
>
> > Finally I cannot figure out how Apple bypasses D- way prediction, but they don't seem to need
> > it -- at least even the most random stream of addresses (in terms of hitting different ways)
> > that I could generate never seemed to cause even a hiccup in terms of invalid way prediction.
> > I have a few hypotheses, but no strong leads, and no patents suggest an explanation.
> > On the I- side it's easier -- way info is stored in various appropriate places,
> > like, for example, the Next Fetch Predictor -- and this is described in patents.
>
> That' way L1-cache tags need to be in program address space. With physical tags it's impossible to do any
> kind of way prediction with tag bits but with linear address space you could use tag bits itself to do "way
> prediction". AMD is documented how they do it. It's a "way prediction" because it isn't traditional way prediction
> where first is compared predicted way and then rest - what they do is to make tags so short that they can
> compare all ways effectively. So way prediction newer missed, it's either L1 hit or miss.
>
I understand what you are saying. There are multiple ways in which this sort of thing can be done, ranging from placing a way table in the TLB (D2D cache) to using partial tag comparisons (which Apple definitely does for the L2, as in this patent: https://patents.google.com/patent/US10157137B1
to mapping packing of lines into the rows of a subarray.
What I have not established is any particular claim as to how Apple does it in the L1D, just that whatever they do, they manage to hide it from visible code. Standard partial tag comparison is a probabilistic thing so it should occasionally fail and generate a timing glitch. I never saw such -- but Apple have a very nice, lightweight, Replay+Speculative Scheduling mechanism, so with rare enough glitches, they can probably hide such a glitch.
Compare this to throughput based on address patterns, where you can, with carefully crafted patterns, very definitely see the falloff in throughput with different TLB access requirements, requests that all route to one half of the cache, requests separated by varying numbers of cache lines, and so on.
Topic | Posted By | Date |
---|---|---|
POWER10 SAP SD benchmark | anon2 | 2021/09/06 02:36 PM |
POWER10 SAP SD benchmark | Daniel B | 2021/09/07 01:31 AM |
"Cores" (and SPEC) | Rayla | 2021/09/07 06:51 AM |
"Cores" (and SPEC) | anon | 2021/09/07 02:56 PM |
POWER10 SAP SD benchmark | Anon | 2021/09/07 02:24 PM |
POWER10 SAP SD benchmark | Anon | 2021/09/07 02:27 PM |
Virtually tagged L1-caches | sr | 2021/09/08 04:49 AM |
Virtually tagged L1-caches | dmcq | 2021/09/08 07:22 AM |
Virtually tagged L1-caches | sr | 2021/09/08 07:56 AM |
Virtually tagged L1-caches | Hugo Décharnes | 2021/09/08 07:58 AM |
Virtually tagged L1-caches | sr | 2021/09/08 09:09 AM |
Virtually tagged L1-caches | Hugo Décharnes | 2021/09/08 09:46 AM |
Virtually tagged L1-caches | sr | 2021/09/08 10:35 AM |
Virtually tagged L1-caches | Hugo Décharnes | 2021/09/08 11:23 AM |
Virtually tagged L1-caches | sr | 2021/09/08 11:40 AM |
Virtually tagged L1-caches | anon | 2021/09/09 02:16 AM |
Virtually tagged L1-caches | Konrad Schwarz | 2021/09/10 04:19 AM |
Virtually tagged L1-caches | Hugo Décharnes | 2021/09/10 05:59 AM |
Virtually tagged L1-caches | anon | 2021/09/14 02:17 AM |
Virtually tagged L1-caches | dmcq | 2021/09/14 08:34 AM |
Or use a PLB (NT) | Paul A. Clayton | 2021/09/14 08:45 AM |
Or use a PLB | Linus Torvalds | 2021/09/14 02:27 PM |
Or use a PLB | anon | 2021/09/14 11:15 PM |
Or use a PLB | Michael S | 2021/09/15 02:21 AM |
Or use a PLB | dmcq | 2021/09/15 02:42 PM |
Or use a PLB | Konrad Schwarz | 2021/09/16 03:24 AM |
Or use a PLB | Michael S | 2021/09/16 09:13 AM |
Or use a PLB | --- | 2021/09/16 12:02 PM |
PLB reference | Paul A. Clayton | 2021/09/18 01:35 PM |
PLB reference | Michael S | 2021/09/18 03:14 PM |
Demand paging/translation orthogonal | Paul A. Clayton | 2021/09/19 06:33 AM |
Demand paging/translation orthogonal | Michael S | 2021/09/19 08:10 AM |
PLB reference | Carson | 2021/09/20 09:19 PM |
PLB reference | sr | 2021/09/20 05:02 AM |
PLB reference | Michael S | 2021/09/20 06:03 AM |
PLB reference | Linus Torvalds | 2021/09/20 11:10 AM |
Or use a PLB | sr | 2021/09/20 03:32 AM |
Or use a PLB | sr | 2021/09/21 08:36 AM |
Or use a PLB | Linus Torvalds | 2021/09/21 09:04 AM |
Or use a PLB | sr | 2021/09/21 09:48 AM |
Or use a PLB | Linus Torvalds | 2021/09/21 12:55 PM |
Or use a PLB | sr | 2021/09/22 05:55 AM |
Or use a PLB | rwessel | 2021/09/22 06:09 AM |
Or use a PLB | Linus Torvalds | 2021/09/22 10:50 AM |
Or use a PLB | sr | 2021/09/22 12:00 PM |
Or use a PLB | dmcq | 2021/09/22 03:07 PM |
Or use a PLB | Etienne Lorrain | 2021/09/23 07:50 AM |
Or use a PLB | anon2 | 2021/09/22 03:09 PM |
Or use a PLB | dmcq | 2021/09/23 01:35 AM |
Or use a PLB | ⚛ | 2021/09/23 08:37 AM |
Or use a PLB | Linus Torvalds | 2021/09/23 11:01 AM |
Or use a PLB | gpd | 2021/09/24 02:59 AM |
Or use a PLB | Linus Torvalds | 2021/09/24 09:45 AM |
Or use a PLB | dmcq | 2021/09/24 11:43 AM |
Or use a PLB | sr | 2021/09/25 09:19 AM |
Or use a PLB | Linus Torvalds | 2021/09/25 09:44 AM |
Or use a PLB | sr | 2021/09/25 10:11 AM |
Or use a PLB | Linus Torvalds | 2021/09/25 10:31 AM |
Or use a PLB | sr | 2021/09/25 10:52 AM |
Or use a PLB | Linus Torvalds | 2021/09/25 11:05 AM |
Or use a PLB | sr | 2021/09/25 11:23 AM |
Or use a PLB | rwessel | 2021/09/25 02:29 PM |
Or use a PLB | sr | 2021/09/30 11:22 PM |
Or use a PLB | rwessel | 2021/10/01 05:19 AM |
Or use a PLB | David Hess | 2021/10/01 09:35 AM |
Or use a PLB | rwessel | 2021/10/02 03:47 AM |
Or use a PLB | sr | 2021/10/02 10:16 AM |
Or use a PLB | rwessel | 2021/10/02 10:53 AM |
Or use a PLB | Linus Torvalds | 2021/09/25 10:57 AM |
Or use a PLB | sr | 2021/09/25 11:07 AM |
Or use a PLB | Linus Torvalds | 2021/09/25 11:21 AM |
Or use a PLB | sr | 2021/09/25 11:40 AM |
Or use a PLB | nksingh | 2021/09/27 08:07 AM |
Or use a PLB | ⚛ | 2021/09/27 08:02 AM |
Or use a PLB | Linus Torvalds | 2021/09/27 09:20 AM |
Or use a PLB | Linus Torvalds | 2021/09/27 11:58 AM |
Or use a PLB | dmcq | 2021/09/28 09:59 AM |
Or use a PLB | sr | 2021/09/25 09:34 AM |
Or use a PLB | rwessel | 2021/09/25 02:44 PM |
Or use a PLB | sr | 2021/10/01 12:04 AM |
Or use a PLB | rwessel | 2021/10/01 05:33 AM |
I386 segmentation highlights | sr | 2021/10/04 06:53 AM |
I386 segmentation highlights | Adrian | 2021/10/04 08:53 AM |
I386 segmentation highlights | sr | 2021/10/04 09:19 AM |
I386 segmentation highlights | rwessel | 2021/10/04 03:57 PM |
I386 segmentation highlights | sr | 2021/10/05 10:16 AM |
I386 segmentation highlights | Michael S | 2021/10/05 11:27 AM |
I386 segmentation highlights | rwessel | 2021/10/05 03:20 PM |
Or use a PLB | JohnG | 2021/09/25 09:18 PM |
Or use a PLB | ⚛ | 2021/09/27 06:37 AM |
Or use a PLB | Heikki Kultala | 2021/09/28 02:53 AM |
Or use a PLB | rwessel | 2021/09/28 06:29 AM |
Or use a PLB | David Hess | 2021/09/23 05:00 PM |
Or use a PLB | Adrian | 2021/09/24 12:21 AM |
Or use a PLB | dmcq | 2021/09/25 11:41 AM |
Or use a PLB | blaine | 2021/09/26 10:19 PM |
Or use a PLB | David Hess | 2021/09/27 10:35 AM |
Or use a PLB | blaine | 2021/09/27 04:19 PM |
Or use a PLB | Adrian | 2021/09/27 09:40 PM |
Or use a PLB | Adrian | 2021/09/27 09:59 PM |
Or use a PLB | dmcq | 2021/09/28 06:45 AM |
Or use a PLB | rwessel | 2021/09/28 06:45 AM |
Or use a PLB | David Hess | 2021/09/28 11:50 AM |
Or use a PLB | Etienne Lorrain | 2021/09/30 12:25 AM |
Or use a PLB | David Hess | 2021/10/01 09:40 AM |
MMU privileges | sr | 2021/09/21 10:07 AM |
MMU privileges | Linus Torvalds | 2021/09/21 12:49 PM |
Virtually tagged L1-caches | Konrad Schwarz | 2021/09/16 03:18 AM |
Virtually tagged L1-caches | Carson | 2021/09/16 12:12 PM |
Virtually tagged L1-caches | anon2 | 2021/09/16 04:16 PM |
Virtually tagged L1-caches | rwessel | 2021/09/16 05:29 PM |
Virtually tagged L1-caches | sr | 2021/09/20 03:20 AM |
Virtually tagged L1-caches | --- | 2021/09/08 01:28 PM |
Virtually tagged L1-caches | anonymou5 | 2021/09/08 07:28 PM |
Virtually tagged L1-caches | anonymou5 | 2021/09/08 07:34 PM |
Virtually tagged L1-caches | --- | 2021/09/09 09:14 AM |
Virtually tagged L1-caches | anonymou5 | 2021/09/09 09:44 PM |
Multi-threading? | David Kanter | 2021/09/09 08:32 PM |
Multi-threading? | --- | 2021/09/10 08:19 AM |
Virtually tagged L1-caches | sr | 2021/09/11 12:19 AM |
Virtually tagged L1-caches | sr | 2021/09/11 12:36 AM |
Virtually tagged L1-caches | --- | 2021/09/11 08:53 AM |
Virtually tagged L1-caches | sr | 2021/09/11 11:43 PM |
Virtually tagged L1-caches | Linus Torvalds | 2021/09/12 10:10 AM |
Virtually tagged L1-caches | sr | 2021/09/12 10:57 AM |
Virtually tagged L1-caches | dmcq | 2021/09/13 07:31 AM |
Virtually tagged L1-caches | sr | 2021/09/20 03:11 AM |
Virtually tagged L1-caches | sr | 2021/09/11 01:49 AM |
Virtually tagged L1-caches | Linus Torvalds | 2021/09/08 11:34 AM |
Virtually tagged L1-caches | dmcq | 2021/09/09 01:46 AM |
Virtually tagged L1-caches | dmcq | 2021/09/09 01:58 AM |
Virtually tagged L1-caches | sr | 2021/09/11 12:29 AM |
Virtually tagged L1-caches | dmcq | 2021/09/11 07:59 AM |
Virtually tagged L1-caches | sr | 2021/09/11 11:57 PM |
Virtually tagged L1-caches | dmcq | 2021/09/12 07:44 AM |
Virtually tagged L1-caches | sr | 2021/09/12 08:48 AM |
Virtually tagged L1-caches | dmcq | 2021/09/12 12:22 PM |
Virtually tagged L1-caches | sr | 2021/09/20 03:40 AM |
Where do you see this information? (NT) | anon2 | 2021/09/09 01:45 AM |
Where do you see this information? | sr | 2021/09/11 12:40 AM |
Where do you see this information? | anon2 | 2021/09/11 12:53 AM |
Where do you see this information? | sr | 2021/09/11 01:08 AM |
Thank you (NT) | anon2 | 2021/09/11 03:31 PM |