By: sr (nobody.delete@this.nowhere.com), September 11, 2021 2:49 am
Room: Moderated Discussions
Hugo Décharnes (hdecharn.delete@this.outlook.fr) on September 8, 2021 10:46 am wrote:
> BTW, L2 access time is critical, and TLB coverage (not accuracy) does not compensate, in that most
> workload are bounded by the cache latency, not the TLB coverage. Yes, VIVPT enables to move TLB away
> from the time-critical load-return path, enabling bigger TLBs. But you must reduce the L2 hit latency
> as much as possible. We are talking about percents of performance for a typical big core.
But you got that if TLB is done late like in VIVPT there's no TLB result available when L1 miss is detected?
And from IBM slides we could guess that power10 has VIPT-L2 cache with hashed linear-address space microtags. So they can start accessing L2 and doing TLB-conversion same time, and because L2 access takes ages they can pipeline TLB to very long. (for 2MB L2 having double tagging would make L2 area like 10% bigger which isn't insignificant anymore)
What I meant about TLB accuracy that for given size they can use better hit-rate schemes if they have enough power budged or time to pipeline TLB access. So in case of 64 slot TLB they could use pipelined fully associative solution instead of 4-way associative with recent use way prediction - basically direct mapped TLB like Intel have been using in some it's older models.
For IBM z-mainframe CPU they documented that they use pipelined TLB for L2 and beyond. Pipelining TLB access makes fully associative scheme possible, instead of doing 64 36(for x86-64) bit comparisons in one clock cycle they can compare few bits or even one bit a time and remove missed ways from next comparison cycle. It even might be the most power efficient way to do L1 TLB, decreasing associativity might reduce hit rate so much that it decrease overall TLB efficiency vs doing it fully associativite.
> BTW, L2 access time is critical, and TLB coverage (not accuracy) does not compensate, in that most
> workload are bounded by the cache latency, not the TLB coverage. Yes, VIVPT enables to move TLB away
> from the time-critical load-return path, enabling bigger TLBs. But you must reduce the L2 hit latency
> as much as possible. We are talking about percents of performance for a typical big core.
But you got that if TLB is done late like in VIVPT there's no TLB result available when L1 miss is detected?
And from IBM slides we could guess that power10 has VIPT-L2 cache with hashed linear-address space microtags. So they can start accessing L2 and doing TLB-conversion same time, and because L2 access takes ages they can pipeline TLB to very long. (for 2MB L2 having double tagging would make L2 area like 10% bigger which isn't insignificant anymore)
What I meant about TLB accuracy that for given size they can use better hit-rate schemes if they have enough power budged or time to pipeline TLB access. So in case of 64 slot TLB they could use pipelined fully associative solution instead of 4-way associative with recent use way prediction - basically direct mapped TLB like Intel have been using in some it's older models.
For IBM z-mainframe CPU they documented that they use pipelined TLB for L2 and beyond. Pipelining TLB access makes fully associative scheme possible, instead of doing 64 36(for x86-64) bit comparisons in one clock cycle they can compare few bits or even one bit a time and remove missed ways from next comparison cycle. It even might be the most power efficient way to do L1 TLB, decreasing associativity might reduce hit rate so much that it decrease overall TLB efficiency vs doing it fully associativite.
Topic | Posted By | Date |
---|---|---|
POWER10 SAP SD benchmark | anon2 | 2021/09/06 03:36 PM |
POWER10 SAP SD benchmark | Daniel B | 2021/09/07 02:31 AM |
"Cores" (and SPEC) | Rayla | 2021/09/07 07:51 AM |
"Cores" (and SPEC) | anon | 2021/09/07 03:56 PM |
POWER10 SAP SD benchmark | Anon | 2021/09/07 03:24 PM |
POWER10 SAP SD benchmark | Anon | 2021/09/07 03:27 PM |
Virtually tagged L1-caches | sr | 2021/09/08 05:49 AM |
Virtually tagged L1-caches | dmcq | 2021/09/08 08:22 AM |
Virtually tagged L1-caches | sr | 2021/09/08 08:56 AM |
Virtually tagged L1-caches | Hugo Décharnes | 2021/09/08 08:58 AM |
Virtually tagged L1-caches | sr | 2021/09/08 10:09 AM |
Virtually tagged L1-caches | Hugo Décharnes | 2021/09/08 10:46 AM |
Virtually tagged L1-caches | sr | 2021/09/08 11:35 AM |
Virtually tagged L1-caches | Hugo Décharnes | 2021/09/08 12:23 PM |
Virtually tagged L1-caches | sr | 2021/09/08 12:40 PM |
Virtually tagged L1-caches | anon | 2021/09/09 03:16 AM |
Virtually tagged L1-caches | Konrad Schwarz | 2021/09/10 05:19 AM |
Virtually tagged L1-caches | Hugo Décharnes | 2021/09/10 06:59 AM |
Virtually tagged L1-caches | anon | 2021/09/14 03:17 AM |
Virtually tagged L1-caches | dmcq | 2021/09/14 09:34 AM |
Or use a PLB (NT) | Paul A. Clayton | 2021/09/14 09:45 AM |
Or use a PLB | Linus Torvalds | 2021/09/14 03:27 PM |
Or use a PLB | anon | 2021/09/15 12:15 AM |
Or use a PLB | Michael S | 2021/09/15 03:21 AM |
Or use a PLB | dmcq | 2021/09/15 03:42 PM |
Or use a PLB | Konrad Schwarz | 2021/09/16 04:24 AM |
Or use a PLB | Michael S | 2021/09/16 10:13 AM |
Or use a PLB | --- | 2021/09/16 01:02 PM |
PLB reference | Paul A. Clayton | 2021/09/18 02:35 PM |
PLB reference | Michael S | 2021/09/18 04:14 PM |
Demand paging/translation orthogonal | Paul A. Clayton | 2021/09/19 07:33 AM |
Demand paging/translation orthogonal | Michael S | 2021/09/19 09:10 AM |
PLB reference | Carson | 2021/09/20 10:19 PM |
PLB reference | sr | 2021/09/20 06:02 AM |
PLB reference | Michael S | 2021/09/20 07:03 AM |
PLB reference | Linus Torvalds | 2021/09/20 12:10 PM |
Or use a PLB | sr | 2021/09/20 04:32 AM |
Or use a PLB | sr | 2021/09/21 09:36 AM |
Or use a PLB | Linus Torvalds | 2021/09/21 10:04 AM |
Or use a PLB | sr | 2021/09/21 10:48 AM |
Or use a PLB | Linus Torvalds | 2021/09/21 01:55 PM |
Or use a PLB | sr | 2021/09/22 06:55 AM |
Or use a PLB | rwessel | 2021/09/22 07:09 AM |
Or use a PLB | Linus Torvalds | 2021/09/22 11:50 AM |
Or use a PLB | sr | 2021/09/22 01:00 PM |
Or use a PLB | dmcq | 2021/09/22 04:07 PM |
Or use a PLB | Etienne Lorrain | 2021/09/23 08:50 AM |
Or use a PLB | anon2 | 2021/09/22 04:09 PM |
Or use a PLB | dmcq | 2021/09/23 02:35 AM |
Or use a PLB | ⚛ | 2021/09/23 09:37 AM |
Or use a PLB | Linus Torvalds | 2021/09/23 12:01 PM |
Or use a PLB | gpd | 2021/09/24 03:59 AM |
Or use a PLB | Linus Torvalds | 2021/09/24 10:45 AM |
Or use a PLB | dmcq | 2021/09/24 12:43 PM |
Or use a PLB | sr | 2021/09/25 10:19 AM |
Or use a PLB | Linus Torvalds | 2021/09/25 10:44 AM |
Or use a PLB | sr | 2021/09/25 11:11 AM |
Or use a PLB | Linus Torvalds | 2021/09/25 11:31 AM |
Or use a PLB | sr | 2021/09/25 11:52 AM |
Or use a PLB | Linus Torvalds | 2021/09/25 12:05 PM |
Or use a PLB | sr | 2021/09/25 12:23 PM |
Or use a PLB | rwessel | 2021/09/25 03:29 PM |
Or use a PLB | sr | 2021/10/01 12:22 AM |
Or use a PLB | rwessel | 2021/10/01 06:19 AM |
Or use a PLB | David Hess | 2021/10/01 10:35 AM |
Or use a PLB | rwessel | 2021/10/02 04:47 AM |
Or use a PLB | sr | 2021/10/02 11:16 AM |
Or use a PLB | rwessel | 2021/10/02 11:53 AM |
Or use a PLB | Linus Torvalds | 2021/09/25 11:57 AM |
Or use a PLB | sr | 2021/09/25 12:07 PM |
Or use a PLB | Linus Torvalds | 2021/09/25 12:21 PM |
Or use a PLB | sr | 2021/09/25 12:40 PM |
Or use a PLB | nksingh | 2021/09/27 09:07 AM |
Or use a PLB | ⚛ | 2021/09/27 09:02 AM |
Or use a PLB | Linus Torvalds | 2021/09/27 10:20 AM |
Or use a PLB | Linus Torvalds | 2021/09/27 12:58 PM |
Or use a PLB | dmcq | 2021/09/28 10:59 AM |
Or use a PLB | sr | 2021/09/25 10:34 AM |
Or use a PLB | rwessel | 2021/09/25 03:44 PM |
Or use a PLB | sr | 2021/10/01 01:04 AM |
Or use a PLB | rwessel | 2021/10/01 06:33 AM |
I386 segmentation highlights | sr | 2021/10/04 07:53 AM |
I386 segmentation highlights | Adrian | 2021/10/04 09:53 AM |
I386 segmentation highlights | sr | 2021/10/04 10:19 AM |
I386 segmentation highlights | rwessel | 2021/10/04 04:57 PM |
I386 segmentation highlights | sr | 2021/10/05 11:16 AM |
I386 segmentation highlights | Michael S | 2021/10/05 12:27 PM |
I386 segmentation highlights | rwessel | 2021/10/05 04:20 PM |
Or use a PLB | JohnG | 2021/09/25 10:18 PM |
Or use a PLB | ⚛ | 2021/09/27 07:37 AM |
Or use a PLB | Heikki Kultala | 2021/09/28 03:53 AM |
Or use a PLB | rwessel | 2021/09/28 07:29 AM |
Or use a PLB | David Hess | 2021/09/23 06:00 PM |
Or use a PLB | Adrian | 2021/09/24 01:21 AM |
Or use a PLB | dmcq | 2021/09/25 12:41 PM |
Or use a PLB | blaine | 2021/09/26 11:19 PM |
Or use a PLB | David Hess | 2021/09/27 11:35 AM |
Or use a PLB | blaine | 2021/09/27 05:19 PM |
Or use a PLB | Adrian | 2021/09/27 10:40 PM |
Or use a PLB | Adrian | 2021/09/27 10:59 PM |
Or use a PLB | dmcq | 2021/09/28 07:45 AM |
Or use a PLB | rwessel | 2021/09/28 07:45 AM |
Or use a PLB | David Hess | 2021/09/28 12:50 PM |
Or use a PLB | Etienne Lorrain | 2021/09/30 01:25 AM |
Or use a PLB | David Hess | 2021/10/01 10:40 AM |
MMU privileges | sr | 2021/09/21 11:07 AM |
MMU privileges | Linus Torvalds | 2021/09/21 01:49 PM |
Virtually tagged L1-caches | Konrad Schwarz | 2021/09/16 04:18 AM |
Virtually tagged L1-caches | Carson | 2021/09/16 01:12 PM |
Virtually tagged L1-caches | anon2 | 2021/09/16 05:16 PM |
Virtually tagged L1-caches | rwessel | 2021/09/16 06:29 PM |
Virtually tagged L1-caches | sr | 2021/09/20 04:20 AM |
Virtually tagged L1-caches | --- | 2021/09/08 02:28 PM |
Virtually tagged L1-caches | anonymou5 | 2021/09/08 08:28 PM |
Virtually tagged L1-caches | anonymou5 | 2021/09/08 08:34 PM |
Virtually tagged L1-caches | --- | 2021/09/09 10:14 AM |
Virtually tagged L1-caches | anonymou5 | 2021/09/09 10:44 PM |
Multi-threading? | David Kanter | 2021/09/09 09:32 PM |
Multi-threading? | --- | 2021/09/10 09:19 AM |
Virtually tagged L1-caches | sr | 2021/09/11 01:19 AM |
Virtually tagged L1-caches | sr | 2021/09/11 01:36 AM |
Virtually tagged L1-caches | --- | 2021/09/11 09:53 AM |
Virtually tagged L1-caches | sr | 2021/09/12 12:43 AM |
Virtually tagged L1-caches | Linus Torvalds | 2021/09/12 11:10 AM |
Virtually tagged L1-caches | sr | 2021/09/12 11:57 AM |
Virtually tagged L1-caches | dmcq | 2021/09/13 08:31 AM |
Virtually tagged L1-caches | sr | 2021/09/20 04:11 AM |
Virtually tagged L1-caches | sr | 2021/09/11 02:49 AM |
Virtually tagged L1-caches | Linus Torvalds | 2021/09/08 12:34 PM |
Virtually tagged L1-caches | dmcq | 2021/09/09 02:46 AM |
Virtually tagged L1-caches | dmcq | 2021/09/09 02:58 AM |
Virtually tagged L1-caches | sr | 2021/09/11 01:29 AM |
Virtually tagged L1-caches | dmcq | 2021/09/11 08:59 AM |
Virtually tagged L1-caches | sr | 2021/09/12 12:57 AM |
Virtually tagged L1-caches | dmcq | 2021/09/12 08:44 AM |
Virtually tagged L1-caches | sr | 2021/09/12 09:48 AM |
Virtually tagged L1-caches | dmcq | 2021/09/12 01:22 PM |
Virtually tagged L1-caches | sr | 2021/09/20 04:40 AM |
Where do you see this information? (NT) | anon2 | 2021/09/09 02:45 AM |
Where do you see this information? | sr | 2021/09/11 01:40 AM |
Where do you see this information? | anon2 | 2021/09/11 01:53 AM |
Where do you see this information? | sr | 2021/09/11 02:08 AM |
Thank you (NT) | anon2 | 2021/09/11 04:31 PM |