Or use a PLB

By: Heikki Kultala (heikki.NOSPAM.kultala.delete@this.gmail.com), September 28, 2021 2:53 am
Room: Moderated Discussions
⚛ (0xe2.0x9a.0x9b.delete@this.gmail.com) on September 27, 2021 7:37 am wrote:

> (I am writing this response without reading through the sub-tree of this forum node, so I would
> like to apologize in advance if I will write something that has been discussed there.)
> You didn't write something unexpected, which is good from a certain viewpoint because it makes it easier to
> attack with some counter-arguments. You are defending the "paging-unit is superior to segments" viewpoint.

There is no such thing as a "paging unit". So you are building quite a large straw man here.

> - Protected mode segments (80286 and later), with two privilege levels (kernel-space
> and user-space), are fully sufficient to implement a Unix-like operating system.

... how do you handle fork?

With really terrible performance overhead?

> - A protected mode segment has a base address and a limit (usually with a granularity
> of at most 1 word) that only the kernel-space can modify directly.
> - A downside of having a paging unit in a CPU is that the CPU must have L1+L2 caches (TLBs) storing virtual
> memory mappings (which is, in simplified terms, a hash-map implemented in hardware)
> and address-generation
> units (AGUs) which are separate from ALUs.

Nothing forces the use of a separate AGU when paging is used.

Actually, you got this part totally backwards. With paging, a single adder is enough to serve as both ALU and AGU. With segments, you always need at least two adders.
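To make the adder count concrete, here is a minimal Python sketch of the two address computations. The function names, the `Fault` exception and the 4 KiB page size are assumptions made up for this illustration; real hardware does this in parallel logic, not sequential code.

```python
PAGE_SHIFT = 12                      # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class Fault(Exception):
    """Hypothetical stand-in for a protection fault or TLB miss."""


def segmented_address(seg_base, seg_limit, reg_base, displacement):
    # Adder 1: the ordinary effective address (base register + displacement).
    offset = reg_base + displacement
    # Comparator: the offset must not exceed the segment limit.
    if offset > seg_limit:
        raise Fault("segment limit exceeded")
    # Adder 2: segment base + offset gives the linear address.
    return seg_base + offset


def paged_address(tlb, reg_base, displacement):
    # The only adder: base register + displacement.
    vaddr = reg_base + displacement
    vpn = vaddr >> PAGE_SHIFT
    if vpn not in tlb:
        raise Fault("TLB miss")
    # No addition here: the frame number simply replaces the upper bits,
    # and the page offset passes through unchanged.
    return (tlb[vpn] << PAGE_SHIFT) | (vaddr & PAGE_MASK)
```

In the paged case the translation is a lookup plus bit concatenation, which is why the same adder can serve ordinary arithmetic and address generation.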

> In contrast to this, segments are much simpler: a linear GDT
> (global descriptor table, cached in L1D like any other data), adders, and comparators.

Nothing really prevents caching page table entries in the L1D or L2 cache like any other data, though storing page table entries in L1D does not make sense from a performance point of view. L2 is a very common place for page table entries.

And caching that GDT in L1D is not enough for good performance. You have to load the segment you use into a segment register, which means bad overhead when accessing different segments (practically, every time you access a wild pointer).

> - Paging makes it more complicated to add more AGUs (in order to increase the number of loads&stores per
> cycle) compared to segment registers because the paging unit is logically a single shared resource

Totally wrong.

Each load/store unit (LSU) can have its own TLBs. A practical implementation is that each LSU has its own L1 TLB, the L2 TLB is shared between all the LSUs of a core, and the L2 TLB is then fed from the L2 cache.

Or each LSU can have its own L1 and L2 TLBs, with an L3 TLB shared between all LSUs of a core.

The page walker is practically always shared between LSUs, but there is actually nothing forcing it. It would just make zero sense to have multiple page walkers per core because they are used so rarely, because the TLBs work so well.

So there is absolutely zero need to share anything between the different LSUs on a core with paged VM. Things are shared only to make them smaller and faster, and the sharing restricts/limits nothing.
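The first arrangement above (private L1 TLB per LSU, shared L2 TLB, one shared page walker) can be sketched roughly as follows. The `Core` class and its fields are hypothetical names for this illustration, the TLBs are plain dictionaries with no capacity limit or eviction, and the page table is a simple dictionary standing in for the in-memory tables.

```python
class Core:
    """Toy model: one private L1 TLB per LSU, one shared L2 TLB, one walker."""

    def __init__(self, page_table, n_lsus=2):
        self.page_table = page_table                      # vpn -> pfn
        self.l1_tlbs = [dict() for _ in range(n_lsus)]    # private per LSU
        self.l2_tlb = {}                                  # shared by all LSUs
        self.walks = 0                                    # page-walker uses

    def translate(self, lsu, vpn):
        l1 = self.l1_tlbs[lsu]
        if vpn in l1:                 # private hit: touches nothing shared
            return l1[vpn]
        if vpn not in self.l2_tlb:    # shared L2 TLB miss: walk the table
            self.walks += 1
            # A missing mapping raises KeyError, standing in for a page fault.
            self.l2_tlb[vpn] = self.page_table[vpn]
        l1[vpn] = self.l2_tlb[vpn]    # refill this LSU's private L1 TLB
        return l1[vpn]
```

In this model, once both LSUs have touched a page, every later access to it is served from a private structure, so the LSUs do not contend with each other on the common path.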

>- while
> segment registers (registers+adders+comparators) are more distributed/parallel in their nature.

Totally wrong/opposite. There is a fixed number of segment registers in the architecture, limiting how many totally separate pointers we can use without very expensive reloading of those segment registers.

> - An instruction set architecture probably needs to support at least 8 segment registers in order to
> be future-proof and to make it a bit easier for the kernel and the user-space to manage memory.

The best part is no part. With paging, there is no need to put any limits of this kind into the architecture.

> - An issue on amd64/x86-64 is that it is impossible to disable the paging unit and to disable-or-repurpose
> AGUs so that, if you use just segments to implement an OS, you wouldn't need to pay the *implicit*
> tax associated with page translation.

What tax are you talking about? The “huge” tax of accessing an L0 TLB?

x86-64 supports a page size of one gigabyte. With that page size, everything fits in the L0 TLBs.
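The arithmetic behind this is just TLB reach = number of entries × page size. A small sketch follows; the entry counts are made-up example values, not figures for any real CPU.

```python
KIB, MIB, GIB = 1 << 10, 1 << 20, 1 << 30


def tlb_reach(entries, page_size):
    """Bytes of address space covered without a TLB miss."""
    return entries * page_size


# Hypothetical tiny first-level TLB with only 4 entries:
small_pages = tlb_reach(4, 4 * KIB)   # 4 KiB pages
huge_pages = tlb_reach(4, 1 * GIB)    # 1 GiB pages
```

With these example numbers, four entries cover 16 KiB with 4 KiB pages but 4 GiB with 1 GiB pages, which is the sense in which "everything fits".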

> - Protected mode segments are fully sufficient to implement virtual memory, because exceeding
> the segment's limit is equivalent to a page fault. When a user-space instruction accesses
> memory beyond the segment's limit, the kernel can for example load missing data from a HDD/SSD
> drive, extend the limit of the segment in the global descriptor table (GDT on 80286+), and
> switch back to user-space which will re-execute the memory access instruction.

“Sufficient” as in possible, yes.

And this should be trivial to everybody. You are not adding anything to the discussion by saying it. There was an OS/2 for the 286 launched in 1987.

What you don’t seem to understand is that even though it’s POSSIBLE, it’s terribly inefficient and troublesome for the programmer.

> - Due to the linear nature of segments, segments are prone to memory fragmentation. This is *both*
> an issue and an advantage. One advantage for example is that, after memory defragmentation, related
> data is located closer together and related code is located closer together (in other words: segmentation
> can result in better L1D and L1I cache utilization compared to paging).

Again, exactly the other way around. VM pages are always much bigger than cache lines, so VM fragmentation has ZERO effect on cache utilization.

“After memory defragmentation.” So you are seriously proposing wasting a huge amount of cycles and bandwidth defragmenting your memory. And then you consider it an “advantage”.

It’s not.

> - Using protected mode segments to implement an OS requires a different style of thinking (different rules)
> than implementing an OS on top of paging. Going the way of segments means that the OS and user-space processes
> need to be more dynamic (i.e: somewhere between static compilation and just-in-time compilation).

This is just handwaving.

> - About pointers in the C programming language: C has a single pointer type. Nowadays, it tends
> to be either 32 or 64 bits in width. C doesn't support mixing 32-bit and 64-bit pointers in a single
> program because this would require the language to support two pointer types. Using '*' asterisk
> in the syntax of pointer types in C is, in hindsight, a clearly suboptimal programming language design
> decision both from run time performance viewpoint and from memory consumption viewpoint.

Totally wrong again.

The world is full of C compilers which support or have supported multiple pointer sizes. And it’s a huge pain in the ass for the programmer.

Programmers are very happy that now they do not HAVE to use those.

> - Use of segments to implement an OS (without paging) depends on whether the programming language
> used to implement user-space apps supports multiple pointer types. For example, without multiple
> pointer types it is more complicated to safely distinguish a pointer to a local function from
> a pointer to a shared library function, or a pointer to local data from a pointer to shared data
> (irrespective of whether the pointers have different widths or the same width).

No, it has nothing to do with what the programming language supports. Adding this kind of support is quite trivial and can be done to an existing language quite easily, but NOBODY WANTS TO DO IT because it’s so TERRIBLE FOR THE PROGRAMMER.

> - If you would think about it more deeply and would not be resistant to thinking about large changes
> to an OS architecture, you wouldn't believe that (as you wrote) "C not wanting segments is not the cause,
> it's the correlation". You would believe the opposite: that programming languages with a single pointer
> type (which includes C) are among the primary causes of the downfall of segments.

Totally wrong again. The stupidity of segments was understood WHEN PEOPLE HAD TO USE THOSE MULTIPLE POINTER TYPES BECAUSE OF SEGMENTS.

> Eventually, you might
> also start to believe that there is a causal link between C's type system and the maximum number of
> loads&stores a CPU core can perform per cycle assuming a particular transistor budget.
> - Your claim that "So what I am claiming is that segments [cut] are a bad idea. They add unwanted
> complexity that doesn't actually buy users anything at all into one of the most important part
> of the CPU pipeline - the memory units" - is a false statement. It is totally obvious that the
> hardware implementation of segments is simpler than the hardware implementation of paging,

Again, wrong. Your segment registers are always in the instruction set and add another adder to your memory address pipeline. This adder means an extra source register read.

Paging can be done with zero changes to the existing user-mode instruction set, and zero changes to existing user-mode software that assumes physical linear addressing.

The simplest paging MMU is a single TLB which just raises a page fault on a miss. From the point of view of the datapath, this is less complexity than the extra adder of segments. It also allows a VIPT L1D cache, which makes memory accesses that hit the cache faster.
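The VIPT point can be illustrated with a small sketch. Because paging leaves the page-offset bits untouched, a cache whose index bits all lie within the page offset can start its set lookup from the virtual address while the TLB translates in parallel. The function names and the cache geometry below (32 KiB, 8-way, 64-byte lines, 4 KiB pages) are assumptions chosen for the example.

```python
def vipt_index_within_page(cache_bytes, ways, page_bytes):
    # All index + line-offset bits fit inside the page offset when
    # one way of the cache is no larger than a page.
    return cache_bytes // ways <= page_bytes


def set_index(addr, line_bytes, n_sets):
    # Which cache set an address maps to.
    return (addr // line_bytes) % n_sets


CACHE_BYTES, WAYS, LINE_BYTES, PAGE_BYTES = 32 * 1024, 8, 64, 4096
N_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)    # 64 sets

vaddr = 0x1234      # virtual address, page offset 0x234
paddr = 0x80234     # its translation: different frame, same page offset

same_set = set_index(vaddr, LINE_BYTES, N_SETS) == set_index(paddr, LINE_BYTES, N_SETS)
```

With a segment base that is not page-aligned, the base + offset addition can change the low bits as well, so the cache index is not known until that addition has completed.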

I can run (multiple instances of) exactly the same software directly on hardware with no memory protection, or on a paged virtual machine inside a multitasking OS.

> and
> it has a higher potential for concurrency inside of a CPU than paging.

Again, exactly the opposite.
