By: , August 25, 2013 8:20 am
Room: Moderated Discussions
rwessel (robertwessel.delete@this.yahoo.com) on August 23, 2013 10:58 pm wrote:
> Sebastian Soeiro (sebastian_2896.delete@this.hotmail.com) on August 22, 2013 4:52 pm wrote:
> > rwessel (robertwessel.delete@this.yahoo.com) on August 20, 2013 12:08 am wrote:
> > > Sebastian Soeiro (sebastian_2896.delete@this.hotmail.com) on August 19, 2013 3:28 pm wrote:
> > > >
> > > > Thanks, this reply definitely clears up quite a few loose ends!
> > > >
> > > > - So to my understanding, this "special register" simply lists the physical address of the base
> > > > of the array of the process currently being worked on. Correct? (I assume the physical address of
> > > > the base would be needed so that things can be "scaled" from the beginning of the array up?)
> > >
> > >
> > > Typically the special register (CR3 on x86, for example) simply contains the physical address of the root
> > > of the page table structure. On x86 that's the first entry of the page directory. In the illustration above
> > > (for x86 in 32-bit, non-PAE mode), the page directory contains 1024, four byte, entries. Each of those entries
> > > can be marked invalid, in which case the 4MB region they would map is not valid, or is marked valid, in which
> > > case the PDE points to a page table, which also has 1024, four byte, entries, each one valid or invalid.
> > > Valid ones contain a translation to a physical page address, the invalid ones don't.
> > >
> > >
> > > > - You definitely gave quite an informative chunk about page tables; but I don't believe I figured
> > > > out the answer to one of my questions: If the LSU compares the TLB entries to the first line of the
> > > > L1 cache and does NOT find a match; where does it go from there? Does it go to main memory and use
> > > > the page tables to locate the needed information, or does it scale from L1 to L2 to L3, etc.?
> > >
> > >
> > > The LSU does not compare TLB entries to anything. After the TLB is used to translate a virtual address
> > > to a physical address, the physical address is used to look up the requested cache line in the L1 cache.
> > > If it's not there, the (same) physical address is used to check the L2, then the L3, L4 (if present),
> > > etc., and ultimately main memory. After the translation happens (once), it doesn't happen again.
> >
> > Thanks and sorry for the late replies; I don't mean to make it a habit, work
> > has just been eating up a lot of my time as the summer comes to a close.
> >
> > - So the "first entry" of the root of the page table can be variable depending on what addresses
> > are invalid, yes? So if entries 1-61 are invalid, then the root would be 62, and the special
> > register would keep this physical address documented; yes?
>
>
> Not on x86. Assuming the two level page table format, each four byte entry in the page
> directory covers 4MB. If the first 244MB (61 entries in the PD) are not to be mapped,
> then those 61 entries will be marked invalid, and they will not point to page tables*.
>
> The Page Directories and Page Tables the PD points to are each a single page (4KB) in size. Some other
> architectures have allowed some variability in the size of the structures, but x86 really does not.
>
>
> *Actually they could, but then you'd need each page table to
> have 1024 four byte entries marked invalid for each page.
>
>
> > - I mean to say in the case that there is a TLB miss. In that
> > case, would it scale up chronologically? L1 to L2 to L3, etc?
>
>
> If there's a TLB miss, the page table walker (whatever mechanism that might be), will step through the page
> tables (in whatever format those might be), to find the actual translation. On most architectures those
> accesses are fairly ordinary memory accesses, and thus can be cached in the ordinary cache hierarchy.
>
> On x86, with a two level page table, let's say CR3=0x12345000 (thus defining where the page directory is)
> and the virtual address we want to look up is 0x6789abcd. That's split into three components - the 0xbcd
> is the offset in the page we want to access (and thus does not require translation). The upper part is split
> in half*, with 0x19e being the index into the page directory, and 0x09a being the index into the page table.
> So the CPU will read the four byte page directory entry at address 0x12345678 (0x12345000+4*0x19e). Let's
> say that entry is marked valid, and points to a page table at 0xaaaaa000. The second index is then used
> to read the page table entry at 0xaaaaa268 (0xaaaaa000+4*0x09a). That page table entry, if marked valid,
> contains the address of the actual physical page mapped to that virtual address. Let's say that's 0xeeeee000;
> the final memory access would then be performed to 0xeeeeebcd (0xeeeee000+0xbcd).
>
>
> *0x6789a = 0110 0111 1000 1001 1010, split in half that's 01 1001 1110 and 00 1001 1010, or 0x19e and 0x09a
>
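To help myself follow the worked example quoted above, here is a minimal Python sketch of the same two-level walk (32-bit x86, non-PAE, 4KB pages). It is only an illustration under assumptions, not how any real hardware or OS is written: the names split_virtual_address and translate are made up, and "memory" is just a dictionary standing in for physical RAM. The entries in it are the example values from the quote, with bit 0 set to mark them valid.

```python
PAGE_OFFSET_BITS = 12      # low 12 bits: offset inside a 4KB page
INDEX_BITS = 10            # 10 bits each for the PD index and the PT index
ENTRY_SIZE = 4             # each PDE/PTE is four bytes
PRESENT = 0x1              # bit 0 of an entry marks it valid
FRAME_MASK = 0xFFFFF000    # upper 20 bits of an entry: physical page address


def split_virtual_address(vaddr):
    """Split a 32-bit virtual address into (pd_index, pt_index, offset)."""
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    pt_index = (vaddr >> PAGE_OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    pd_index = (vaddr >> (PAGE_OFFSET_BITS + INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    return pd_index, pt_index, offset


def translate(cr3, vaddr, memory):
    """Walk the two-level table; return the physical address or None if unmapped.

    `memory` is a hypothetical stand-in for RAM: a dict mapping the physical
    address of an entry to its 32-bit value. Missing keys read as 0 (invalid).
    """
    pd_index, pt_index, offset = split_virtual_address(vaddr)

    # Step 1: read the page directory entry.
    pde = memory.get(cr3 + ENTRY_SIZE * pd_index, 0)
    if not pde & PRESENT:
        return None  # the whole 4MB region is not mapped

    # Step 2: read the page table entry from the table the PDE points to.
    pte = memory.get((pde & FRAME_MASK) + ENTRY_SIZE * pt_index, 0)
    if not pte & PRESENT:
        return None  # this 4KB page is not mapped

    # Step 3: the offset is carried over untranslated.
    return (pte & FRAME_MASK) + offset


# Values from the quoted example, with the valid bit set on each entry.
memory = {
    0x12345678: 0xAAAAA000 | PRESENT,  # PDE at CR3 + 4*0x19e
    0xAAAAA268: 0xEEEEE000 | PRESENT,  # PTE at 0xaaaaa000 + 4*0x09a
}

pd_index, pt_index, offset = split_virtual_address(0x6789ABCD)
print(hex(pd_index), hex(pt_index), hex(offset))
print(hex(translate(0x12345000, 0x6789ABCD, memory)))
```

The "offset" here is just the low 12 bits of the address, i.e. the position of the byte inside its 4KB page, which is why it passes through without translation.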
Thank you very much again for explaining these things to me. Unfortunately, a lot of it went over my head. I don't know exactly why, but the second part seemed to be fairly above my knowledge. Maybe it's because I don't know exactly how these addresses are formed and what each part is. To be honest, I don't even know what an "offset" is.
- So basically, every time there is a TLB miss, the page table walker has to go straight to RAM, obtain the virtual (or physical) address, and then trace back from L1 to L2 to L3 trying to identify the entry using the address the page table provided?
- So the "root" is not the root of the entire memory hierarchy, but the root of a page in RAM?
Thanks again, and sorry for losing track of things.