By: wumpus (wumpus.delete@this.lost.in.a.hole), May 5, 2021 8:06 am
Room: Moderated Discussions
anon2 (anon.delete@this.anon.com) on May 3, 2021 1:17 am wrote:
> Ben LaHaise (bcrl.delete@this.kvack.org) on May 2, 2021 10:45 am wrote:
> > Yuhong Bao (yuhongbao_386.delete@this.hotmail.com) on May 1, 2021 1:01 pm wrote:
> > > The fun thing is that 4K pages probably used to be too large. On a 80386, just 8 tasks would consume
> > > at least 64k and probably 128k just for the page tables alone. (80386 page tables were two levels)
> >
> > The National Semiconductor 32016 had 512 byte page sizes. The problem is that overhead of small
> > page sizes becomes excessive as soon as you have more than a couple of megabytes of memory.
> > With 16MB of RAM and 512 byte pages that works out to 65536 pages for which data structures
> > to track all the individual pages are needed.
>
> That's what Linux does, but it is not necessarily the best size/speed tradeoff for very
> small memory systems. Tracking per page data with 4k pages and 64 bytes per page is
> still 25% the overhead of your "unviable" solution, which doesn't sound great when you
> put it that way. That being said I don't think it's necessarily bad at all.
>
>
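Putting rough numbers on the quoted figures first (the 64-byte struct size is anon2's, the rest is my arithmetic, so treat it as a sketch; the 16MB / 512B case incidentally comes out to 32768 pages):

MB = 1024 * 1024

# 80386, two-level paging: each task needs a 4KB page directory plus
# at least one 4KB page table, so 8 tasks cost at least 64KB.
print(8 * (4096 + 4096) // 1024, "KB minimum for 8 tasks")

# Per-page tracking overhead with 16MB of RAM and 64 bytes of
# metadata per page.
for page_size in (512, 4096):
    pages = 16 * MB // page_size
    print(page_size, "B pages:", pages, "pages,",
          pages * 64 // 1024, "KB of metadata")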
Assuming you still want to keep L1 "way size" = page size, 512-byte pages give you 8 cachelines per "way" (with 64-byte lines). I think ARM once made a 32-way L1 cache that they claimed was faster at 32 ways, but that was certainly the exception.
Do you want "60-way" caches? Add some sort of inital TLB lookup to the L1 latency (which of course would require more entries, because smaller pages)?
It might have worked for the 386, but I think the more advanced RISCs and the P6 would have had a lot of difficulty with L1 cache design and fast TLB lookup, and things would only have gotten worse from there. And I'm really guessing that HDD transfer rates were the real reason for the 4k size.
Even modern disk drives use 4k sectors, although I'm sure that has more to do with newer ECC algorithms (and the need for their efficiency) than with any underlying preference for 4k pages.