By: Stubabe (Stubabe.delete@this.nospam.com), August 21, 2013 4:22 pm
Room: Moderated Discussions
Symmetry (someone.delete@this.somewhere.com) on August 21, 2013 10:51 am wrote:
> Stubabe (Stubabe.delete@this.nospam.com) on August 21, 2013 9:43 am wrote:
> > Interprocess sharing
> > is slow full stop due to the hardware cost of changing paging context. Even if it wasn't, how exactly
> > is breaking up a contiguous dataset into process sized chunks and then having to implement some scheme
> > of tracking and accessing those now non-contiguous sets an advantage EVER??? For unrelated tasks you
> > already had PAE extensions but they didn't help the supervisor address space limits.
>
> I think you're right and ⚛ is wrong, but in certain cases there could be advantages to doing that. The
> transition from 32 bit pointers to 64 bit pointers almost always involves a loss of performance through
> increased memory pressure, the only reason it didn't with x86 is all the other improvements it made to the
> architecture. There are actually people who've put some effort into using 32 bit pointers in 64 bit mode
> for that reason. This mostly only makes sense if you fit into 32 bits of memory, but I'd guess that there
> might be some rare exceptions. Cases where the processes you're breaking up into communicate seldom enough
> that the overhead of that isn't excessive, and where pointer induced cache pressure is large enough. Crazy
> to design the hardware around that use case, but I expect that it does actually exist.
>
Well, if you are only using < 2GB, why are you porting to 64-bit anyway? OK, some OSs only support 64-bit userspace on a 64-bit kernel, but that is an OS limitation, not a limitation of x86-64. In any case, I have personally implemented that very 32-bit pointer optimisation (mainly as a workaround for early Athlons not supporting CMPXCHG16B) on at least one occasion, so you are preaching to the converted. But don't forget that multi-process approaches cost far more than just context-switch overhead compared with multi-threading: there is a lot of OS memory and kernel context involved in creating a process too. Of course there are some RPC-type programming models where multi-process is a natural fit, but that still doesn't mean those individual processes would not benefit from larger address spaces.
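To make the 32-bit pointer trick concrete, here is a minimal sketch of one way it can stand in for a 128-bit compare-and-swap. It is not the code referred to above, only an illustration under stated assumptions: nodes live in a fixed pool and link to each other by 32-bit index rather than by 64-bit pointer, so an index plus a 32-bit ABA tag fits in a single 64-bit word that an ordinary 64-bit CAS can update. The names `pool`, `POOL_SIZE`, `NIL`, `push` and `pop` are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define POOL_SIZE 1024u
#define NIL UINT32_MAX                 /* hypothetical "null" index */

typedef struct {
    _Atomic uint32_t next;             /* 4-byte index instead of an 8-byte pointer */
    int value;
} node_t;

static node_t pool[POOL_SIZE];         /* assumption: nodes are never returned to the OS */

/* Stack head: low 32 bits = index of the top node, high 32 bits = ABA tag. */
static _Atomic uint64_t head = (uint64_t)NIL;

static uint64_t pack(uint32_t idx, uint32_t tag) { return ((uint64_t)tag << 32) | idx; }
static uint32_t idx_of(uint64_t w) { return (uint32_t)w; }
static uint32_t tag_of(uint64_t w) { return (uint32_t)(w >> 32); }

/* Push pool[idx] onto the stack using a plain 64-bit CAS. */
void push(uint32_t idx) {
    assert(idx < POOL_SIZE);
    uint64_t old = atomic_load(&head);
    uint64_t desired;
    do {
        atomic_store_explicit(&pool[idx].next, idx_of(old), memory_order_relaxed);
        desired = pack(idx, tag_of(old) + 1);   /* bump the tag on every update */
    } while (!atomic_compare_exchange_weak(&head, &old, desired));
}

/* Pop the top node and return its index, or NIL if the stack is empty. */
uint32_t pop(void) {
    uint64_t old = atomic_load(&head);
    uint64_t desired;
    do {
        if (idx_of(old) == NIL)
            return NIL;
        uint32_t next = atomic_load_explicit(&pool[idx_of(old)].next,
                                             memory_order_relaxed);
        desired = pack(next, tag_of(old) + 1);
    } while (!atomic_compare_exchange_weak(&head, &old, desired));
    return idx_of(old);
}
```

With full 64-bit pointers the same pointer-plus-tag pair would need a 16-byte CAS, which is where CMPXCHG16B comes in. The price of the index scheme is that everything reachable this way has to live inside one pool of at most 2^32 slots.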
I have also used AWE and would generally sum it up as slow and cumbersome, with gotchas such as physically locked, non-paged commits (to simplify the OS implementation), which often means elevated privileges are needed. Not nice... And yes, DBs often exploit AWE (because it is still far better than hitting disk), but even then native 64-bit implementations usually scale (and perform?) better, not to mention that kernel address space starts becoming an issue when you start filling it up with PTEs on large PAE systems.
So yes, while there are some narrow use cases where these schemes work well, I could easily list quite a few where they really don't. In any case, do we all really want to be implementing IPC or swapping managers in our code just to avoid 64-bit pointers? Having done a bit of both, I'll take a bit of pointer-size hacking over that any day.