By: Tzvetan Mikov (tmikov.delete@this.gmail.com), October 19, 2006 9:04 am
Room: Moderated Discussions
sp (no@thanks.com) on 10/19/06 wrote:
---------------------------
>One can certainly respect Leroy's design trade-offs for the last decade, and he
>has produced a wonderful system. But since the time of that message, the "last convulsion"
>of SMP has been quite reversed. Dual and soon quad core will be commonplace, as
>will dual socket machines in the small clusters I deal a lot with (up to 32 or 64 nodes, not capability sized machines).
>
>I'd happily take a moderate single threaded hit on GC if it meant I could scale
>out to use the 4-8 cores per node, even if I end up using mpi between nodes. Much
>as I might use openmp and mpi in traditional languages. For my programs, a hit in
>GC performance would be compensated for with an improvement in vector performance.
>
>Thinking of GCs, this raises another question for me. Why can the independent threads
>not use independent (sequential) GC+heaps? The majority of my allocations and collection
>would be of non-shared data. For shared data, I'll live with a performance hit where
>sharing involves a somewhat more expensive runtime interaction to hand the data off for management in a shared GC.
>
>On a 32 bit machine I can imagine running into address space limits when carving
>up the space for multiple heaps, but on 64 bit machines I surely have address space
>to burn - and my actual memory usage will be the same.
>
>What are the downsides to such a scheme?
I believe IBM's JVM uses something like that:
http://www-128.ibm.com/developerworks/ibm/library/i-garbage1/
It isn't a separate heap per se, but it allows faster allocation of small objects.
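
To make the idea concrete, here is a rough Python sketch of what I take "not a separate heap per se, but faster allocation of small objects" to mean: each thread reserves a chunk of the shared heap under a lock, then hands out small objects from that chunk with a plain pointer bump and no locking. This is only my assumption about the general technique, not IBM's actual implementation; the names SharedHeap, ThreadLocalAllocator and run_demo, the chunk size and the object sizes are all made up for illustration.

```python
import threading


class SharedHeap:
    """Toy shared heap: hands out address ranges under a single lock."""

    def __init__(self, size):
        self.size = size
        self.next_free = 0
        self.lock = threading.Lock()
        self.lock_acquisitions = 0  # counts trips through the slow path

    def reserve(self, nbytes):
        with self.lock:
            self.lock_acquisitions += 1
            if self.next_free + nbytes > self.size:
                # A real runtime would trigger a collection here.
                raise MemoryError("shared heap exhausted")
            start = self.next_free
            self.next_free += nbytes
            return start


class ThreadLocalAllocator:
    """Hypothetical per-thread cache: bump-allocates inside a private chunk."""

    def __init__(self, heap, chunk_size=4096):
        self.heap = heap
        self.chunk_size = chunk_size
        self.cursor = 0  # next free address in the current chunk
        self.limit = 0   # end of the current chunk (empty to start with)

    def allocate(self, nbytes):
        if nbytes > self.chunk_size:
            # Large objects bypass the cache and go to the shared heap.
            return self.heap.reserve(nbytes)
        if self.cursor + nbytes > self.limit:
            # Slow path: chunk exhausted, take a fresh one under the lock.
            self.cursor = self.heap.reserve(self.chunk_size)
            self.limit = self.cursor + self.chunk_size
        # Fast path: no lock, just bump the pointer.
        addr = self.cursor
        self.cursor += nbytes
        return addr


def run_demo(num_threads=4, objects_per_thread=1000, object_size=16):
    heap = SharedHeap(size=1 << 20)
    results = [None] * num_threads

    def worker(index):
        allocator = ThreadLocalAllocator(heap)
        results[index] = [allocator.allocate(object_size)
                          for _ in range(objects_per_thread)]

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return heap, results
```

With the defaults, run_demo() makes 4000 allocations but takes the shared lock only 16 times (four 4096-byte chunks per thread), and no two objects get the same address. Note that collection is still a matter for the one shared heap, so this speeds up allocation without giving each thread an independent GC.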