By: Rob Thorpe (rthorpe.delete@this.realworldtech.com), October 23, 2006 2:51 am
Room: Moderated Discussions
Tzvetan Mikov (tmikov@gmail.com) on 10/19/06 wrote:
---------------------------
>The solution is to have a memory barrier after every heap allocation, even if every
>thread uses a separate heap space for its allocations. It completely destroys the
>myth that allocation in a copying GC is as fast as stack allocation.
It's worth mentioning that this old chestnut about allocation in a copying GC is a troublesome myth.
Apart from the GC issue you mention there are lots of other problems with it:-
Checking
--------
Normally the size of the memory arena is restricted. If this is the case then when the free-space-pointer is used a check is needed to make sure it hasn't moved off the top/bottom of the arena. Several of these checks may be compounded together into one by the compiler. Sometimes the same limitation applies to stack allocation, but not always.
Space
-----
In a pure copying GC all objects that aren't on the stack are held in the arena. The GC must be capable of flipping all of the info on one side of the arena onto the other. This means that in the worst case the program will use twice the amount of memory it actually needs. The solution to this is to put large objects such as strings and arrays somewhere else in memory outside the arena being GCed and keep only the pointers. This is a great idea, but if it's done then allocation requires more work than just moving the free-space-pointer.
Speed
-----
Pure copying GCs are very unfriendly to memory. When a flip occurs a huge amount of memory must be traversed. This happens in an unpleasant cache-unfriendly manner too. Since, each pointer in the root set is chased down to it's leaves and everything located is copied elsewhere. So, even if allocation is made very fast, deallocation can be slow. Only if the whole arena fits in cache are things really fast.
In the end the GCs people use are generational ones. Some of these make use of a copying GC arena, but only to provide the first generation. This removes the speed and space problems, but the checking remains. Also, often large objects are still treated specially.
---------------------------
>The solution is to have a memory barrier after every heap allocation, even if every
>thread uses a separate heap space for its allocations. It completely destroys the
>myth that allocation in a copying GC is as fast as stack allocation.
It's worth mentioning that this old chestnut about allocation in a copying GC is a troublesome myth.
Apart from the GC issue you mention there are lots of other problems with it:-
Checking
--------
Normally the size of the memory arena is restricted. If this is the case then when the free-space-pointer is used a check is needed to make sure it hasn't moved off the top/bottom of the arena. Several of these checks may be compounded together into one by the compiler. Sometimes the same limitation applies to stack allocation, but not always.
Space
-----
In a pure copying GC all objects that aren't on the stack are held in the arena. The GC must be capable of flipping all of the info on one side of the arena onto the other. This means that in the worst case the program will use twice the amount of memory it actually needs. The solution to this is to put large objects such as strings and arrays somewhere else in memory outside the arena being GCed and keep only the pointers. This is a great idea, but if it's done then allocation requires more work than just moving the free-space-pointer.
Speed
-----
Pure copying GCs are very unfriendly to memory. When a flip occurs a huge amount of memory must be traversed. This happens in an unpleasant cache-unfriendly manner too. Since, each pointer in the root set is chased down to it's leaves and everything located is copied elsewhere. So, even if allocation is made very fast, deallocation can be slow. Only if the whole arena fits in cache are things really fast.
In the end the GCs people use are generational ones. Some of these make use of a copying GC arena, but only to provide the first generation. This removes the speed and space problems, but the checking remains. Also, often large objects are still treated specially.