The new exact Cacao Garbage Collector
For a long time Cacao used the BoehmGC and didn't care about its heap. And it was good. But time went by and the need for a new exact GC arose. This is what this page is all about.
The following is a list of pages related to the new GC in any way. Feel free to add new ones here.
GCSpecificChanges: List of changes to Cacao (i.e. arch-specific changes)
HeapUsage: A list of all the objects which are placed onto the heap.
GarbageCollectorTuning: How to tune the old BoehmGC.
WeakReferences: Reasoning about how to implement weak, soft, and phantom references.
ObjectHeader: Design of the header for heap objects.
GCPoints: Basics about GC Points in JIT code.
Some empirical data that might be interesting:
GCTimings: A collection of all the realtime-timings for the exact GC.
Design Plan Overview
The main phases of the design and implementation are listed below. Let's see how closely we can stick to them. Remember that this is meant as a guideline rather than a rule, so don't be too upset if things take a different (but maybe better) course.
- Simple Mark-and-Sweep GC for a single-threaded environment. Keywords: reference-location, root-set
- Introduce GC-Points to extend GC to multi-threaded environments with stop-the-world. Keywords: thread-suspension, code-patching
- Introduce Generations to dramatically speed up the GC. Keywords: copying-algorithm, backward-references
- Test several GC strategies for the old Generation. Keywords: mostly-concurrent, mostly-parallel, incremental
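As an illustration of the first phase, here is a minimal sketch of a mark-and-sweep pass for a single-threaded environment. It is not Cacao code: `toy_object`, `toy_alloc`, `toy_mark`, `toy_sweep`, and `toy_collect` are hypothetical names, objects have a fixed number of reference slots, and the "heap" is just a linked list of `calloc`ed blocks. The root set is passed in explicitly as an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy object; NOT Cacao's java_objectheader. */
#define TOY_MAX_REFS 2

typedef struct toy_object {
    int                marked;              /* mark bit, set during marking */
    struct toy_object *refs[TOY_MAX_REFS];  /* outgoing strong references   */
    struct toy_object *next;                /* links all allocated objects  */
} toy_object;

static toy_object *toy_heap = NULL;   /* list of every allocated object */
static size_t      toy_live = 0;      /* number of objects in the list  */

static toy_object *toy_alloc(void)
{
    toy_object *o = calloc(1, sizeof *o);
    assert(o != NULL);
    o->next  = toy_heap;
    toy_heap = o;
    toy_live++;
    return o;
}

/* Mark phase: recursively mark everything reachable from o.
   The mark bit also stops the recursion on cycles. */
static void toy_mark(toy_object *o)
{
    if (o == NULL || o->marked)
        return;
    o->marked = 1;
    for (int i = 0; i < TOY_MAX_REFS; i++)
        toy_mark(o->refs[i]);
}

/* Sweep phase: free unmarked objects, clear the mark bit on survivors. */
static void toy_sweep(void)
{
    toy_object **link = &toy_heap;
    while (*link != NULL) {
        toy_object *o = *link;
        if (o->marked) {
            o->marked = 0;
            link = &o->next;
        } else {
            *link = o->next;
            free(o);
            toy_live--;
        }
    }
}

/* One collection with an explicit root set; returns the live object count. */
static size_t toy_collect(toy_object **roots, size_t nroots)
{
    for (size_t i = 0; i < nroots; i++)
        toy_mark(roots[i]);
    toy_sweep();
    return toy_live;
}
```

The recursive marking is the simplest possible choice and can overflow the C stack on deep object graphs; a real collector would likely need an explicit mark stack or a pointer-reversal scheme.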
Remember these important things
ClassUnloading is covered in section 12.7 of the Java Language Specification.
Look at classloaders: they need to be kept consistent across collections. All accesses to classinfo.classloader must be changed. Classinfos need a special collection strategy, because they must be collected in one atomic action together with their classloader.
- Heap objects are considered reachable if they can be accessed in one of the following ways. I only care about strong references here. Please feel free to complete this list if you are missing something!
- a non-static field (object variable) of another reachable object
- a static field (class variable) of any class
- a field (of type_adr) in an array (which is reachable)
- a local variable of any running (on the stack) JIT method
- an indirection cell (not yet implemented) of a native method
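Of the cases above, static fields, local variables of running JIT methods, and indirection cells of native methods are roots; object fields and array elements are reached by tracing from them. One possible shape for the root set is a table of reference locations, that is, the address of each slot rather than the reference stored in it, so that a moving collector can rewrite the slot later. The sketch below assumes this design; `toy_obj`, `root_kind`, `root_entry`, `rootset`, `rootset_add`, and `rootset_forward` are hypothetical names and not part of Cacao.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholder for a heap object. */
typedef struct toy_obj { int id; } toy_obj;

/* Root kinds taken from the list above; the names are assumptions. */
typedef enum {
    ROOT_STATIC_FIELD,   /* static field (class variable)          */
    ROOT_STACK_LOCAL,    /* local variable of a running JIT method */
    ROOT_NATIVE_CELL     /* indirection cell of a native method    */
} root_kind;

typedef struct {
    root_kind  kind;
    toy_obj  **slot;     /* address of the reference, not its value */
} root_entry;

#define ROOTSET_MAX 64

typedef struct {
    root_entry entries[ROOTSET_MAX];
    size_t     count;
} rootset;

/* Record one reference location; returns 0 if the table is full. */
static int rootset_add(rootset *rs, root_kind kind, toy_obj **slot)
{
    assert(rs != NULL && slot != NULL);
    if (rs->count == ROOTSET_MAX)
        return 0;
    rs->entries[rs->count].kind = kind;
    rs->entries[rs->count].slot = slot;
    rs->count++;
    return 1;
}

/* After an object moved from 'from' to 'to', rewrite every root slot
   that still points at the old copy. Returns the number of slots updated. */
static size_t rootset_forward(rootset *rs, toy_obj *from, toy_obj *to)
{
    size_t updated = 0;
    for (size_t i = 0; i < rs->count; i++) {
        if (*rs->entries[i].slot == from) {
            *rs->entries[i].slot = to;
            updated++;
        }
    }
    return updated;
}
```

Storing slot addresses costs one extra indirection during marking, but it is what makes a later compaction or copying pass possible without rescanning the stack.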
TOCHECK List
Things that need to be checked. Consider these to be unchecked assertions.
- The content of local variables cannot change inside an ICMD_INVOKE or ICMD_BUILTIN.
- Replacement points are ordered by increasing memory address in code->rplpoints.
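The ordering assertion matters because it would allow a binary search from a program counter to its replacement point. The sketch below shows both a checker for the assumption and the lookup that depends on it. `toy_rplpoint` with a single `pc` field is a hypothetical stand-in for the real entries of code->rplpoints, whose layout is not described on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of code->rplpoints. */
typedef struct {
    uintptr_t pc;   /* address of the point in JIT code */
} toy_rplpoint;

/* Returns 1 if the entries are ordered by non-decreasing address. */
static int rplpoints_sorted(const toy_rplpoint *rp, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (rp[i - 1].pc > rp[i].pc)
            return 0;
    return 1;
}

/* Binary search for an exact address match.
   Returns the index of a matching entry, or -1 if there is none.
   Only valid if rplpoints_sorted() holds. */
static long rplpoints_find(const toy_rplpoint *rp, size_t n, uintptr_t pc)
{
    size_t lo = 0, hi = n;

    assert(rplpoints_sorted(rp, n));
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (rp[mid].pc == pc)
            return (long) mid;
        if (rp[mid].pc < pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

If the assertion turns out to be false, the lookup would have to fall back to a linear scan or the table would need to be sorted once after code generation.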
TODO List
DONE: Fix: Header flags get destroyed by cloning from outside into the heap.
Where should we place uncollectable items?
DONE: Implement simple compaction pass.
Think of a suitable structure for the root set.
Think of a solution for the nasty "recovering local variables" problem.
DONE PARTIALLY: Try to walk down the stack to find references.
DONE: Implement simple recursive marking pass of objects.
DONE PARTIALLY: How should fieldinfos (moved off the heap) and localref_tables on the heap be recognized?
DONE PARTIALLY: Adapt java_objectheader to fit the new ObjectHeader (Comments still missing).
DONE: Wrap a java object around stackframebuffer when placed onto the heap.
DONE: Remove comment about GCNEW(methodinfo, ...) from loader.c.
DONE: Rename src/mm/boehm.h to src/mm/gc.h (or something like that) because it is the header file for every GC.
DONE: Insert ifdefs for all the BoehmGC specific calls of heap_allocate(). See HeapUsage for details.
DONE: Adapt the build system to integrate the new GC.
DONE: Move BoehmGC to src/mm/boehm-gc/