XPCOMGC: Difference between revisions

2,870 bytes added ,  19 September 2008
Major status update
XPCOMGC is the [[Mozilla 2]] project to convert the XPCOM object model from reference counting to use garbage collection, and unify with the JS engine memory management.


== General Info ==


Tracking is {{bug|XPCOMGC}}.
 
== Rationale ==
 
'''We need to be able to free cycles of XPCOM objects.'''  Mozilla 1.9 introduced a cycle collector to do just that.  We've already removed the hacks that used to avoid creating reference cycles, so we now depend on it.
 
'''The cycle collector should be replaced with a real GC.'''  The cycle collector is complex.  It [[mdc:Interfacing_with_the_XPCOM_cycle_collector|requires cooperation]] from the objects that might appear in cycles.  It interacts with the JS garbage collector in a way that verges on black magic.
 
A single true GC covering both XPCOM objects and JS is a much more direct approach.  It will not make it much easier to debug memory leaks or crashes, but the GC itself will be faster and easier to maintain, and client code will be simpler.
 
The big advantage of the cycle collector was in software engineering: it did not touch any of the reference counting (the addrefs and releases) in XPCOM client code.  Now that we have better static analysis and rewriting tools, we can consider true GC.
 
'''We can't use a Java-style copying GC.'''  The fastest GCs are generational copying GCs.  But a fully copying GC is unsuitable for C++.  When GC happens, if any pointers are in registers or stack locations that the GC doesn't know about, the objects they point to must not be moved.  The only solutions are:
 
* Get information from the compiler about stack locations.  (This is what Java does; the information is supplied by the JIT.  It's basically impossible to do this for C++.)
 
* Conservatively scan the stack for pointers to GC-managed memory and don't move those objects.
 
* Don't use a GC that moves stuff around in memory.  (This seems like the best approach.)
 
The amazing speed of a copying, generational GC depends on its being able to move all objects out of a generation.  Then that whole region of memory is available for new allocations, which leads to an incredibly fast allocation routine—a few instructions in some cases.  This kind of design is not viable for Mozilla, given our dependence on C++.
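The "few instructions" allocation path mentioned above can be sketched as a bump allocator. This is only an illustration of the general technique, not code from any Mozilla or Java collector; the class name <code>BumpNursery</code>, its methods, and the 8-byte alignment are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical nursery: one contiguous region handed out by bumping a cursor.
class BumpNursery {
public:
    BumpNursery(std::uint8_t* start, std::size_t size)
        : start_(start), cursor_(start), end_(start + size) {
        assert(start != nullptr);
    }

    // Fast path: round the size up, compare against the space left, bump.
    // Returns nullptr when the region is full; a real copying collector
    // would evacuate the survivors at that point and then call reset().
    void* allocate(std::size_t bytes) {
        if (bytes > static_cast<std::size_t>(end_ - cursor_)) return nullptr;
        const std::size_t aligned = (bytes + kAlign - 1) & ~(kAlign - 1);
        if (aligned > static_cast<std::size_t>(end_ - cursor_)) return nullptr;
        void* result = cursor_;
        cursor_ += aligned;
        return result;
    }

    // Only valid if every live object has been moved out of the region.
    // A pointer the collector does not know about makes this unsafe.
    void reset() { cursor_ = start_; }

    std::size_t used() const {
        return static_cast<std::size_t>(cursor_ - start_);
    }

private:
    static constexpr std::size_t kAlign = 8;  // assumed alignment
    std::uint8_t* start_;
    std::uint8_t* cursor_;
    std::uint8_t* end_;
};
```

The <code>reset()</code> step is the part that C++ rules out: it is only safe when nothing still points into the region, which is exactly what unknown stack and register pointers prevent.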
 
'''The GC should have conservative stack scanning.'''  This makes life a lot easier for developers, who do not have to register pointers held on the stack with the collector by hand.
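As a rough sketch of what conservative scanning means: every word in a range of memory is treated as a possible pointer, and any GC-managed block it lands in is kept alive and not moved. The <code>HeapBlock</code> struct and <code>conservativeScan</code> function below are hypothetical names for illustration. The "stack" here is just an array of words, whereas a real collector would walk the thread's actual stack and saved registers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one GC-managed block.
struct HeapBlock {
    std::uintptr_t start;  // address of the first byte
    std::size_t size;      // length in bytes
    bool pinned;           // set when a possible pointer refers to it
};

// Treats each word in [words, words + count) as a candidate pointer and
// pins the block it falls inside, if any. Pointers into the middle of a
// block count too. Returns the number of blocks newly pinned.
std::size_t conservativeScan(const std::uintptr_t* words, std::size_t count,
                             std::vector<HeapBlock>& heap) {
    assert(words != nullptr || count == 0);
    std::size_t newlyPinned = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const std::uintptr_t candidate = words[i];
        for (HeapBlock& block : heap) {
            if (candidate >= block.start &&
                candidate - block.start < block.size) {
                if (!block.pinned) {
                    block.pinned = true;
                    ++newlyPinned;
                }
                break;  // blocks are assumed not to overlap
            }
        }
    }
    return newlyPinned;
}
```

An integer that merely looks like a heap address pins a block as well. That is the price of not needing any information from the compiler.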
 
'''The GC has to support existing uses of threads in Mozilla.'''  This pushes us toward using the JS [[mdc:JS_THREADSAFE|request model]].
 
'''The GC needs incremental or generational collection''' to avoid long pauses, which would regress perceived responsiveness.
 
'''Interoperability with languages other than C/C++ and JavaScript is not a priority.'''  [[mdc:PyXPCOM|Python]] will have to catch up later, if anyone is willing and able to carry it forward.
 
== Building XPCOMGC ==
 
XPCOMGC work is currently taking place in the following Mercurial repository: http://hg.mozilla.org/users/bsmedberg_mozilla.com/gcmonkey
 
It is being maintained as a linear set of changes on top of mozilla-central, rebased relatively frequently. You will therefore see old heads in the repository and can ignore them (they are dead heads, though Mercurial doesn't have a way to mark them as such).
 
== Current Status ==
 
* The build only works on Linux.
* The Boehm collector is inserted as a replacement for the C allocator (malloc/free).
* malloc/free allocations are treated as "uncollectable": they are scanned for pointers but are never freed by the collector.
* XPCOM string buffers (nsStringBuffer) have been made collectable. They are no longer refcounted; instead they become immutable once shared.
* The build runs and seems to perform "ok", but quantified performance numbers are still pending.
* Memory usage is 50-100% higher than with the jemalloc allocator.
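The immutable-on-share idea for string buffers can be sketched roughly as follows. This is not the nsStringBuffer implementation: the <code>Buffer</code> class, <code>share</code>, and <code>appendTo</code> are hypothetical, and <code>std::shared_ptr</code> stands in for a collector-managed reference only so that the example frees its memory.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical buffer that may be mutated only until it is first shared.
class Buffer {
public:
    explicit Buffer(std::string data) : data_(std::move(data)) {}

    const std::string& data() const { return data_; }
    bool isShared() const { return shared_; }

    // One-way transition: without a reference count there is no way to
    // learn that the buffer has become unshared again.
    void markShared() { shared_ = true; }

    void append(const std::string& suffix) {
        assert(!shared_);  // shared buffers are immutable
        data_ += suffix;
    }

private:
    std::string data_;
    bool shared_ = false;
};

// Stand-in for a GC reference (assumption made for this sketch only).
using BufferRef = std::shared_ptr<Buffer>;

// Hands out a second reference and freezes the buffer.
BufferRef share(const BufferRef& buf) {
    buf->markShared();
    return buf;
}

// Appends in place while the buffer is private, otherwise copies first.
BufferRef appendTo(const BufferRef& buf, const std::string& suffix) {
    if (!buf->isShared()) {
        buf->append(suffix);
        return buf;
    }
    BufferRef copy = std::make_shared<Buffer>(buf->data());
    copy->append(suffix);
    return copy;
}
```

Compared with refcounted copy-on-write, this design may copy more often, because a buffer that was shared once stays frozen even after the other holder is gone.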
 
== Old Information ==


A previous attempt at this project was made using the MMgc conservative collector. Because this collector requires programmatic write barriers, and for other reasons, this attempt was abandoned (though we learned a lot!).

TODO: collect/format information from the [http://groups.google.com/group/mozilla.dev.tech.xpcom/browse_frm/thread/ed4c1390b5ea97c8/01d69c339743e04c#01d69c339743e04c newsgroup discussion].

== Tasks ==

* Add the request model threadsafety to MMgc
* Give MMgc the ability to recognize "inner" pointers to objects as typically used by C++ multiple inheritance {{bug|388070}}
* Make the world depend on a common MMgc
* Rewrite XPCOM addref/release handling
** Remove the cycle collector
** Use textual search/replace to remove most calls to NS_ADDREF/NS_RELEASE
** Use automatic finding to identify remaining references for manual cleanup
** [[XPCOMGC/GCObject Inheritance|Make all COM objects inherit from GCObject]]
** Rewrite nsCOMPtr and friends to be lightweight wrappers for GC write barriers.
** [[XPCOMGC/Stack Pointers|Make stack pointers raw pointers]]
** Fix some COM-holding utility classes:
*** nsCOMArray
*** hashtables: nsInterfaceHashtable and nsInterfaceHashKey
** Rewrite XPCOM weakrefs to be GCWeakRefs
*** And remove those that can be regular GC references
* Identify and deal with multi-threading, especially:
** Initialize and suspend requests around blocking activity
** Analyze code for deadlock possibilities.
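The "inner" pointer task in the list above arises because, with C++ multiple inheritance, a pointer to a second base class usually does not point at the start of the object. The sketch below shows the effect and the kind of range check a collector needs. The type names and the <code>pointsInto</code> helper are hypothetical, and the nonzero offset is what common ABIs produce, not something the C++ standard guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Two unrelated polymorphic interfaces and one class implementing both.
struct InterfaceA { virtual ~InterfaceA() = default; int a = 1; };
struct InterfaceB { virtual ~InterfaceB() = default; int b = 2; };
struct Impl : InterfaceA, InterfaceB { int c = 3; };

// Byte offset of the InterfaceB subobject within an Impl.
std::ptrdiff_t offsetOfB(Impl* obj) {
    const char* outer = reinterpret_cast<const char*>(obj);
    const char* inner =
        reinterpret_cast<const char*>(static_cast<InterfaceB*>(obj));
    return inner - outer;
}

// A collector that only recognizes pointers to the start of an allocation
// would miss an InterfaceB*. It has to accept any address inside the block.
bool pointsInto(const void* p, const void* start, std::size_t size) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    const std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(start);
    return addr >= begin && addr - begin < size;
}
```

If the only live reference to an <code>Impl</code> is held as an <code>InterfaceB*</code>, the object stays reachable only if the collector maps that inner address back to the enclosing allocation.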