Latest revision as of 21:59, 27 April 2021
JS GC Futures
Tracking bug is bug 505308
Brain-dump of work items:
- Speed up allocator
  - Remove reserved objects and doubles stuff in the tracer (bug 508140)
  - Use one single GC heap chunk, avoiding frequent mmap and malloc calls (bug 508707)
- Speed up collector
  - Allocate short-enough strings from GC heap, not malloc heap (bug 402614)
  - Based on Gregor's stats (bug 502736), consider making 32-byte and 64-byte JSObjects to cover most cases except large objects without any dslots (bug 508357)
  - Schedule GC based on memory pressure (bug 506125)
- Conservative stack scanning to avoid temp-value rooting overheads? (bug 516832)
- Go to beach
Compartments and per-compartment GC
Motivation
- More robust thread safety API
- Reduce GC pauses
- Minor performance win from eliminating object locking
Threading. In short, SM's support for sharing objects among threads has not gotten the maintenance attention it needs. It's accumulating bugs. At the same time, it is now widely acknowledged that shared-everything threads are a problematic programming model. Better ideas (like HTML5 Workers) are emerging. We need new JSAPI support for shared-nothing threads.
GC. At the same time our GC performance is not where we want it. There are still direct optimization opportunities. Beyond that, we want to be able to perform GC on a single browser tab, for shorter pause times when only one tab is active.
What these two things have in common is that they both involve dividing the JSRuntime's object graph into separate, mostly-isolated compartments, such that every object is in a compartment, and an ordinary object cannot have a direct reference to an object in a different compartment. (We will support special wrapper objects that provide transparent access to objects in other compartments.)
The changes we need to make for single-tab GC will simultaneously make it easier to add assertions enforcing the new rules about sharing objects across threads. Conveniently, in the browser, wrapper objects already exist in almost all the right places--it's a critical security boundary in the browser, and the wrappers impose various security policies.
Plans
These plans are limited to bugs that are on the critical path to per-tab GC, which we hope will dramatically reduce pause time.
Name | Size (weeks) | Assigned to (guess) |
---|---|---|
Benchmarks | 1 | gwagner |
Benchmark automation | 0.5 | jorendorff |
Compartments and wrappers API | 1 | jorendorff |
Compartmentalize Gecko | 3 | jorendorff |
GCSubheaps | 2-3 | gwagner |
MT wrappers | 3 | gal |
Lock-free allocation and slot access | 1 | gal |
Compartmental GC | done by the end of June | gwagner |
Benchmarks — Gregor is building a GC benchmark suite. We need it checked in. (bug 548388)
Benchmark automation — We need to be able to turn a crank and get GC performance numbers. Talos needs to run this automatically. This means we need to be able to measure GC performance in opt builds. Total time spent in GC and max pause time are cheap enough to collect. (bug 561486)
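The two numbers named above, total time spent in GC and maximum pause time, need very little bookkeeping. The following is a minimal sketch, not engine code: `GCStats` and `AutoGCTimer` are hypothetical names introduced for illustration, and the clock is plain `std::chrono` rather than anything SpiderMonkey provides.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical accumulator; not an existing SpiderMonkey class.
struct GCStats {
    uint64_t totalNs = 0;      // total time spent in GC
    uint64_t maxPauseNs = 0;   // longest single collection
    uint64_t collections = 0;  // number of collections recorded

    void recordPause(uint64_t ns) {
        totalNs += ns;
        maxPauseNs = std::max(maxPauseNs, ns);
        ++collections;
    }
};

// Hypothetical RAII timer: construct it when a collection starts,
// let it go out of scope when the collection ends.
class AutoGCTimer {
    using Clock = std::chrono::steady_clock;
    GCStats &stats_;
    Clock::time_point start_;

  public:
    explicit AutoGCTimer(GCStats &stats) : stats_(stats), start_(Clock::now()) {}
    ~AutoGCTimer() {
        auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
            Clock::now() - start_);
        assert(elapsed.count() >= 0);  // steady_clock never runs backwards
        stats_.recordPause(static_cast<uint64_t>(elapsed.count()));
    }
    AutoGCTimer(const AutoGCTimer &) = delete;
    AutoGCTimer &operator=(const AutoGCTimer &) = delete;
};
```

Both counters are updated once per collection, so the cost should be small enough to leave enabled in opt builds.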
Compartments and wrappers API — Add API for creating a global object and associating it with a compartment. Add minimal API for a special kind of object that is allowed to hold a strong reference across compartments (a wrapper object). Add assertions within the engine that there are no direct references across compartments. Add assertions at API boundaries that all the gc-things provided as arguments come from the same compartment. (bug 563099)
Compartmentalize Gecko — Use the compartment API to divide up Gecko so that objects with different principals are always in different compartments. Use the wrapper API in XPConnect for our security wrappers. Fix what breaks. In particular, wrappers and the structured clone algorithm will need to copy strings and doubles instead of passing them freely from one compartment to another. (bug 563106)
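The copy-versus-wrap rule for values crossing a compartment boundary can be sketched in a toy model. Everything here is an assumption made for illustration: `Value`, `Compartment`, `cross`, and the per-compartment wrapper cache are hypothetical and are not XPConnect or JSAPI types. Doubles are omitted; they would be copied the same way strings are.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Compartment;

struct Value {
    enum Kind { String, Object, Wrapper };
    Kind kind;
    Compartment *home;   // compartment that owns this value
    std::string chars;   // payload when kind == String
    Value *target;       // wrappee when kind == Wrapper
    Value(Kind k, Compartment *h, std::string c, Value *t)
      : kind(k), home(h), chars(std::move(c)), target(t) {}
};

struct Compartment {
    std::vector<std::unique_ptr<Value>> heap;
    std::map<Value *, Value *> wrapperCache;  // wrappee -> wrapper (assumed)

    Value *alloc(Value::Kind kind, std::string chars = "", Value *target = nullptr) {
        heap.push_back(std::make_unique<Value>(kind, this, std::move(chars), target));
        return heap.back().get();
    }
};

// Bring `v` into `dest`: strings are copied, objects are wrapped,
// and wrappers are unwrapped first so wrappers never stack.
Value *cross(Compartment &dest, Value *v) {
    assert(v != nullptr);
    if (v->home == &dest) return v;  // already local
    if (v->kind == Value::String) return dest.alloc(Value::String, v->chars);
    if (v->kind == Value::Wrapper) return cross(dest, v->target);
    auto it = dest.wrapperCache.find(v);
    if (it != dest.wrapperCache.end()) return it->second;  // reuse wrapper
    Value *w = dest.alloc(Value::Wrapper, "", v);
    dest.wrapperCache[v] = w;
    return w;
}
```

The cache keeps wrapper identity stable, so crossing the same object twice yields the same wrapper. That is a design choice of this sketch, not something the plan above specifies.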
GCSubheaps — Factor GC-related code into a class, js::GCHeap (bug 556324). Carve out a second class, GCSubheap, so that a single GCHeap can have several GCSubheaps, each of which handles its own set of VM pages from which individual GC things may be allocated. Give each compartment its own GCSubheap. Allocate every object, double, and string from the GCSubheap for the compartment where it will live.
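One possible shape for the two classes, as a rough sketch only: the names `GCHeap` and `GCSubheap` come from the plan above, but the page size, the bump allocation, and the linear `contains` lookup are all assumptions chosen to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

constexpr size_t kPageSize = 4096;  // assumed page size for this sketch

// One subheap: owns its own pages and bump-allocates from the newest one.
class GCSubheap {
    struct Page { alignas(16) unsigned char bytes[kPageSize]; };
    std::vector<std::unique_ptr<Page>> pages_;
    size_t used_ = kPageSize;  // forces a fresh page on the first allocation

  public:
    void *allocate(size_t nbytes) {
        nbytes = (nbytes + 15) & ~size_t(15);  // keep cells 16-byte aligned
        assert(nbytes > 0 && nbytes <= kPageSize);
        if (used_ + nbytes > kPageSize) {
            pages_.push_back(std::make_unique<Page>());
            used_ = 0;
        }
        void *cell = pages_.back()->bytes + used_;
        used_ += nbytes;
        return cell;
    }

    // True if p points into a page owned by this subheap.
    bool contains(const void *p) const {
        auto addr = reinterpret_cast<uintptr_t>(p);
        for (const auto &page : pages_) {
            auto base = reinterpret_cast<uintptr_t>(page->bytes);
            if (addr >= base && addr < base + kPageSize) return true;
        }
        return false;
    }

    size_t pageCount() const { return pages_.size(); }
};

// The heap is a collection of subheaps, one per compartment.
class GCHeap {
    std::vector<std::unique_ptr<GCSubheap>> subheaps_;

  public:
    GCSubheap *newSubheap() {
        subheaps_.push_back(std::make_unique<GCSubheap>());
        return subheaps_.back().get();
    }

    // Which subheap (and hence which compartment) owns this gc-thing?
    GCSubheap *subheapOf(const void *p) const {
        for (const auto &s : subheaps_)
            if (s->contains(p)) return s.get();
        return nullptr;
    }
};
```

Because no page is shared between subheaps, the owning compartment of any gc-thing can be recovered from its address alone. A real implementation would want a faster lookup than the linear scan used here.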
MT wrappers — Implement an automatically-spreading membrane of proxy objects that allow one thread to access objects from another compartment that is running in another thread. There is some risk here because it's unclear how this should work regarding objects on the scope chain (global objects, Call and Block objects). See bug 558866 comments 1-4. See also bug 566951, which has a prototype patch that addresses the scope chain issue.
Lock-free allocation and slot access — Remove locking from allocation paths. Remove scope locking everywhere. (bug 558866)
Compartmental GC — Support collecting garbage in one GCSubheap without walking the rest of the graph (bug 558861) and without stopping other threads.
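A single-compartment mark phase might look like the sketch below. It relies on an assumption the plan does not spell out: each compartment keeps a list of its cells that are held by wrappers in other compartments (`incomingWrapped` here), and those are treated as extra roots. All names are hypothetical, and the "without stopping other threads" part is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Compartment;

struct Cell {
    Compartment *compartment;
    std::vector<Cell *> edges;  // same-compartment references only
    bool marked = false;
    explicit Cell(Compartment *c) : compartment(c) {}
};

struct Compartment {
    std::vector<Cell *> roots;            // ordinary roots
    std::vector<Cell *> incomingWrapped;  // held by wrappers elsewhere (assumed)
    std::vector<Cell *> cells;            // every cell allocated here
};

// Mark everything reachable inside `c` without visiting any other
// compartment; returns the number of cells left unmarked (garbage).
size_t markCompartment(Compartment &c) {
    for (Cell *cell : c.cells) cell->marked = false;

    std::vector<Cell *> stack(c.roots);
    stack.insert(stack.end(), c.incomingWrapped.begin(), c.incomingWrapped.end());

    while (!stack.empty()) {
        Cell *cell = stack.back();
        stack.pop_back();
        assert(cell->compartment == &c);  // no direct cross-compartment edges
        if (cell->marked) continue;
        cell->marked = true;
        for (Cell *next : cell->edges) stack.push_back(next);
    }

    size_t dead = 0;
    for (Cell *cell : c.cells)
        if (!cell->marked) ++dead;
    return dead;
}
```

The assertion inside the loop is the payoff of the no-direct-references rule: if it holds, the traversal can never wander into another compartment's cells.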
Compartments and wrappers - API
Each runtime has a default compartment which contains interned strings, the empty string, +Inf, -Inf, NaN—and, in non-compartment-aware embeddings, everything else.
Each context has a current compartment, initially the default compartment, and normally the compartment of JS_GetScopeChain(cx). So, for example, js_Atomize(cx, name, strlen(name), 0) allocates the new string from cx->currentCompartment().
A tricky consequence of this is that in an API call JS_SetProperty(cx, obj, name, vp), obj must be in cx->currentCompartment(), because JS_SetProperty calls js_Atomize to create the property id. Otherwise obj could end up with property ids that reside in cx->currentCompartment() rather than its own compartment: a violation of the rules that will lead to a crash in GC.
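The hazard described above can be reproduced in a toy model. This is a sketch under stated assumptions: `atomize` and `setProperty` are simplified stand-ins for js_Atomize and JS_SetProperty, and the `Compartment`, `Context`, and `Object` structs are hypothetical, not the engine's types.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct Compartment {
    // Atoms owned by this compartment, keyed by their characters.
    std::unordered_map<std::string, std::unique_ptr<std::string>> atoms;
};

struct Context {
    Compartment *current;
    Compartment *currentCompartment() const { return current; }
};

struct Object {
    Compartment *compartment;
    std::vector<const std::string *> propertyIds;
};

// Stand-in for js_Atomize: the atom always comes from the context's
// current compartment, never from the object's.
const std::string *atomize(Context &cx, const std::string &name) {
    assert(cx.currentCompartment() != nullptr);
    auto &slot = cx.currentCompartment()->atoms[name];
    if (!slot) slot = std::make_unique<std::string>(name);
    return slot.get();
}

// The precondition the text describes: obj must live in the current compartment.
bool canSetProperty(const Context &cx, const Object &obj) {
    return obj.compartment == cx.currentCompartment();
}

// Stand-in for JS_SetProperty. It refuses the mismatched case so the
// example can be tested; the engine would assert instead.
bool setProperty(Context &cx, Object &obj, const std::string &name) {
    if (!canSetProperty(cx, obj)) return false;
    obj.propertyIds.push_back(atomize(cx, name));
    return true;
}
```

Without the check, `setProperty` on an object from another compartment would store a pointer into the wrong compartment's atom table, which is exactly the cross-compartment reference the rules forbid.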
When does a context's current compartment need to change? Only when setting up a new compartment and when calling across compartment boundaries. The latter always happens in wrapper code. So this will be rare and we can require an API call and fall off trace when it happens.
    JSCompartment *
    JS_GetDefaultCompartment(JSRuntime *rt);

    JSCompartment *
    JS_NewCompartment(JSRuntime *rt, JSPrincipals *principals);

    JSCompartment *
    JS_GetCurrentCompartment(JSContext *cx);

    void
    JS_SetCurrentCompartment(JSContext *cx, JSCompartment *compartment);
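Wrapper code that calls across a boundary could pair the get and set calls with a scope guard, so the previous compartment is always restored. The sketch below stubs out `JSContext`, `JSCompartment`, and the two functions so that it compiles on its own; the stub bodies, `AutoEnterCompartment`, and `callAcrossBoundary` are assumptions for illustration, not proposed API.

```cpp
#include <cassert>

// Minimal stand-ins so the sketch compiles by itself; the real types are opaque.
struct JSCompartment {};
struct JSContext { JSCompartment *compartment; };

JSCompartment *JS_GetCurrentCompartment(JSContext *cx) { return cx->compartment; }
void JS_SetCurrentCompartment(JSContext *cx, JSCompartment *c) { cx->compartment = c; }

// Hypothetical guard: enter `target` for the duration of a scope, then restore.
class AutoEnterCompartment {
    JSContext *cx_;
    JSCompartment *saved_;

  public:
    AutoEnterCompartment(JSContext *cx, JSCompartment *target)
      : cx_(cx), saved_(JS_GetCurrentCompartment(cx)) {
        JS_SetCurrentCompartment(cx_, target);
    }
    ~AutoEnterCompartment() { JS_SetCurrentCompartment(cx_, saved_); }
    AutoEnterCompartment(const AutoEnterCompartment &) = delete;
    AutoEnterCompartment &operator=(const AutoEnterCompartment &) = delete;
};

// What a wrapper's call hook might do: run `fn` inside the wrappee's compartment.
template <typename Fn>
auto callAcrossBoundary(JSContext *cx, JSCompartment *wrappeeCompartment, Fn fn)
    -> decltype(fn()) {
    AutoEnterCompartment enter(cx, wrappeeCompartment);
    assert(JS_GetCurrentCompartment(cx) == wrappeeCompartment);
    return fn();
}
```

The guard restores the saved compartment on every exit path, including early returns from `fn`.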
All existing APIs could just assert that cx->currentCompartment() agrees with all the arguments that happen to be gc-things. The precise rule is: each gc-thing passed in must either be in cx->currentCompartment() or be a string or double in cx->runtime->defaultCompartment.
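Written out as a predicate, the rule is short. The types below are hypothetical stand-ins for the engine's; only the logic of the check is taken from the rule as stated.

```cpp
#include <cassert>

struct Compartment {};

enum class Kind { Object, String, Double };

struct GCThing {
    Kind kind;
    const Compartment *compartment;
};

struct Runtime {
    Compartment defaultCompartment;
};

struct Context {
    Runtime *runtime;
    const Compartment *current;
    const Compartment *currentCompartment() const { return current; }
};

// True if `thing` may be passed to an API call made on `cx`:
// either it lives in the current compartment, or it is a string or
// double that lives in the runtime's default compartment.
bool isAcceptableArgument(const Context &cx, const GCThing &thing) {
    assert(cx.runtime != nullptr && cx.current != nullptr);
    if (thing.compartment == cx.currentCompartment())
        return true;
    bool shareable = thing.kind == Kind::String || thing.kind == Kind::Double;
    return shareable && thing.compartment == &cx.runtime->defaultCompartment;
}
```

An object from the default compartment is rejected when the current compartment is a different one; only strings and doubles get the exemption.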
GC roots will be per-compartment. This means JS_AddGCRootRT will add a root to the default compartment. This behavior is a bit unexpected. To help embeddings get this right, GC should assert that each gc thing pointed to by a root is in the expected compartment.
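The root check suggested above might be sketched like this. `addRootRT`, `addRoot`, and `misplacedRoots` are hypothetical helpers in a toy model, not JSAPI functions; the model only captures that a runtime-level call has no context to consult and so files the root under the default compartment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Compartment;

struct GCThing {
    Compartment *compartment;
};

struct Compartment {
    std::vector<GCThing **> roots;  // addresses of rooted locations
};

struct Runtime {
    Compartment defaultCompartment;
};

// Stand-in for a runtime-level root call: the root can only be
// filed under the default compartment.
void addRootRT(Runtime &rt, GCThing **location) {
    assert(location != nullptr);
    rt.defaultCompartment.roots.push_back(location);
}

// Stand-in for a context-level root call: filed under the given compartment.
void addRoot(Compartment &c, GCThing **location) {
    assert(location != nullptr);
    c.roots.push_back(location);
}

// The check GC could run: count roots whose current target lives in a
// different compartment from the one the root was registered with.
size_t misplacedRoots(const Compartment &c) {
    size_t bad = 0;
    for (GCThing **location : c.roots) {
        GCThing *thing = *location;
        if (thing && thing->compartment != &c) ++bad;
    }
    return bad;
}
```

The check reads the rooted location at GC time, not at registration time, so it also catches a root that was valid when added and later reassigned to a thing from another compartment.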
API functions related to the GC heap. Several API functions do something that involves the GC heap: JS_GC and friends; JS_SetGCCallback; JS_SetGCParameter; JS_TraceChildren and friends; JS_DumpHeap; JS_SetGCZeal. These will need to have a mode for collecting/walking the entire heap and a separate mode where they apply to just one compartment. TBD.
Wrapper API. Since the cross-compartment reference from a wrapper to the wrappee is so special, we will need API for it. TBD.
Emerging Invariants
This section describes invariants and rules which have emerged during initial development of the conservative GC and the compartments code. They are not likely to change, but still may.
- The C stack is not scanned for GC roots when there are no contexts (suspended or otherwise) in requests on a given thread
- When doing a single-compartment GC, only the current thread's stack is scanned (unless there are no contexts in requests on that thread)
- A context's compartment is equal to JS_GetScopeChain(cx)->getCompartment(). A NULL scope chain indicates the default compartment.
- Corollary: All non-default compartments have at least a global object.
- Only one thread per compartment may be in a request at any given time