Exceptions: Difference between revisions

6,683 bytes added ,  31 January 2008
no edit summary
No edit summary
 
(14 intermediate revisions by 2 users not shown)
Line 1: Line 1:
This is a discussion and planning document for refactoring Firefox to use C++ exceptions. Briefly, this means:
This is a discussion and planning document for refactoring Mozilla to use C++ exceptions. Exceptions will likely arrive after [[Mozilla 2]], but we hope to improve out-of-memory (OOM) handling for Mozilla 2 and maybe take a few steps closer to exception safety.


* Enabling exceptions in the compiler
Originally, this page suggested the goal would be to enable exceptions while preserving current behavior. But it looks like preserving current behavior exactly would make the code too ugly (e.g., needing try statements with empty catch blocks for ignored error codes), defeating the main purpose of going to exceptions. The new goal is to go more slowly, improving error handling functionality and cleaning up the code so it looks really nice once exceptions can be turned on.


* Replacing nsresult returns and checks with throw and catch statements
The next steps are:


* Changing getter methods to return their results directly (see [[outparamdel]]) instead of through pointer-typed parameters
* Remove all current OOM handling code


The goal is to do all this without changing the behavior of Firefox, thus not introducing bugs. A somewhat more ambitious goal is to then refactor Firefox methods to be exception-safe as well, which will help avoid future bugs and may even eliminate a few existing bugs.
* Create contracts for outparams
 
* Rewrite call sites that ignore nsresults
 
* Rewrite call sites that use NS_SUCCEEDED
 
These steps are explained in more detail below. And by the way, we'd love to have community help with any of these. Look for "Coding Task" below to find small but useful bites of work that you might be interested in starting on.
 
= Improved OOM Handling =
 
== Background ==
 
As of Mozilla 1.9, the general strategy for handling OOMs is this:
 
* Functions that allocate memory check for a null pointer (returned from new or malloc) and return NS_ERROR_OUT_OF_MEMORY if allocation failed.
 
* Other functions check for NS_FAILED return codes or NS_ERROR_OUT_OF_MEMORY errors and try to respond accordingly.
 
This doesn't really work: in practice, if an allocation fails, Firefox crashes within a second. The strategy has many problems, but two matter most for exceptions: (1) no call site should ever ignore an OOM, which is not practical to guarantee with an NS_ERROR_OUT_OF_MEMORY nsresult, and (2) most of the time the right response is to invoke a centralized routine that tries to free memory, which is better done with exceptions or an allocation failure handler than by sprinkling OOM checks everywhere.
 
So the new OOM strategy will go something like this:
 
* Allocation operations (new, malloc) will never return null pointers.
 
* No method will return NS_ERROR_OUT_OF_MEMORY.
 
* No code will check for NS_ERROR_OUT_OF_MEMORY.
 
* General OOM conditions will go to an OOM handler that immediately crashes. (Later, the handler may be upgraded to try to free memory--this is a separate issue.)
 
* A few allocations that "predictably" cause OOM will be able to check for and respond to OOMs using a special mechanism (e.g., calling a special allocator). The prime example of this kind of allocation is buffers for downloaded images: a huge image could easily cause OOM, and this can be relatively easily handled by the local image code (by not displaying the image).
 
This strategy will be more robust, more easily improved in the future, and it will reduce code size by removing all the OOM checks.
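As a rough illustration of the strategy above, here is a minimal sketch of the two kinds of allocation. All names (<code>HandleOutOfMemory</code>, <code>InfallibleMalloc</code>, <code>FallibleMalloc</code>, <code>TryAllocateImageBuffer</code>) are hypothetical and do not exist in the tree; the real mechanism is still to be designed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>

// Hypothetical central OOM handler. For now it crashes immediately;
// later it could try to free memory before giving up.
[[noreturn]] inline void HandleOutOfMemory() {
  std::abort();
}

// Hypothetical general-purpose allocator: never returns a null pointer,
// so callers need no OOM check.
inline void* InfallibleMalloc(std::size_t size) {
  // malloc(0) is allowed to return null, so always request at least 1 byte.
  void* p = std::malloc(size ? size : 1);
  if (!p) HandleOutOfMemory();
  return p;
}

// Hypothetical special allocator for the few "predictable" OOM sites:
// it may return null, and the caller is expected to check.
inline void* FallibleMalloc(std::size_t size) {
  return std::malloc(size);
}

// Example of local handling: an image buffer that may be too large.
// Returns false (and leaves *outBuffer untouched) if the buffer cannot
// be allocated, so the caller can simply skip displaying the image.
inline bool TryAllocateImageBuffer(std::size_t width, std::size_t height,
                                   void** outBuffer) {
  assert(outBuffer != nullptr);
  const std::size_t kBytesPerPixel = 4;
  if (width == 0 || height == 0) return false;
  // Reject sizes whose byte count would overflow size_t.
  if (width > std::numeric_limits<std::size_t>::max() / kBytesPerPixel / height)
    return false;
  void* buf = FallibleMalloc(width * height * kBytesPerPixel);
  if (!buf) return false;
  *outBuffer = buf;
  return true;
}
```

Ordinary code would call only the infallible allocator and contain no OOM checks; only the image code would see a failure.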
 
== Rewriting OOMs ==
 
There are two rewriting tasks for OOMs:
 
First, we want to '''remove all allocation null checks'''. For example,
 
    Foo *p = new Foo();
    if (p == NULL) return NS_ERROR_OUT_OF_MEMORY;
    use(p);
 
will be replaced by
 
    Foo *p = new Foo();
    use(p);
 
Second, we want to '''remove all explicit OOM checks'''. For example, in
 
    nsresult rv = p->foo();
    if (rv == NS_ERROR_OUT_OF_MEMORY) { ... }
 
the if statement would be deleted.
 
For these tasks, we need a full automatic rewriting system, with a pattern detection component and a patch generation component. The example patterns given here can be detected by looking at ASTs, so [[Dehydra GCC]] would be a great tool to use for detecting them. Once we have the list of patterns (say, as line numbers where the if statements start), we can work on an Elsa-based patch generator.
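To make the "list of patterns as line numbers" idea concrete, here is a toy stand-in for the detection step. It is purely textual (a regular expression over source lines), not the AST-based Dehydra GCC analysis we actually want, and the function name <code>FindOomCheckLines</code> is made up for this sketch. It would miss checks split across lines and could flag unrelated code, which is exactly why an AST tool is preferable.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Returns the 1-based line numbers of lines that look like an OOM check,
// i.e. an "if (" followed somewhere on the same line by the OOM error code.
// Both spellings of the constant are accepted.
std::vector<int> FindOomCheckLines(const std::string& source) {
  static const std::regex kPattern(R"(\bif\s*\(.*NS_(ERROR_)?OUT_OF_MEMORY)");
  std::vector<int> hits;
  std::istringstream in(source);
  std::string line;
  int lineNumber = 0;
  while (std::getline(in, line)) {
    ++lineNumber;
    if (std::regex_search(line, kPattern)) {
      hits.push_back(lineNumber);
    }
  }
  return hits;
}
```

The resulting line numbers are the kind of input a patch generator could consume.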
 
Coding Task 1: Dehydra GCC script to detect null pointer checks for allocated memory.
 
Coding Task 2: Dehydra GCC script to detect explicit tests for NS_ERROR_OUT_OF_MEMORY return value.
 
= Contracts for Outparams =
 
== Background ==
 
Some methods ("getters") 'return' a value other than an nsresult to the caller. Generally, the value is returned through a pointer-typed parameter, a.k.a. an outparam. As of Mozilla 1.9, it's not always clear whether the returned value can be null when there is no error, or whether the value can be changed when there is an error, which makes things confusing for callers.
 
This situation needs to be cleared up for exceptions. With exceptions, we want these methods to return their value directly, which will have performance and developer sanity benefits. In that case, when there is an exception, no value is returned at all to the calling context. Thus, for all methods, '''when the error code is nonzero, outparams should not be modified'''.
 
Some methods have the property that when the return code is NS_OK, the outparam is non-null (equivalently, when the outparam is null, the return code is nonzero). We want to identify these methods, because for them, a check for null is a valid way of checking for an error, and should be treated the same as an nsresult check for the other refactorings discussed here.
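A small sketch of a getter that follows both contracts may help. The types and constants here (<code>Status</code>, <code>kOk</code>, <code>kFailure</code>, <code>Widget</code>, <code>WidgetOwner</code>) are stand-ins invented for the example, not real Mozilla declarations.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint32_t Status;        // stand-in for nsresult
const Status kOk = 0;                // stand-in for NS_OK
const Status kFailure = 0x80004005;  // stand-in for a generic failure code

struct Widget { int id; };

class WidgetOwner {
 public:
  explicit WidgetOwner(Widget* w) : mWidget(w) {}

  // Contract 1: on failure, *aResult is not modified.
  // Contract 2: on success, *aResult is non-null.
  Status GetWidget(Widget** aResult) const {
    assert(aResult != nullptr);
    if (!mWidget) return kFailure;  // fail before touching the outparam
    *aResult = mWidget;
    return kOk;
  }

 private:
  Widget* mWidget;
};
```

Because failure is decided before the outparam is written, a caller's previous value survives an error, which matches what happens when an exception prevents any value from being returned.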
 
== Code Changes ==
 
Coding Task 3: Create an analysis to check for methods that modify outparams when they return a nonzero nsresult. Eventually this analysis should be a gcc plugin so it can be integrated with the build process.
 
Coding Task 4: Create an analysis to find methods that always return non-null outparams when the nsresult is NS_OK. Eventually, we want to annotate these methods with gcc attributes and enforce the annotation with a gcc plugin.
 
= Fixing Ignored nsresults =
 
== Background ==
 
There are a fair number of call sites that ignore nsresult return values. This can be for several reasons, including:
 
* The caller checks failure using some other condition (e.g., a null return value)
 
* The function being called always returns NS_OK.
 
* At this call site, the caller has ensured that the function will succeed.
 
* The caller doesn't need to respond to errors.
 
These calls need checking before we can enable exceptions. In general, it won't be possible to ensure that a function doesn't throw an exception, especially if we use exceptions for OOM. Thus, call sites that now ignore nsresults need to be looked at and made exception safe.
 
== Finding Ignored nsresults ==
 
The key need here is a tool to automatically find call sites that ignore return values. There is a script under development (by dmandelin) that does this, but it needs to be improved to handle all the special cases, such as checking for a null return value.
 
Coding Task 5: Finish the ignored nsresult analysis.
 
Once the list is in place, the calls will need manual attention.
 
= Removing NS_SUCCEEDED =
 
== Background ==
 
Many call sites check nsresults with NS_SUCCEEDED. For example:
 
    nsresult rv = p->PrepareToUse();
    if (NS_SUCCEEDED(rv)) {
      p->Use();
    }
    return rv;
 
This doesn't make sense with exceptions, because with exceptions enabled, if PrepareToUse returns, then it succeeded.
 
== Investigating NS_SUCCEEDED ==
 
We want to rewrite call sites that use NS_SUCCEEDED to look more like something that will work with exceptions. But it's not clear yet exactly what that looks like. The first step is to analyze some existing uses of NS_SUCCEEDED to get an idea of what patterns exist and how to rewrite them. With exceptions, the example above can look like:
 
    p->PrepareToUse();
    p->Use();
 
Before we have exceptions, we will probably want something like:
 
    nsresult rv = p->PrepareToUse();
    if (NS_FAILED(rv)) return rv;
    p->Use();
    return NS_OK;
 
This would be fairly easy to rewrite to the exceptions version, because the NS_FAILED check is easily identified as equivalent to letting the exception propagate to the caller.
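To check that the two forms really behave the same way, here is a self-contained sketch with a stub object. Everything in it (<code>Status</code>, <code>Failed</code>, <code>Resource</code>, the <code>...OrThrow</code> method) is a hypothetical stand-in; in particular <code>Failed</code> is a simplified substitute for NS_FAILED.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

typedef std::uint32_t Status;        // stand-in for nsresult
const Status kOk = 0;                // stand-in for NS_OK
const Status kFailure = 0x80004005;  // stand-in for a generic failure code
inline bool Failed(Status rv) { return rv != kOk; }  // simplified NS_FAILED

// Stub object that can be told to fail and records whether it was used.
struct Resource {
  bool prepareOk;
  bool used;

  // Error-code flavour.
  Status PrepareToUse() { return prepareOk ? kOk : kFailure; }
  Status Use() { used = true; return kOk; }

  // Exception flavour.
  void PrepareToUseOrThrow() {
    if (!prepareOk) throw std::runtime_error("prepare failed");
  }
  void UseNoFail() { used = true; }
};

// Pre-exceptions form: early return on failure.
Status UseWithErrorCodes(Resource* p) {
  assert(p != nullptr);
  Status rv = p->PrepareToUse();
  if (Failed(rv)) return rv;
  p->Use();
  return kOk;
}

// Exceptions form: a failure in PrepareToUseOrThrow propagates to the
// caller, so Use is reached only on success.
void UseWithExceptions(Resource* p) {
  assert(p != nullptr);
  p->PrepareToUseOrThrow();
  p->UseNoFail();
}
```

In both versions the object is used only when preparation succeeded, and the failure reaches the caller either as a return code or as an exception.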
 
 
= Far Future Stuff =
 
This was the original plan for implementing exceptions, now categorized as far future. We hope these steps will be easier once the medium-term work documented above is done.


== Benefits of Exceptions ==
Line 65: Line 200:
=== Special Case Error: XPCOM Infrastructure ===


<code>QueryInterface</code> may be redesigned so that it never throws an exception. It will simply return a null pointer on failure, and callers must check for it if there is a possibility of failure. (Many <code>QueryInterface</code> calls are guaranteed to succeed.)
<code>QueryInterface</code> has been redesigned so that it never throws an exception. It returns a null pointer on failure, and callers must check for it if there is a possibility of failure. (Many <code>QueryInterface</code> calls are guaranteed to succeed.) (Note: the redesign hasn't been checked in to trunk yet.) [https://bugzilla.mozilla.org/show_bug.cgi?id=391275 See also.]


<code>GetService</code> may be handled the same way. (TBD)


<code>CreateInstance</code> might also generate OOMs only, although other failure modes are possible. (TBD)


=== XPConnect ===


XPConnect needs to be rewritten so that when JavaScript calls C++, the FFI catches C++ exceptions and generates JavaScript exceptions if necessary. Also, when C++ calls JavaScript, C++ exceptions should be generated in response to JavaScript exceptions. This is potentially a lot of work, because there are separate XPConnect implementations for every platform and compiler.
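The C++ side of such a boundary might look roughly like the sketch below: a wrapper that runs the C++ call and turns any exception into a status value that the JavaScript side could then convert into a JavaScript exception. The helper <code>CallAcrossBoundary</code> and the status constants are invented for illustration and say nothing about how XPConnect would really do it.

```cpp
#include <cstdint>
#include <new>
#include <stdexcept>

typedef std::uint32_t Status;            // stand-in for a cross-language status
const Status kOk = 0;
const Status kFailure = 0x80004005;      // hypothetical generic failure
const Status kOutOfMemory = 0x8007000e;  // hypothetical OOM status

// Hypothetical boundary helper: no C++ exception may escape into the
// foreign caller, so every exception is caught and mapped to a status.
template <typename Fn>
Status CallAcrossBoundary(Fn fn) {
  try {
    fn();
    return kOk;
  } catch (const std::bad_alloc&) {
    return kOutOfMemory;
  } catch (...) {
    return kFailure;
  }
}
```

The opposite direction (C++ calling JavaScript) would do the reverse: inspect the result of the JavaScript call and throw a C++ exception if a JavaScript exception is pending.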
=== outparamdel ===
With exceptions in place, <code>nsresult</code> methods become <code>void</code>-returning methods. Thus, methods with an "outparam" (more formally, a pointer-typed argument designated for passing a result out of the called method) will be able to place their result directly in the return value. This will make life easier for programmers and may save a few processor cycles.
We already have a tool called [[outparamdel]] for doing this refactoring. It hasn't been used yet because it can't really be done before exceptions. (Many getter methods always succeed, so we could technically apply [[outparamdel]] without exceptions, but it's risky because it assumes (a) the method will never be changed so that it can fail, and (b) the method isn't implemented in JavaScript. Neither assumption is safe.)
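For readers unfamiliar with the refactoring, the before and after shapes of a getter look roughly like this. The class and its methods are made up for the example and are not output of the actual [[outparamdel]] tool.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

typedef std::uint32_t Status;        // stand-in for nsresult
const Status kOk = 0;                // stand-in for NS_OK
const Status kFailure = 0x80004005;  // stand-in for a generic failure code

class Document {
 public:
  explicit Document(const std::string& title) : mTitle(title) {}

  // Before: the status is the return value, the result goes through an outparam.
  Status GetTitle(std::string* aTitle) const {
    assert(aTitle != nullptr);
    if (mTitle.empty()) return kFailure;
    *aTitle = mTitle;
    return kOk;
  }

  // After: the result is the return value, failure is reported by throwing.
  std::string Title() const {
    if (mTitle.empty()) throw std::runtime_error("document has no title");
    return mTitle;
  }

 private:
  std::string mTitle;
};
```

A call site shrinks from declaring a temporary, passing its address, and checking the status to a single expression such as <code>doc.Title()</code>.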


== Automated Refactoring ==