313
edits
No edit summary |
|||
Line 37: | Line 37: | ||
'''Ignore <code>nsresult</code>.''' Call sites that ignore the <code>nsresult</code> must be wrapped in <code>try { call() } catch {}</code>. If we can prove that the call will never throw an exception, which we can currently do for about 1/3 of this category, we could remove the wrapper. However, this is potentially dangerous for the future: the called method may someday throw or propagate exceptions. Thus, it is better to make the call site exception-safe before removing the wrapper. | '''Ignore <code>nsresult</code>.''' Call sites that ignore the <code>nsresult</code> must be wrapped in <code>try { call() } catch {}</code>. If we can prove that the call will never throw an exception, which we can currently do for about 1/3 of this category, we could remove the wrapper. However, this is potentially dangerous for the future: the called method may someday throw or propagate exceptions. Thus, it is better to make the call site exception-safe before removing the wrapper. | ||
''' | '''Propagate <code>nsresult</code>.''' This is when the method containing the call site simply propagates the error code to its caller without executing any other code (except destructors). The macro <code>NS_ENSURE_SUCCESS</code> does this. This category is very nice: with exceptions, there is no error handling code at all. To remove the existing error handling code, we can simply remove the assignment of the error code out of the call site and remove the return statement. This may create a dead <code>rv</code> variable, which can be deleted as a (conceptual) separate step. | ||
We place call sites that log the error into this category as well: logging will be needed less with exceptions. | We place call sites that log the error into this category as well: logging will be needed less with exceptions. | ||
'''Compensate, then | '''Compensate, then propagate <code>nsresult</code>.''' This is when the method containing the call site propagates the error code, but only after running compensation code. Compensation code restores some invariant that would otherwise be lost. The prototypical example is releasing resources to avoid leaks. | ||
The preferred (by the designers of C++) pattern for this scenario is to create a stack object whose destructor maintains the invariant. This is particularly easy to do if the <code>nsresult</code> compensation code executes for both the failure and success cases: we wrap the compensation code in an object destructor and define an instance just before the call site. | The preferred (by the designers of C++) pattern for this scenario is to create a stack object whose destructor maintains the invariant. This is particularly easy to do if the <code>nsresult</code> compensation code executes for both the failure and success cases: we wrap the compensation code in an object destructor and define an instance just before the call site. | ||
Line 82: | Line 82: | ||
== Automated Refactoring == | == Automated Refactoring == | ||
There are on the order of 50,000 call sites affected by the refactoring to use exceptions, so we want to automate as much as possible. All of the basic patterns discussed above can be detected by static analysis and rewritten automatically, although there are always corner cases that are difficult to automate. We hope the corner cases are a small fraction that can be fixed by hand in reasonable time. | |||
== | === Code Text Quality === | ||
One key issue with automatic rewriting is the quality of the resulting source code text from a human reader's point of view. We don't want to automatically produce ugly exception code--one of our top-level goals is to make the code more readable. | |||
This issue needs work. We'll probably have to discover the requirements through trial and review. | |||
=== Static Analysis Infrastructure === | |||
Refactoring relies on static analysis to (a) detect patterns to be refactored, and (b) verify the safety of the refactoring. Verifying the safety of the refactoring means ensuring that the refactored program will behave identically to the original. Note that the safety verification is typically not sound--sound refactorings are very hard to create. But we hope to create "almost sound" refactorings that will work in practice. Of course, the refactored code must always be reviewed and tested. | |||
Our static analysis infrastructure includes [Pork]-Oink/Elsa, [Dehydra] Classic, and Dehydra GCC. We can already do a decent job on many of the exceptions refactorings using this infrastructure. The only thing we really miss at this point is an abstract interpretation framework based on intraprocedural CFGs. We also lack an alias analysis, but we don't think precise alias analysis is important for this task. | |||
== Refactoring patterns == | |||
Here we discuss automatic refactoring as applied to each pattern described above. | |||
=== Ignore === | |||
For the '''ignore''' pattern, we need an analysis to detect ignored <code>nsresult</code> return values and then a rewriting to wrap the call in a try block with empty catch. Optionally, the rewriting can also produce an annotation (comment or macro) at call sites that always succeed (assuming no JavaScript implementations) to aid in the later process of making the code exception-safe. | |||
The ignore analysis can be formulated like this: A call site return value is ignored if along all forward code paths the value never reaches the condition of a branch statement and never escapes the method by being stored in a field, global variable, outparam, or argument of a called method. | |||
To be accurate, the analysis really should be flow-sensitive (i.e., trace forward paths, and pay attention to statement ordering), because the same <code>rv</code> variable is often used for the results of many calls. | |||
This analysis has been coded as a Dehydra script, and it works fairly well. But there is a limitation because there are too many paths to explore them all in Dehydra, which introduces unsoundness. However, the analysis could be expressed efficiently and without this unsoundness using abstract interpretation. | |||
The rewriting is simple: the call expression is wrapped in a try block with empty catch. We need to think about whether this should be done with a macro, inline on one line, or on more than one line. | |||
=== Propagate === | |||
For the '''propagate''' pattern, we need an analysis to detect call sites that on error immediately return the same error code, and a rewriting to remove the error code storage, testing, and return. | |||
The | The propagate analysis can be formulated like this: A call site return value is propagated directly if along all forward code paths where the value is non-zero the value is returned before any side effects other than logging occur. Side effects include writing to a variable that escapes the method or calling another method. | ||
Dehydra is a good fit for this analysis because in practice, this pattern can be detected or rejected by exploring very short paths forward from the call site: typically either the value is returned or a side effect is produced almost immediately. | |||
One problem with producing this analysis in Dehydra is that Dehydra doesn't represent <code>if</code> conditions precisely. Dehydra is good at detecting things like <code>if (NS_FAILED(rv))</code> (assuming that we have patched FF to make <code>NS_FAILED</code> an inline function, which we have), but doesn't detect things like <code>if (NS_FAILED(rv) || foo)</code>. The simple case is the most common, so we might be able to get away with doing the more complex ones manually, but the complex cases are common enough that they are probably worth automating as well. | |||
Rewriting '''propagate''' sites is potentially tricky, because the analysis property is defined over control-flow paths, but rewriting source code generally operates on ASTs. Hence, we're not prepared to say just yet how this will work in detail, only a general description: The basic idea that all code paths where the value is nonzero should be deleted (because they will never be reached when an exception is thrown) and all code paths where the value is zero should be specialized for a zero value. | |||
=== Compensate and Propagate === | |||
The analysis conditions to detect the '''compensate and propagate''' pattern: (a) the call site is not an ignore or propagate, and (b) every path where <code>rv</code> is nonzero reaches ends by returning <code>rv</code>. | |||
Rewriting: TBD | |||
=== Compensate and Rethrow === | |||
The analysis conditions to detect the '''compensate and rethrow''' pattern: (a) the call site is not an ignore or propagate, and (b) every path where <code>rv</code> is nonzero reaches ends by returning a non-rv error code. | |||
Rewriting: TBD | |||
=== Compensate and Continue === | |||
and | The analysis conditions to detect the '''compensate and rethrow''' pattern: (a) the call site is not an ignore or propagate, and (b) every path where <code>rv</code> is nonzero eventually joins the paths where <code>rv</code> is zero. | ||
Rewriting: TBD | |||
edits