(Stack overflow improvements) |
(Marked the "Supported inlined functions in crash stacks" project as completed) |
||
(6 intermediate revisions by the same user not shown) | |||
Line 90: | Line 90: | ||
=== Overview === | === Overview === | ||
Status: | Status: completed<br> | ||
Developer(s):<br> | Developer(s): cmartin<br> | ||
Source code: https://github.com/rust-minidump/rust-minidump<br> | Source code: https://github.com/rust-minidump/rust-minidump<br> | ||
Original source code: N/A<br> | Original source code: N/A<br> | ||
Line 108: | Line 108: | ||
disassemble the crashing instruction and be able to inspect it: | disassemble the crashing instruction and be able to inspect it: | ||
* For non-canonical addresses we could reconstruct the real crashing address from the registers and immediate values in the instructions | * For non-canonical addresses we could reconstruct the real crashing address from the registers and immediate values in the instructions | ||
* For misaligned vector accesses we could reconstruct the real crashing address from the registers and immediate values in the instructions | |||
* For invalid instructions we could tell if the instruction is valid and non-supported or downright invalid (in the case of a bit-flip or corrupted executable for example) | * For invalid instructions we could tell if the instruction is valid and non-supported or downright invalid (in the case of a bit-flip or corrupted executable for example) | ||
* For privileged or unsupported instructions we'd be able to tell if it's our fault or if the machine configuration is not adequate | * For privileged or unsupported instructions we'd be able to tell if it's our fault or if the machine configuration is not adequate | ||
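As an illustration of the first bullet, here is a minimal sketch of how the address could be rebuilt once the crashing instruction has been disassembled. It assumes x86-64 with 48-bit virtual addresses and a memory operand of the form base + index * scale + displacement; the <code>MemOperand</code> type and both function names are hypothetical and are not part of rust-minidump.

```rust
/// Hypothetical decoded memory operand, with the register values already
/// read from the crashing thread's context.
pub struct MemOperand {
    pub base: u64,
    pub index: u64,
    pub scale: u8,
    pub disp: i64,
}

/// Returns true if `addr` is canonical under 48-bit virtual addressing,
/// i.e. bits 47..=63 are either all zero or all one.
/// Assumption: 4-level paging; 5-level paging would use 57 bits instead.
pub fn is_canonical_48(addr: u64) -> bool {
    let upper = addr >> 47; // the 17 topmost bits
    upper == 0 || upper == 0x1_FFFF
}

/// Computes base + index * scale + disp with wrapping arithmetic,
/// mirroring how the CPU forms the effective address.
pub fn effective_address(op: &MemOperand) -> u64 {
    op.base
        .wrapping_add(op.index.wrapping_mul(u64::from(op.scale)))
        .wrapping_add(op.disp as u64) // a negative disp wraps as two's complement
}
```

If the rebuilt address fails <code>is_canonical_48</code>, it would be a plausible candidate for the real crashing address in place of the one reported by the exception.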
Line 127: | Line 128: | ||
=== Overview === | === Overview === | ||
Status: | Status: completed<br> | ||
Developer(s):<br> | Developer(s): mstange<br> | ||
Source code:<br> | Source code:<br> | ||
* https://github.com/mozilla/dump_syms/pull/392 | |||
Original source code: N/A<br> | Original source code: N/A<br> | ||
Bugs:<br> | Bugs:<br> | ||
Line 137: | Line 139: | ||
=== Description === | === Description === | ||
For a long time Breakpad symbol files only included names and information for | |||
non-inlined functions. This was recently changed and now symbol files can
include the names of inlined functions as well as the regions of memory where
they were inlined, complete with indexes to discern at what level of the stack | |||
they appeared. | |||
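To make the record shapes concrete, the following is a rough parsing sketch. It assumes the layout described in Breakpad's symbol file documentation as we understand it: <code>INLINE_ORIGIN id name</code> and <code>INLINE depth call_line call_file origin_id [address size]+</code>, with decimal indexes and hexadecimal addresses and sizes. The types and function names are hypothetical and do not reflect how Symbolic or rust-minidump actually implement this.

```rust
#[derive(Debug, PartialEq)]
pub struct InlineOrigin {
    pub id: u32,
    pub name: String,
}

#[derive(Debug, PartialEq)]
pub struct Inline {
    pub depth: u32,     // nesting level of the inlined call
    pub call_line: u32, // line of the call site
    pub call_file: u32, // file index of the call site
    pub origin_id: u32, // refers to an INLINE_ORIGIN record
    pub ranges: Vec<(u64, u64)>, // (address, size) pairs
}

/// Parses a line such as `INLINE_ORIGIN 3 some::function(int)`.
pub fn parse_inline_origin(line: &str) -> Option<InlineOrigin> {
    let rest = line.strip_prefix("INLINE_ORIGIN ")?;
    let (id, name) = rest.split_once(' ')?;
    Some(InlineOrigin {
        id: id.parse().ok()?,
        name: name.trim_end().to_string(),
    })
}

/// Parses a line such as `INLINE 1 42 7 3 1a10 20 1a40 8`.
pub fn parse_inline(line: &str) -> Option<Inline> {
    let rest = line.strip_prefix("INLINE ")?;
    let mut fields = rest.split_whitespace();
    let depth = fields.next()?.parse().ok()?;
    let call_line = fields.next()?.parse().ok()?;
    let call_file = fields.next()?.parse().ok()?;
    let origin_id = fields.next()?.parse().ok()?;

    // The remaining fields must form at least one (address, size) pair.
    let tail: Vec<&str> = fields.collect();
    if tail.is_empty() || tail.len() % 2 != 0 {
        return None;
    }
    let mut ranges = Vec::new();
    for pair in tail.chunks(2) {
        let addr = u64::from_str_radix(pair[0], 16).ok()?;
        let size = u64::from_str_radix(pair[1], 16).ok()?;
        ranges.push((addr, size));
    }
    Some(Inline { depth, call_line, call_file, origin_id, ranges })
}
```

Both functions return <code>None</code> on malformed input rather than panicking, which seems the safer choice for symbol files that may be truncated or produced by older tools.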
=== Rationale === | === Rationale === | ||
Firefox code includes heavy inlining, especially in layered Rust and C++ code. | |||
The lack of inline information has hampered us, often making crashes hard to
interpret. Adding support for inlined functions would make
it easier to diagnose bugs and would significantly simplify triage of certain | |||
modules. | |||
=== Plan === | === Plan === | ||
* The first step is to introduce support in Symbolic to correctly parse these fields while reading .sym files. This will be used to later add support in the stack walker | |||
* Once Symbolic support is ready dump_syms needs to be modified to emit these directives. Symbolic already supports reading inlining information from native debuginfo so it's a matter of leveraging that information | |||
* Finally the stack walker needs to be modified to take into account the new directives and emit inlined frames in the output | |||
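The last step can be pictured with a small sketch: given the inline ranges that belong to a function and an instruction address, collect the inlined calls covering that address, innermost first, so they can be emitted ahead of the real frame. It assumes that a larger depth means a more deeply nested inline; <code>InlineRange</code> and <code>inlined_frames</code> are hypothetical names, not the stack walker's actual API.

```rust
/// Hypothetical flattened inline record: one address range of one inlined call.
pub struct InlineRange {
    pub depth: u32,
    pub start: u64,
    pub size: u64,
    pub name: String,
}

/// Returns the names of all inlined functions covering `addr`,
/// ordered from the innermost (deepest) to the outermost.
pub fn inlined_frames(records: &[InlineRange], addr: u64) -> Vec<&str> {
    let mut hits: Vec<&InlineRange> = records
        .iter()
        .filter(|r| addr >= r.start && addr - r.start < r.size)
        .collect();
    // Deepest nesting level first, matching the order of frames in a stack trace.
    hits.sort_by(|a, b| b.depth.cmp(&a.depth));
    hits.into_iter().map(|r| r.name.as_str()).collect()
}
```

A real implementation would likely use a sorted range map instead of a linear scan, but the ordering logic would stay the same.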
== Improved stack overflow detection & analysis == | == Improved stack overflow detection & analysis == | ||
Line 146: | Line 164: | ||
=== Overview === | === Overview === | ||
Status: | Status: in progress<br> | ||
Developer(s): | Developer(s): gsvelto<br> | ||
Bugs:<br> | Bugs:<br> | ||
* {{ | * {{bug|1671082}} | ||
* {{bug|1678152}} | |||
* {{bug|1758673}} | |||
* {{bug|1768794}} | |||
* minidump-writer issue [https://github.com/mozilla/minidump-writer/issues/24 #24] | |||
=== Description === | === Description === | ||
* | For years we've assumed that stack overflows would be captured by the Breakpad | ||
* | exception handlers; this assumption was based on the presence of crash reports | ||
* | involving stack overflows on Windows, the use of an alternate signal stack on | ||
Linux and macOS' exception handler architecture which delegates exceptions to a | |||
separate thread. Real-world testing and bugs proved that we were actually
missing a significant number of stack overflows:
* On Linux the alternate signal stack was only available on the main thread, stack overflows in other threads wouldn't be caught | |||
* When we did catch a stack overflow on Linux the minidump writer might mistake the guard page for the stack, thus storing an empty stack in the generated minidump | |||
* On Windows only some stack overflow crashes were caught, others would be silently forwarded to Windows Error Reporting | |||
* On macOS the exception handler seems capable of catching the overflow but the minidump writer produces a malformed minidump which is completely unusable | |||
* Crash reports caused by stack overflows are obvious on Windows which has a specific exception for them, but on macOS/Linux they're indistinguishable from other crashes | |||
=== Plan === | === Plan === | ||
This project requires tackling several issues: | |||
* On Linux we need to ensure all threads have an alternate signal stack installed when they're launched and we need to modify the minidump writer to properly identify where the stack is | |||
* On macOS we need to investigate the issues with minidump writing, possibly integrating the required changes in the oxidized minidump writer | |||
* On Windows we need to ensure that the Windows Error Reporting interceptor catches stack overflows | |||
* We need to introduce a test that specifically checks for stack overflows and ensures that they're being caught properly, then enable it one platform at a time
* Last but not least we need to flag macOS/Linux stack overflows so that they're easy to tell apart from other types of crashes
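One possible heuristic for the flagging step is sketched below. It assumes a downward-growing stack with a guard region immediately below its lowest address, and treats a crash as a likely overflow when the faulting address falls in that guard region while the stack pointer is close to, or already past, the stack limit. The <code>ThreadStack</code> type, the function name and the <code>SP_SLACK</code> threshold are all hypothetical; nothing here reflects how the actual crash reporter decides this.

```rust
/// Hypothetical description of a thread's stack, taken from the minidump.
pub struct ThreadStack {
    pub low: u64,        // lowest valid stack address
    pub guard_size: u64, // size of the guard region just below `low`
}

/// Assumed tolerance: how far above the stack limit the stack pointer
/// may still be for the crash to count as an overflow.
const SP_SLACK: u64 = 64 * 1024;

/// Returns true if the crash looks like a stack overflow.
pub fn looks_like_stack_overflow(stack: &ThreadStack, fault_addr: u64, stack_pointer: u64) -> bool {
    let guard_low = stack.low.saturating_sub(stack.guard_size);
    // The faulting access must land inside the guard region...
    let in_guard = fault_addr >= guard_low && fault_addr < stack.low;
    // ...and the stack pointer must be near or below the stack limit.
    let sp_near_limit = stack_pointer < stack.low.saturating_add(SP_SLACK);
    in_guard && sp_near_limit
}
```

Requiring both conditions avoids flagging a wild pointer that merely happens to hit a guard page while the thread's stack is nowhere near exhausted.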