Crash reporting improvements
Introduction
This page lists the various improvements that we want to introduce after having finished overhauling the existing crash reporting machinery (see the Crash reporting overhaul page for more information). Many of the tasks described here were features that had been requested years ago but could not be implemented in a reasonable amount of time using the old Breakpad-based tooling.
List of projects
Minidump storage for crash annotations
Status: not started
Developer(s):
Source code:
Original source code:
Bugs:
Description
Crash annotations are pieces of information that accompany a minidump to form a complete crash report. They contain critical information such as the Firefox version and build ID, but also ancillary information such as how much memory a process was using, or a user-provided string associated with a failed assertion that crashed the process.
Currently, crash annotations are stored in a JSON file (with an .extra suffix) that is sent along with the minidump to Socorro. Depending on the type of crash, this file is either written out by the exception handler (if the main process crashed), or the annotations are forwarded to the main process, which then writes them out (in the case of a child process crash).
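To make the current layout concrete, here is a minimal sketch of a minidump accompanied by a JSON .extra file. It is an illustration only, not the actual crash reporter code: the helper names, the .dmp file name and the annotation values are assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path


def write_extra_file(minidump_path: Path, annotations: dict) -> Path:
    """Write the annotations as JSON next to the minidump, using an .extra suffix."""
    extra_path = minidump_path.with_suffix(".extra")
    extra_path.write_text(json.dumps(annotations), encoding="utf-8")
    return extra_path


def load_crash_report(minidump_path: Path) -> dict:
    """Return the annotations of a report, failing if either file is missing."""
    extra_path = minidump_path.with_suffix(".extra")
    if not minidump_path.exists() or not extra_path.exists():
        raise FileNotFoundError("incomplete crash report: minidump and .extra file are both required")
    return json.loads(extra_path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dump = Path(tmp) / "example.dmp"      # hypothetical file name
        dump.write_bytes(b"MDMP")             # placeholder bytes, not a valid minidump
        # Illustrative annotation values, not taken from a real report.
        write_extra_file(dump, {"ProductName": "Firefox", "Version": "120.0"})
        print(load_crash_report(dump))
```

Because a report consists of two files, every consumer has to handle the case where one of them is missing, which is one of the failure modes discussed below.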
Rationale
There are several issues with the current system:
- Having a separate file adds significant complexity both when submitting and when processing crash reports, and introduces additional failure modes (such as only one of the files being present in the report)
- The file needs to be written out after the minidump has been written out, adding complexity to the exception handler
- For child processes an extra IPC channel is needed to send the annotations
- Setting annotations is a relatively expensive process
- Some annotations are synthesized at crash time and dealt with by ad-hoc code; there is no unified mechanism to handle them together with the others
Given the above, storing the annotations within the minidump would simplify the crash reporting flow, eliminate an additional IPC channel and make it much cheaper and simpler for user code to set annotations.
Plan
Annotations should be stored within the minidump and read directly from the crashed process. This requires several steps:
- The crash annotations interface in Gecko needs to be modified so that a process can flag where its annotations are stored
- The crash-time annotations need to be removed and replaced with regular ones
- We need to add a mechanism to distinguish between a process's own annotations and global ones that must be included in every crash
- Minidump writers need to be modified to identify where the annotations are stored in a process's memory, read them and write them out within the minidump
- Finally, the stackwalker tool needs to be taught to look for the annotations in the minidump and print them out
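The steps above can be sketched roughly as follows. This is not a proposed format: the "ANNO" marker, the length-prefixed layout and all function names are assumptions invented for illustration, and a Python bytes object stands in for the memory of the crashed process. The rule that per-process values override global ones on a name clash is likewise an assumption.

```python
import struct

MAGIC = b"ANNO"  # hypothetical marker flagging where the annotations are stored


def serialize_annotations(annotations: dict) -> bytes:
    """Lay out the annotations as: marker, entry count, then length-prefixed key/value pairs."""
    out = bytearray(MAGIC)
    out += struct.pack("<I", len(annotations))
    for key, value in annotations.items():
        k = key.encode("utf-8")
        v = value.encode("utf-8")
        out += struct.pack("<II", len(k), len(v)) + k + v
    return bytes(out)


def read_annotations(memory: bytes, offset: int) -> dict:
    """Read the annotation table found at `offset`, as a minidump writer might do."""
    if memory[offset:offset + 4] != MAGIC:
        raise ValueError("no annotation table at the given offset")
    (count,) = struct.unpack_from("<I", memory, offset + 4)
    pos = offset + 8
    result = {}
    for _ in range(count):
        key_len, value_len = struct.unpack_from("<II", memory, pos)
        pos += 8
        key = memory[pos:pos + key_len].decode("utf-8")
        pos += key_len
        value = memory[pos:pos + value_len].decode("utf-8")
        pos += value_len
        result[key] = value
    return result


def merge_annotations(global_annotations: dict, process_annotations: dict) -> dict:
    """Combine global and per-process annotations (assumption: per-process values win)."""
    merged = dict(global_annotations)
    merged.update(process_annotations)
    return merged
```

In this sketch the process only has to publish the offset of its table; the writer reads it from outside, and the annotations do not have to be sent over a separate channel.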
Additionally, some changes will be required to Socorro on the ingestion side. Socorro currently relies on the .extra file contents for filtering. For example, annotations containing the product version are used to decide whether a crash comes from a version of Firefox that is very old and thus should be dropped. If we store the annotations within the minidump, we need to provide a way for Socorro to extract them without processing the full minidump, so that it can still apply its filtering rules. To this end, we need to write a streamlined minidump pre-processor that only extracts this information and provides it in JSON format. This might prove useful for other types of filtering we don't currently do (such as rejecting reports caused by hardware faults or unconditionally accepting those that might indicate security-sensitive issues). The rust-minidump crate provides all the necessary functionality to write this tool.
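As a rough illustration of how an ingestion-side filter could consume the pre-processor's JSON output, consider the sketch below. It does not reflect Socorro's actual rules: the function names, the min_major threshold and the decision to keep reports whose version cannot be parsed are all assumptions made for this example.

```python
import json


def major_version(version: str):
    """Return the leading numeric component of a version string, or None if there is none."""
    head = version.split(".", 1)[0]
    return int(head) if head.isdigit() else None


def should_reject(annotations_json: str, min_major: int) -> bool:
    """Decide whether to drop a report, given the annotations extracted as JSON."""
    annotations = json.loads(annotations_json)
    major = major_version(str(annotations.get("Version", "")))
    if major is None:
        # Assumption: keep reports whose version is missing or cannot be parsed.
        return False
    return major < min_major
```

A call such as should_reject('{"Version": "60.0.1"}', min_major=100) would drop the report, while a report with a missing version would be kept under the assumption stated above.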
Telemetry-based dashboards
Overview
Status: not started
Developer(s):
Source code:
Original source code: