Data/WorkingGroups/CrashReporting/Status2021: Difference between revisions

== Crash Reporting Status 2021 ==


Every month, the coordinator sends out an email asking for status updates from teams/projects working on crash reporting. Details on this process are at [[Data/WorkingGroups/CrashReporting#Monthly_status_rollup]]. The prompts:

* What's your team/project?
* What did you accomplish? (Descriptions, bug numbers, etc.)
* What are you working on now or think you'll have done this month? (Descriptions, bug numbers, etc.)
* What do you need help with?
* What are you concerned about?
* What else do you think is helpful for everyone to know?

Keep in mind this is a public list, so if you're working on confidential/NDA/security-sensitive things, this status update isn't the place for that.

Updates get compiled into a newsletter, sent to lists, and posted here.

== Crash Reporting Headlines (June 4th, 2021) ==

=== Quick Summary ===
 
* Acquired symbols for openSUSE 15.3
* Improving our support for acquiring macOS symbols for release and beta builds
* Work continues on WER support in crash reporter around hang reports and child processes
* Improved capturing annotations for OOM crashes and re-enabled grabbing memory reports
 
=== Details ===
 
==== Completed ====
 
* Crash Stats: fixed indexing for fields that don't show up in all crash reports
** https://bugzilla.mozilla.org/show_bug.cgi?id=1706171
** This fixes searching and aggregating with fields like phc_kind, adapter_driver_version, and others that don't show up in all crash reports and so were periodically left out of the index.
* Crash Stats: fixed sort-by-address in signature report tables
** https://bugzilla.mozilla.org/show_bug.cgi?id=1032227
** We changed minidump-stackwalk to left-pad all memory address values. This makes them the same width, so they sort correctly when compared as strings. While doing this, I fixed some other sorting issues in signature report tables.
* Socorro: improved cpu_arch in processing
** https://bugzilla.mozilla.org/show_bug.cgi?id=1710854
** We improved processing so the cpu_arch field has a value for Fenix crash reports. If it can't find a value, then it sets cpu_arch to "unknown" rather than the empty string. This is a better value for searching and aggregating.
* Socorro: minidump-stackwalk prints readable values for NTSTATUS or winerror.h results in Windows minidumps
** https://github.com/mozilla-services/minidump-stackwalk/issues/33
** https://github.com/mozilla-services/minidump-stackwalk/issues/26
* Tecken: support CORS preflight in Eliot for symbolication API
** https://bugzilla.mozilla.org/show_bug.cgi?id=1713667
** Added CORS preflight headers so that the new symbolication API on stage can be used by web apps.
* Crash reporter: temporarily changed Gecko to stop grabbing hang reports with WER
** https://bugzilla.mozilla.org/show_bug.cgi?id=1709423
* Crash Stats: display and support basic searching/aggregation of mac_crash_info data
** https://bugzilla.mozilla.org/show_bug.cgi?id=1709658
** Thank you, Steven Michaud!
* Crash reporter: re-enabled grabbing memory reports
** https://bugzilla.mozilla.org/show_bug.cgi?id=1712693
* Crash reporter: removed BIOS_Manufacturer and MemoryErrorCorrection crash annotations
** https://bugzilla.mozilla.org/show_bug.cgi?id=1710152
** The former was largely unused and we'll reintroduce the latter in a way that doesn't cause external code to be injected into Gecko.
* Crash reporter: improved mechanism for recording allocations that lead to OOM crashes
** https://bugzilla.mozilla.org/show_bug.cgi?id=1683288
** This makes sure almost all of the crash reports have the annotation properly populated.
* Symbols: started scraping debug information for openSUSE 15.3 builds
** https://bugzilla.mozilla.org/show_bug.cgi?id=1708662
* pdb-addr2line: crate published
** https://github.com/mstange/pdb-addr2line
** Lets you easily obtain function names, inline callstacks, and file + line information based on addresses from PDB files, similar to the Linux+macOS addr2line tool. I will be using the pdb-addr2line crate in the profiler. Part of the code in this crate was imported from the dump_syms code: The TypeFormatter helper is based on the dump_syms TypeDumper code.
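
The sort-by-address fix above relies on a general property: hex strings of equal width sort the same way as the numbers they represent. The following is a minimal Python sketch of that idea only. The `pad_address` helper and the 16-digit width are assumptions made for illustration, not the format minidump-stackwalk actually emits.

```python
def pad_address(addr: str, width: int = 16) -> str:
    """Left-pad a hex address such as '0x400' with zeros to a fixed width.

    Hypothetical helper for illustration; the width is an assumption.
    """
    digits = addr[2:] if addr.lower().startswith("0x") else addr
    return "0x" + digits.lower().zfill(width)


addresses = ["0x7ffd1000", "0x400", "0x10000"]

# Plain string sorting compares character by character, so "0x10000"
# lands before "0x400" even though it is the larger number.
unpadded_sorted = sorted(addresses)

# After padding, every value has the same width and string order
# matches numeric order.
padded_sorted = sorted(pad_address(a) for a in addresses)

numeric_order = sorted(int(a, 16) for a in addresses)
```

Here `unpadded_sorted` does not follow numeric order, while converting `padded_sorted` back to integers yields the same sequence as `numeric_order`.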
 
==== In progress ====
 
* Crash reporter: intercepting child process crashes via WER
** https://bugzilla.mozilla.org/show_bug.cgi?id=1697895
** https://bugzilla.mozilla.org/show_bug.cgi?id=1682518
** Almost done with changes for intercepting child process crashes via Windows Error Reporting. This includes registering the runtime exception module with the child processes and adjusting it to inform the main process of the dumps it grabbed. The code works but hasn't landed yet, so it's still a work in progress.
* All: Rust rewrite of all things breakpad
** rust minidump-stackwalk:
*** https://github.com/luser/rust-minidump/tree/master/minidump-stackwalk
*** https://github.com/luser/rust-minidump/issues/153
*** You can now install and test rust-minidump minidump-stackwalk
*** Same CLI as existing minidump-stackwalk that Socorro uses. Outputs the same JSON schema.
*** We handle most stuff reasonably well on x86/x64 these days, having full stackwalkers/symbolicators. ARM/ARM64 support is in progress. Some fields like exploitability heuristic are not yet implemented.
*** The biggest remaining task is replacing `breakpad-symbols` with `symbolic`, which should significantly improve performance/reliability of all the debuginfo handling.
*** Additionally, we've gotten a commitment from Microsoft to help build, maintain, and extend Rust minidump-stackwalk.
* Tecken: new symbolication API microservice
** API url: https://symbolication.stage.mozaws.net/symbolicate/v5
** If you do any symbolication work, I'd love to know how it works for you and whether you encounter any issues.
** https://bugzilla.mozilla.org/show_bug.cgi?id=1636210
* Socorro: Better signature generation for Java crash reports
** https://bugzilla.mozilla.org/show_bug.cgi?id=1541120
* Symbols: improving process for acquiring symbols for macOS Big Sur
** https://bugzilla.mozilla.org/show_bug.cgi?id=1683758
** Currently we have symbols for release versions of macOS. Work is being done to acquire symbols for beta versions as well. Additionally, the process and tools for acquiring symbols for macOS Big Sur are being improved.
** This enables profiles collected on beta versions of macOS with the Firefox profiler to have symbolicated system libraries.
** This will improve stacks in crash reports for beta versions of macOS.
* Firefox profiler: fix OOM errors when profiling local builds on Linux
** https://bugzilla.mozilla.org/show_bug.cgi?id=1615066
* Firefox profiler: getting inline callstacks
** Making this work for official builds is a bigger lift and requires that our symbolication API stops using dump_syms as part of the symbolication pipeline. The Eliot rewrite is making big strides towards that goal and I am very excited about it. (Eliot currently goes [raw build artifact] -> [.sym file] -> [symbolic symcache] -> [API response]. Once we can go directly from the raw build artifact to the symbolic symcache, the rest should be easy.)
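
For anyone who wants to try the new symbolication API listed above, here is a rough Python sketch of building a request for the stage endpoint. The payload shape (a `jobs` list whose entries carry a `memoryMap` of `[debug_name, debug_id]` pairs and `stacks` of `[module_index, offset]` frames) is my understanding of the v5 format and should be checked against the Tecken documentation. The module name, debug ID, and offset are illustrative placeholders, and `build_symbolication_request` is a hypothetical helper.

```python
import json
import urllib.request

# Stage endpoint from the status update above.
API_URL = "https://symbolication.stage.mozaws.net/symbolicate/v5"


def build_symbolication_request(memory_map, stacks, url=API_URL):
    """Build (but do not send) a POST request for one symbolication job.

    Hypothetical helper; the payload layout is an assumption to verify
    against the Tecken docs.
    """
    payload = {"jobs": [{"memoryMap": memory_map, "stacks": stacks}]}
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values: one module and one stack with a single frame
# that points at an offset inside module 0.
memory_map = [["xul.pdb", "44E4EC8C2F41492B9369D6B9A059577C2"]]
stacks = [[[0, 11723767]]]

request = build_symbolication_request(memory_map, stacks)

# To actually send it (requires network access to the stage server):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

The request is only constructed here, not sent, so the sketch runs without network access.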


== Crash Reporting Headlines (May 7th, 2021) ==