Data/WorkingGroups/CrashReporting/Status2021: Difference between revisions
Revision as of 15:11, 7 May 2021
Crash Reporting Status 2021
Every month, the coordinator will send out an email asking for status updates from teams/projects working on crash reporting things. The prompts:
- What's your team/project?
- What did you accomplish? (Descriptions, bug numbers, etc)
- What are you working on now or think you'll have done this month? (Descriptions, bug numbers, etc)
- What do you need help with?
- What are you concerned about?
- What else do you think is helpful for everyone to know?
Keep in mind this is a public list, so if you're working on confidential/NDA/security-sensitive things, this status update isn't the place for that.
Updates will get compiled into a newsletter and sent to lists.
Crash Reporting Headlines (May 7th, 2021)
Quick Summary
- WER support in crash reporter
- Work continues on WER support so that we get crash reports in situations where we currently get none. Main process support should be done. Content process support is in progress.
- Socorro's minidump-stackwalker improvements
- Socorro's minidump-stackwalker was improved to emit additional Windows and macOS information. You can see this in the minidump-stackwalk output in the crash report view of Crash Stats.
- rust-minidump progress is moving along
- Work towards replacing Socorro's minidump-stackwalker with rust-minidump is progressing very nicely.
- Crash Stats lets you search by major_version
- Crash Stats has an improved Extensions tab in the crash report view
Details
Completed
- Crash reporter: WER support
- Windows Error Reporting interception landed last month and can intercept all main process crashes we were previously missing. This includes __fastfail() crashes, catastrophic OOM crashes, weird DLL injections, and very late shutdown crashes. It significantly increased the Nightly crash rate, which is good! Content process support is in progress.
- Socorro: minidump-stackwalker improved Windows information
- minidump-stackwalker was improved to print out richer information for Windows including unloaded modules, authenticode signatures, __fastfail() crash reasons, and NTSTATUS errors.
- Socorro: minidump-stackwalker __crash_info support for macOS
- minidump-stackwalker was improved to find and emit __crash_info information for Apple-specific error messages.
- Thank you, Steven Michaud!
- Crash reporter: fixed OOM crash annotations
- Alexandre modified the way we handle out-of-memory crash annotations so that they will never be missing again.
- rust-minidump: taught rust-minidump to parse MISC_INFO_5 format
- Taught rust-minidump to parse the MISC_INFO_5 format (and wrote tests/printing machinery for all the previous formats)
- https://github.com/luser/rust-minidump/pull/137
- rust-minidump: upgraded minidump-processor unwinder
- Upgraded the minidump-processor unwinder -- can now unwind with frame-pointers and scanning on x86 and x64
- https://github.com/luser/rust-minidump/pull/145
- rust-minidump: upgraded cli to match dump_syms
- Upgraded the minidump-processor CLI frontend to match dump_syms, and taught it to generate a JSON version of its report (format is "whatever the layout of the current types are", to be iterated on over time)
- https://github.com/luser/rust-minidump/pull/151
- dump_syms: better support for Apple's compact unwinding
- Taught symbolic (and therefore dump_syms) how to dump Apple's Compact Unwinding (.__unwind_info) format into Breakpad's format for x86/x64, and wrote up a very thorough description of the format (that is otherwise missing from LLVM's implementation, which is the only existing documentation of the format). Ideally when this lands it will fix Bug 1691022 (x64 macOS missing CFI on Socorro).
- https://github.com/getsentry/symbolic/pull/372/
- Crash Stats: added last error value to crash report view.
- https://bugzilla.mozilla.org/show_bug.cgi?id=1692312
- Crash Stats: redid process type support
- Redid process type support--now "parent" is the value for parent process crash reports and we're phasing out "browser".
- This makes it a lot easier to search for parent crashes, and aggregations on process type now work.
- https://bugzilla.mozilla.org/show_bug.cgi?id=1701357
- Crash Stats: improved Extensions tab in report view
- The Extensions tab now shows the extension name, whether it's a system extension, and its signed state.
- https://bugzilla.mozilla.org/show_bug.cgi?id=1629943
- https://bugzilla.mozilla.org/show_bug.cgi?id=974968
- Crash Stats: fixed Bugs API to support POST as well as GET
- https://bugzilla.mozilla.org/show_bug.cgi?id=1282707
- If there are other APIs that would benefit by having POST support, let me know.
- Crash Stats: added search by major_version
- Added a major_version field and the ability to search it. This works for all crash reports submitted after April 25th.
- Now you can do searches like "major_version = 88" and "major_version >= 88 and major_version < 90"
- https://bugzilla.mozilla.org/show_bug.cgi?id=1111612
- https://bugzilla.mozilla.org/show_bug.cgi?id=1401517
- Crash Stats: all Super Search fields now have exists/does-not-exist filter
- https://bugzilla.mozilla.org/show_bug.cgi?id=973894
- Socorro: Added support for multiple processing pipeline rulesets
- The first non-default ruleset I wrote is "regenerate_signature" which just regenerates the crash signature. It takes 1/10 the time regular processing takes. I'll use this going forward to regenerate crash signatures after signature generation changes.
- We can use this infrastructure for additional processing as well. That's been something we've talked about over the years.
- https://bugzilla.mozilla.org/show_bug.cgi?id=1705469
- Siggen: Released socorro-siggen 1.0.6
- This includes signature generation changes made since 1.0.5 as well as some minor fixes.
- Presentation: Socorro Overview: 2021
- Converted Socorro Overview presentation done at Data Club into a blog post.
- https://bluesock.org/~willkg/blog/mozilla/socorro_overview_2021.html
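A couple of the Crash Stats changes above (search by major_version, POST support in the Bugs API) are easiest to see as requests. Here's a minimal sketch that only builds the requests and doesn't send anything. It assumes the public Crash Stats host is https://crash-stats.mozilla.org, that Super Search takes comparison operators as prefixes on the value (">=88", "<90"), and that the Bugs API takes a repeated "signatures" parameter; the helper names are made up for illustration, so check the API docs before relying on any of it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the public Crash Stats host. Not stated in this update.
CRASH_STATS = "https://crash-stats.mozilla.org"


def supersearch_url(filters, facets=()):
    """Build a Super Search API URL (hypothetical helper).

    filters is a list of (field, value) pairs. Comparison operators are
    assumed to be prefixes on the value, for example ">=88".
    """
    params = list(filters) + [("_facets", facet) for facet in facets]
    return f"{CRASH_STATS}/api/SuperSearch/?{urlencode(params)}"


def bugs_post_request(signatures):
    """Build (but do not send) a POST request to the Bugs API (hypothetical helper).

    POST is handy when the list of signatures is too long for a GET URL.
    """
    body = urlencode([("signatures", sig) for sig in signatures]).encode("utf-8")
    return Request(f"{CRASH_STATS}/api/Bugs/", data=body, method="POST")


if __name__ == "__main__":
    # "major_version >= 88 and major_version < 90", aggregated by signature
    print(supersearch_url(
        [("major_version", ">=88"), ("major_version", "<90")],
        facets=["signature"],
    ))

    req = bugs_post_request(["OOM | small"])
    print(req.get_method(), req.full_url, req.data)
```

To actually run either request, pass the URL or the Request object to urllib.request.urlopen. Keep in mind that major_version only exists on crash reports submitted after April 25th.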
In progress
- All: Rust rewrite of all things breakpad
- https://github.com/getsentry/symbolic/issues/375
- Tecken: new symbolication API microservice
- API url: https://symbolication.stage.mozaws.net/symbolicate/v5
- If you do any symbolication work, I'd love to know how it works for you and whether you encounter any issues.
- https://bugzilla.mozilla.org/show_bug.cgi?id=1636210
- Socorro: Better signature generation for Java crash reports
- https://bugzilla.mozilla.org/show_bug.cgi?id=1541120
Crash Reporting Headlines (April 7th, 2021)
Quick summary
- Started Crash Reporting Working Group
- We started a Crash Reporting Working Group to coordinate crash reporting, ingestion, and analysis work. If you're interested in participating or lurking, we've got a mailing list (crash-reporting-wg) and a Matrix channel (#crashreporting).
- Socorro: Ended collection of Email address data.
- Firefox 89+ no longer sends Email address data in crash reports.
- Email data is dropped at collection for all crash reports.
- Socorro: Ended collection of Fennec crash reports.
- Tecken: We need help testing new symbolication API microservice.
Details
Completed
- Crash Stats: Improved preview in Slack/Matrix for crash report view URLs and signature report view URLs.
- If there's more I can do to make these URL previews more helpful in conversations, let me know.
- https://bugzilla.mozilla.org/show_bug.cgi?id=1688203
- Socorro: End collection of Email data in crash reports.
- I changed the collector to delete Email data for all incoming crash reports. I fixed the Firefox main and content crash reporter client code. I still have some changes to make in the webapp, but I'm waiting until May 2021 to do that.
- Many thanks to Emily, Nneka, Gabriele, Mike, and Chris for their help with this!
- https://bugzilla.mozilla.org/show_bug.cgi?id=1688883
- Socorro: End collection of crash reports for Fennec
- When working on ending collection of Email data, it came up that we don't need Fennec crash reports anymore. Thus Socorro now rejects all incoming crash reports for Fennec.
- Many thanks to Emily, Stefan, Vesta, and Agi!
- https://bugzilla.mozilla.org/show_bug.cgi?id=1699239
- Crash Stats: Fixed the webapp to automatically update the PCI device db once a week.
- Crash Stats: Redid "Raw data and minidumps" tab in crash report view.
- The Crash Stats UI is confusing and clunky and I've been trying to fix bits of it over time. In this pass, I improved the tab that holds links to raw and processed crash data, minidumps, and the output of minidump-stackwalk. It should be clearer now what's protected data and what isn't. The links are at the top of the tab where they're easier to access. The minidump-stackwalk output is much easier to manipulate and use.
- https://bugzilla.mozilla.org/show_bug.cgi?id=1696910
- Tecken: New symbolication API microservice
- The symbolication API that Tecken has is hard to improve and there are a bunch of things we want to do with it. Because of that, I embarked on splitting it out into a separate microservice and rewriting it from the ground up using the Symbolic library. That's taken a while for a variety of reasons, but we've now got a working symbolication API in our staging environment that I think is usable.
- API url: https://symbolication.stage.mozaws.net/symbolicate/v5
- I need to write docs for it, but it uses the same payload as the existing symbolication API as documented here: https://tecken.readthedocs.io/en/latest/symbolication.html#symbolication-symbolicate-v5
- If you do any symbolication work, I'd love to know how it works for you and whether you encounter any issues.
- https://bugzilla.mozilla.org/show_bug.cgi?id=1636210
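For anyone who wants to try the new symbolication API, here's a minimal sketch of building a request for the staging URL above. It assumes the v5 payload shape from the linked Tecken docs as I understand it: a "jobs" list where each job has a "memoryMap" of [debug filename, debug id] pairs and "stacks" of [module index, module offset] frames. The helper name is made up and the module and offset values are placeholders, not real data. Nothing is sent over the network.

```python
import json
from urllib.request import Request

# Staging URL from this status update.
SYMBOLICATE_URL = "https://symbolication.stage.mozaws.net/symbolicate/v5"


def build_symbolication_request(memory_map, stacks):
    """Build (but do not send) a symbolicate/v5 POST request (hypothetical helper).

    memory_map: list of [debug_filename, debug_id] pairs
    stacks: list of stacks, each a list of [module_index, module_offset] frames,
            where module_index points into memory_map
    """
    payload = {"jobs": [{"memoryMap": memory_map, "stacks": stacks}]}
    body = json.dumps(payload).encode("utf-8")
    return Request(
        SYMBOLICATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder module and offset, for illustration only.
    req = build_symbolication_request(
        memory_map=[["xul.pdb", "44E4EC8C2F41492B9369D6B9A059577C2"]],
        stacks=[[[0, 11723767]]],
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the Request to urllib.request.urlopen and decoding the JSON response would be the next step. Since the payload is the same as the existing symbolication API's, switching between the two should only mean changing the URL.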
In progress
- Crash reporter client: integrating Windows Error Reporting into Firefox
- Tecken: Finishing up the new symbolication API microservice: https://bugzilla.mozilla.org/show_bug.cgi?id=1636210
- Socorro: Better signature generation for Java crash reports: https://bugzilla.mozilla.org/show_bug.cgi?id=1541120