CrashKill/Topcrash: Difference between revisions

From MozillaWiki
No edit summary
(Correct the minimum installations threshold)
 
(6 intermediate revisions by 5 users not shown)
Line 1: Line 1:
The [https://bugzilla.mozilla.org/buglist.cgi?f1=keywords&list_id=8270969&o1=anywordssubstr&resolution=---&query_format=advanced&v1=topcrash list of topcrash bugs] is being kept up to date manually by the stability group by adding or removing the topcrash keyword on open (non-resolved) bugs according to the criteria below and the Top Crashers lists from [https://crash-stats.mozilla.com/ Mozilla Crash Stats].
The [https://bugzilla.mozilla.org/buglist.cgi?f1=keywords&list_id=8270969&o1=anywordssubstr&resolution=---&query_format=advanced&v1=topcrash list of topcrash bugs] is being kept up to date manually by the stability group and automatically by the [https://wiki.mozilla.org/Release_Management/autonag#Bugs_with_top-crash_signatures autonag bot] by adding or removing the topcrash keyword on open (non-resolved) bugs according to the criteria below and the Top Crashers lists from [https://crash-stats.mozilla.com/ Mozilla Crash Stats].


== Top crash identification criteria ==
== Top crash identification criteria ==
Line 10: Line 10:
** Top 20 desktop browser crashes on Betas
** Top 20 desktop browser crashes on Betas
*** This should be pretty much the same as release. Where we see discrepancies are really around 3rd party issues which are important to call out for blocking candidates.
*** This should be pretty much the same as release. Where we see discrepancies are really around 3rd party issues which are important to call out for blocking candidates.
** Top 10 desktop browser crashes on Aurora and Nightly, if they happen for enough different installations.
** Top 10 desktop browser crashes on Nightly, if they happen for enough different installations.
*** This might need some experience and feeling for what issues are important.
*** This might need some experience and feeling for what issues are important.
** Top 10 plugin crashes on Beta and Release
** Top 10 content process crashes on Beta and Release
*** Hang signatures are pretty generic and non-actionable right now (improvements are being worked on), the crashes in that list need to be looked at more carefully.
** Top 5 gpu process crashes on Beta and Release
** Top 5 desktop browser crashes on Linux-, Mac-, and Win8- specific list on Beta and Release
** Top 5 rdd process crashes on Beta and Release
*** If there's less than 5 crashes per week on a signature, that bug probably still doesn't qualify - same for crashes happening to only 2 or 3 installations.
** Top 5 socket and utility process crashes on Beta and Release but only if they affect 5+ installations.
** Top 5 desktop browser crashes on Linux-, Mac-, and Win10- specific list on Beta and Release
*** If volume is very similar to the top 5, other bugs might still be included.
*** If volume is very similar to the top 5, other bugs might still be included.
*** High volume Win7- and Win8- crashes should also be considered if they affect a significant number of users.
** Hangs of various kinds are not always actionable so they probably shouldn't be flagged as top crashers unless their impact is significant.


* MetroFirefox:
* Fenix:
** TBD
** Top 10 AArch64- and ARM-crashes for Nightly, Beta and Release
 
* Firefox for Android:  
** Top 10 ARMv7-crashes for Nightly, Aurora, Beta and Release
** Top 5 ARMv6- and x86- crashes for Beta and Release
** Until we see plugins and addons that highly effect the stability of Firefox for Android, it may make sense to just stay at 10 for all channels.
 
* B2G:
** Top 5 crashes inside of what we have symbols for and support. For FxOS 1.0 & 1.1, they can be found in https://crash-stats.mozilla.com/query/query?product=B2G&version=B2G:18.0&do_query=1
*** If there's less than 5 crashes per week on a signature, that bug probably still doesn't qualify - same for crashes happening to only 2 or 3 installations.
*** Also note that only crashes on actual devices can count there, simulator/desktop crashes do not count as topcrashes unless they make the product practically unusable.


* Thunderbird:
* Thunderbird:
** Top 25 for Release
** Top 25 for Release
** Top 3 for Beta   
** Top 3 for Beta, Nightly  
*** Focus only on worst of the worse.
*** Focus only on worst of the worse, and explosive crashes, and regressions because Beta and Nightly rankings don't correlate well** to final releases. So assigning topcrash status normally doesn't help significantly reduce release topcrashes. (** Probably because the respective user populations and environments are significantly different.)
*** Note Thunderbird Aurora and Beta top rankings, eg top 20, typically don't translate well to release versions. Likely related to the respective user populations. So assigning topcrash status to a beta crash is normally a worthless exercise.


* Everything:
* Everything:
Line 42: Line 34:
** Crashes for actions that users are rarely taking, even if they are somewhat out of the usual topcrash ranges
** Crashes for actions that users are rarely taking, even if they are somewhat out of the usual topcrash ranges
*** This needs feeling and expertise as well, thing like that can be e.g. printing crashes in the top 50 on desktop release and similar cases.
*** This needs feeling and expertise as well, thing like that can be e.g. printing crashes in the top 50 on desktop release and similar cases.
* Each release channel (i.e., Release, Beta, Nightly) should be considered separately. Combining crash reports from multiple channels, e.g., beta and release might hide beta-only top-crashes.
* If there's less than 15 crashes per week on a signature, that bug probably still doesn't qualify - same for crashes happening to less than 3 installations.
* Crash signatures that should not be automatically considered top crashers even if high volume:
** All signatures starting with `EMPTY: ` or `OOM | large | EMPTY: ` (examples: `EMPTY: no crashing thread identified; HeaderMismatch` and `OOM | large | EMPTY: no crashing thread identified; StreamSizeMismatch`)
** `OOM | small`
** `IPCError-browser | ShutDownKill`
** Signatures starting with `java.lang.OutOfMemoryError` on Android

Latest revision as of 21:18, 30 November 2022

The list of topcrash bugs is kept up to date manually by the stability group and automatically by the autonag bot. Both add or remove the topcrash keyword on open (non-resolved) bugs according to the criteria below and the Top Crashers lists from Mozilla Crash Stats.

Top crash identification criteria

  • Firefox:
    • Top 20 desktop browser crashes on the latest release (once it is over 10M ADI).
      • The 20-30 mark is where the numbers start to drop below 2000 crashes per week.
      • Also, in the past, many of the crashes in the 20-50 range have been repeats of other signatures in the top 50. This is not an exact science, but we think it's important to pick some bar.
      • Anything appearing in the 20-30 range that is marked as a start-up crash is also tagged as a top crash.
    • Top 20 desktop browser crashes on Betas
      • This should be pretty much the same as release. The discrepancies we do see are mostly around third-party issues, which are important to call out for blocking candidates.
    • Top 10 desktop browser crashes on Nightly, if they happen for enough different installations.
      • This may require some experience and a feel for which issues are important.
    • Top 10 content process crashes on Beta and Release
    • Top 5 GPU process crashes on Beta and Release
    • Top 5 RDD process crashes on Beta and Release
    • Top 5 socket and utility process crashes on Beta and Release, but only if they affect 5+ installations.
    • Top 5 desktop browser crashes on the Linux-, Mac-, and Win10-specific lists on Beta and Release
      • If volume is very similar to the top 5, other bugs might still be included.
      • High-volume Win7- and Win8-specific crashes should also be considered if they affect a significant number of users.
    • Hangs of various kinds are not always actionable, so they probably shouldn't be flagged as top crashers unless their impact is significant.
  • Fenix:
    • Top 10 AArch64 and ARM crashes for Nightly, Beta and Release
  • Thunderbird:
    • Top 25 for Release
    • Top 3 for Beta, Nightly
      • Focus only on the worst of the worst, explosive crashes, and regressions, because Beta and Nightly rankings don't correlate well** with final releases. Assigning topcrash status there normally doesn't help significantly reduce release topcrashes. (** Probably because the respective user populations and environments are significantly different.)
  • Everything:
    • Bugs that spearhead investigation or fixes across a large collection of crashes
      • Judging this needs engineering expertise - if fixing a bug would clean out a number of crashes (with differing signatures) that would be in similar volume to signatures matching other topcrash criteria, that bug itself qualifies for topcrash as well.
    • Crashes in actions that users rarely take, even if they are somewhat outside the usual topcrash ranges
      • This needs judgment and expertise as well. Examples include printing crashes in the top 50 on desktop release and similar cases.
  • Each release channel (i.e., Release, Beta, Nightly) should be considered separately. Combining crash reports from multiple channels, e.g., Beta and Release, might hide Beta-only top crashes.
  • If there are fewer than 15 crashes per week on a signature, that bug probably still doesn't qualify. The same goes for crashes happening on fewer than 3 installations.
  • Crash signatures that should not be automatically considered top crashers even if high volume:
    • All signatures starting with `EMPTY: ` or `OOM | large | EMPTY: ` (examples: `EMPTY: no crashing thread identified; HeaderMismatch` and `OOM | large | EMPTY: no crashing thread identified; StreamSizeMismatch`)
    • `OOM | small`
    • `IPCError-browser | ShutDownKill`
    • Signatures starting with `java.lang.OutOfMemoryError` on Android
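
The mechanical part of these criteria (rank cutoffs, minimum volume, and excluded signatures) can be sketched in code. The following is only an illustration, not the autonag bot's actual implementation: the `SignatureStats` type, the process and channel key names, and both function names are hypothetical, and the sketch assumes that socket and utility processes are each ranked in their own list. It covers only the Firefox desktop process lists, and the judgment calls described above (spearhead bugs, rarely used actions, hangs, platform-specific lists, "enough" Nightly installations) are deliberately left out.

```python
from dataclasses import dataclass

# Rank cutoffs from the Firefox criteria above, keyed by (process, channel).
# The key strings are illustrative and are not Crash Stats field names.
RANK_CUTOFFS = {
    ("browser", "release"): 20,
    ("browser", "beta"): 20,
    ("browser", "nightly"): 10,
    ("content", "release"): 10,
    ("content", "beta"): 10,
    ("gpu", "release"): 5,
    ("gpu", "beta"): 5,
    ("rdd", "release"): 5,
    ("rdd", "beta"): 5,
    ("socket", "release"): 5,
    ("socket", "beta"): 5,
    ("utility", "release"): 5,
    ("utility", "beta"): 5,
}

# Signatures that should not automatically count, even at high volume.
EXCLUDED_PREFIXES = ("EMPTY: ", "OOM | large | EMPTY: ")
EXCLUDED_EXACT = {"OOM | small", "IPCError-browser | ShutDownKill"}
ANDROID_EXCLUDED_PREFIX = "java.lang.OutOfMemoryError"

MIN_CRASHES_PER_WEEK = 15          # fewer than 15 per week: does not qualify
MIN_INSTALLATIONS = 3              # fewer than 3 installations: does not qualify
MIN_INSTALLATIONS_SOCKET_UTILITY = 5  # "5+ installations" for socket/utility


@dataclass
class SignatureStats:
    """Hypothetical per-signature summary for one channel and process type."""
    signature: str
    process: str           # e.g. "browser", "content", "gpu"
    channel: str           # "release", "beta" or "nightly"
    rank: int              # 1-based rank within that channel/process list
    crashes_per_week: int
    installations: int
    is_android: bool = False


def is_excluded(signature: str, is_android: bool = False) -> bool:
    """Return True for signatures on the exclusion list."""
    if signature.startswith(EXCLUDED_PREFIXES) or signature in EXCLUDED_EXACT:
        return True
    return is_android and signature.startswith(ANDROID_EXCLUDED_PREFIX)


def is_topcrash_candidate(stats: SignatureStats) -> bool:
    """Apply only the mechanical criteria; a human still reviews the result."""
    if is_excluded(stats.signature, stats.is_android):
        return False
    if stats.crashes_per_week < MIN_CRASHES_PER_WEEK:
        return False
    if stats.installations < MIN_INSTALLATIONS:
        return False
    if (stats.process in ("socket", "utility")
            and stats.installations < MIN_INSTALLATIONS_SOCKET_UTILITY):
        return False
    cutoff = RANK_CUTOFFS.get((stats.process, stats.channel))
    # Unknown process/channel combinations are left to manual judgment.
    return cutoff is not None and stats.rank <= cutoff


if __name__ == "__main__":
    example = SignatureStats(
        signature="EMPTY: no crashing thread identified; HeaderMismatch",
        process="browser", channel="release", rank=1,
        crashes_per_week=5000, installations=4000,
    )
    print(is_topcrash_candidate(example))
```

Because each channel is evaluated from its own ranked list, a check like this would be run once per channel rather than on combined Beta and Release data, which matches the requirement to consider channels separately.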