TestEngineering/Performance/Talos/RegressionBugsHandling

=Handling of Performance Regression Bugs=
To better understand and track Talos performance regressions, the performance team came up with the following system:
 
We use different policies for handling large and small regressions.
 
=Policy #1: Backouts within 48 hours for large regressions =
 
The perf team and the A-Team are testing out a new policy: backing out patches that cause significant Talos regressions on Windows builds within 48 hours.
 
Only regressions of 10% or more, on reliable Talos tests, on Windows, will face automatic backouts.
 
List of reliable Talos tests:
* Startup tests: ts_paint, sessionstore
* Scrolling smoothness tests: tp5o_scroll, tscrollx
* Other graphics performance tests: tresize, TART, tsvgx
* Page load test: tp5o
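
The eligibility rule above can be sketched as a small check. This is only an illustration: the function name, the constant names, and the assumption that platform strings begin with "windows" are hypothetical and do not come from Talos or Perfherder.

```python
# Hypothetical sketch of the Policy #1 eligibility rule; not a real Talos or Perfherder API.

# Tests listed on this page as reliable.
RELIABLE_TESTS = {
    "ts_paint", "sessionstore",   # startup
    "tp5o_scroll", "tscrollx",    # scrolling smoothness
    "tresize", "TART", "tsvgx",   # other graphics performance
    "tp5o",                       # page load
}

# Regressions of 10% or more qualify.
BACKOUT_THRESHOLD_PCT = 10.0


def qualifies_for_backout(test_name, platform, regression_pct):
    """Return True if a confirmed regression meets all three Policy #1 criteria.

    Assumes `platform` is a string that starts with "windows" for Windows
    builds (an illustrative convention, not taken from this page).
    """
    return (
        test_name in RELIABLE_TESTS
        and platform.lower().startswith("windows")
        and regression_pct >= BACKOUT_THRESHOLD_PCT
    )
```

For example, a 12.5% regression in tp5o on a Windows build would qualify, while the same regression on Linux, or a 9.9% regression on Windows, would fall under the policy for smaller regressions instead.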
 
===== Why =====
We want to test a 48-hour backout policy because we noticed that patch authors tend not to address Talos regression bugs for days. If a regression sits in the tree for days, it becomes difficult to back it out, and it becomes much more likely the regression will end up riding the trains to release by default.
 
This new policy is more aggressive. We think a patch that regresses performance significantly should be backed out quickly, and re-landed when its performance is acceptable.
 
===== How do the 48-hour backouts work? =====
 
The A-Team perf sheriffs will create a Talos regression bug as soon as a regression is confirmed using Talos re-triggers. The patch author and reviewer will be CC’ed, and if they don’t provide an explanation for why the regression is acceptable, the patch will be backed out. The goal is to back out unjustified regressions within 48 hours of them landing. We’d like to give the patch author about 24 hours to reply after the regression bug is filed.
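
The two time windows described above can be illustrated as follows. This is a rough sketch: the function and constant names are hypothetical, and the "about 24 hours" reply window is treated as exactly 24 hours for simplicity.

```python
# Hypothetical sketch of the Policy #1 timeline; names are illustrative only.
from datetime import datetime, timedelta

BACKOUT_WINDOW = timedelta(hours=48)  # goal: back out within 48 hours of landing
REPLY_WINDOW = timedelta(hours=24)    # "about 24 hours" to reply, taken as exact here


def backout_timeline(landed_at, bug_filed_at):
    """Return the reply deadline and the backout target for a regression.

    `landed_at` is when the patch landed; `bug_filed_at` is when the
    regression bug was filed after re-triggers confirmed the regression.
    """
    return {
        "reply_by": bug_filed_at + REPLY_WINDOW,
        "backout_target": landed_at + BACKOUT_WINDOW,
    }


timeline = backout_timeline(
    landed_at=datetime(2015, 8, 1, 10, 0),
    bug_filed_at=datetime(2015, 8, 2, 9, 0),
)
```

Note that the two windows are measured from different starting points: the reply window starts when the bug is filed, while the backout target is measured from the landing.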
 
The A-Team has been working hard on improving the tools for understanding Talos regressions (e.g. Perfherder + compare-talos), and we think debugging a Talos regression is a much less painful process these days. For example, there is now a highly usable view for comparing a proposed patch against a baseline revision at https://treeherder.mozilla.org/perf.html#/comparechooser.
 
=Policy #2: Policy for smaller regressions =


===== When a Talos performance regression is detected and the patch which caused it is identified =====