TestEngineering/Performance/Talos/RegressionBugsHandling: Difference between revisions

Revision as of 21:19, 13 August 2015

1,919 bytes added , 13 August 2015

→‎Handling of Performance Regression Bugs

Vladan

@@ Line 1: / Line 1: @@
 =Handling of Performance Regression Bugs=
-In order to better understand and track our Talos performance regressions, the performance team came up with the following system:
+We use different policies for handling large and small regressions.
+=Policy #1: Backouts within 48 hours for large regressions=
+The perf team and the A-Team are testing out a new policy: patches that cause significant Talos regressions on Windows builds will be backed out within 48 hours.
+Only regressions of 10% or more, on reliable Talos tests, on Windows, will face automatic backouts.
+List of reliable Talos tests:
+* Startup tests: ts_paint, sessionstore
+* Scrolling smoothness tests: tp5o_scroll, tscrollx
+* Other graphics performance tests: tresize, TART, tsvgx
+* Page load test: tp5o
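The eligibility rule above (a regression of 10% or more, on a reliable Talos test, on Windows) can be sketched as a small check. This is only an illustration, not the sheriffs' actual tooling: the function names, the platform-string convention (names starting with "win"), and the assumption that lower values are better by default are all hypothetical.

```python
# Hypothetical sketch of the Policy #1 eligibility rule.
# Test names and the 10% threshold come from the policy text; everything else
# (function names, platform naming, lower-is-better default) is assumed.

RELIABLE_TESTS = {
    "ts_paint", "sessionstore",    # startup tests
    "tp5o_scroll", "tscrollx",     # scrolling smoothness tests
    "tresize", "TART", "tsvgx",    # other graphics performance tests
    "tp5o",                        # page load test
}

BACKOUT_THRESHOLD_PERCENT = 10.0


def regression_percent(baseline, new, lower_is_better=True):
    """Return the size of the regression as a percentage of the baseline.

    Positive means the result got worse; negative means it improved.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    change = (new - baseline) * 100.0 / baseline
    return change if lower_is_better else -change


def qualifies_for_backout(test, platform, percent):
    """True only if all three policy conditions hold."""
    on_windows = platform.lower().startswith("win")  # assumed naming scheme
    return (
        test in RELIABLE_TESTS
        and on_windows
        and percent >= BACKOUT_THRESHOLD_PERCENT
    )
```

For example, a tp5o result that moves from 100.0 to 110.0 on a platform named "windows7-32" is a 10% regression and meets all three conditions, while the same change on "linux64" does not.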
+===== Why =====
+We want to test a 48-hour backout policy because we noticed that patch authors tend not to address Talos regression bugs for days. If a regression sits in the tree for days, it becomes difficult to back it out, and it becomes much more likely the regression will end up riding the trains to release by default.
+This new policy is more aggressive. We think a patch that regresses performance significantly should be backed out quickly, and re-landed when its performance is acceptable.
+===== How do the 48-hour backouts work? =====
+The A-Team perf sheriffs will create a Talos regression bug as soon as a regression is confirmed using Talos re-triggers. The patch author and reviewer will be CC’ed, and if they don’t provide an explanation for why the regression is acceptable, the patch will be backed out. The goal is to back out unjustified regressions within 48 hours of them landing. We’d like to give the patch author about 24 hours to reply after the regression bug is filed.
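The two time windows in this process are measured from different events: the 48-hour backout goal starts when the patch lands, while the roughly 24-hour reply window starts when the regression bug is filed. A minimal sketch of that arithmetic follows; the function name and return shape are hypothetical, and the 24-hour figure is treated as exact here even though the policy says "about 24 hours".

```python
from datetime import datetime, timedelta

# Windows taken from the policy text; treated as exact for illustration.
BACKOUT_WINDOW = timedelta(hours=48)  # measured from patch landing
REPLY_WINDOW = timedelta(hours=24)    # measured from bug filing


def backout_timeline(landed_at, bug_filed_at):
    """Return (reply_deadline, backout_target, reply_fits).

    reply_fits is False when the full reply window would run past the
    48-hour backout target, i.e. the bug was filed more than 24 hours
    after the patch landed.
    """
    if bug_filed_at < landed_at:
        raise ValueError("bug cannot be filed before the patch lands")
    reply_deadline = bug_filed_at + REPLY_WINDOW
    backout_target = landed_at + BACKOUT_WINDOW
    return reply_deadline, backout_target, reply_deadline <= backout_target


landed = datetime(2015, 8, 13, 9, 0)
filed = landed + timedelta(hours=12)
reply_deadline, backout_target, reply_fits = backout_timeline(landed, filed)
```

The sketch makes one constraint visible: for both windows to be honored, the regression has to be confirmed and filed within about 24 hours of the patch landing.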
+The A-Team has been working hard on improving the tools for understanding Talos regressions (e.g. Perfherder + compare-talos), and we think debugging a Talos regression is a much less painful process these days. For example, there is now a highly usable view for comparing a proposed patch against a baseline revision at https://treeherder.mozilla.org/perf.html#/comparechooser.
+=Policy #2: Smaller regressions=
 ===== When a Talos performance regression is detected and the patch which caused it is identified =====
