Auto-tools/Projects/Signal From Noise/StatusNovember2012: Difference between revisions

(→‎State of Statistics: November 2012: formatting for clarity)
Line 64: Line 64:
* New method for regression detection: https://wiki.mozilla.org/images/d/dd/Talos_Statistical_Analysis_Writeup.pdf : Working with Datazilla results for tp5 test pages, Metrics developed a regression detection algorithm.To compare the mean of each page to of the new push to the mean of each page to the current push, hypothesis tests are conducted http://en.wikipedia.org/wiki/Statistical_hypothesis_testing. Welch's t-test is used to determine whether a page has regressed for the given new push. Moving to page-centric testing led to multiple hypothesis testing problem, and to correct for the inflation of false positives, False Discovery Rate Procedure (FDR) is used: http://www.stat.cmu.edu/~genovese/talks/hannover1-04.pdf.  Due to the natural variation between consecutive pushes, exponential smoothing was implemented before performing FDR procedure. Code for this is available in https://github.com/mozilla/datazilla-metrics
* New method for regression detection: https://wiki.mozilla.org/images/d/dd/Talos_Statistical_Analysis_Writeup.pdf : Working with Datazilla results for tp5 test pages, Metrics developed a regression detection algorithm.To compare the mean of each page to of the new push to the mean of each page to the current push, hypothesis tests are conducted http://en.wikipedia.org/wiki/Statistical_hypothesis_testing. Welch's t-test is used to determine whether a page has regressed for the given new push. Moving to page-centric testing led to multiple hypothesis testing problem, and to correct for the inflation of false positives, False Discovery Rate Procedure (FDR) is used: http://www.stat.cmu.edu/~genovese/talks/hannover1-04.pdf.  Due to the natural variation between consecutive pushes, exponential smoothing was implemented before performing FDR procedure. Code for this is available in https://github.com/mozilla/datazilla-metrics


* Datazilla utilizes improved statistical methodologies. Datazilla uses the welch's ttest, the FDR stuff, and the exponential smoothing.
Datazilla utilizes these improved statistical methodologies. Datazilla uses the welch's ttest, the FDR procedure, and the exponential smoothing. A datazilla-metrics repository, https://github.com/mozilla/datazilla-metrics ,  has been created, which is a python package that implements statistical methods useful for Datazilla.
 
A datazilla-metrics repository, https://github.com/mozilla/datazilla-metrics ,  has been created, which is a python package that implements statistical methods useful for Datazilla.


= Performance Testing Roadmap: 2013 =
= Performance Testing Roadmap: 2013 =
947

edits