Auto-tools/Projects/Signal From Noise/Meetings/2012-05-17

= Previous Action Items =
* find what pages we can consider reliable
** define what "reliable" is
* triangulate
* [christina] - give jeads/ctalbert/jmaher a set of pages that are just too noisy and operating systems
* local TBPL (contact: edmorley)


= Metric Calculations =

== Datazilla ==

== Page Specific Views ==

== Compare Talos Functionality ==

= Round Table =
== ctalbert's official list of goals for SfN this quarter ==
* a production system slurping up data from talos deployed (staging env is fine since we know we won't get sec review done in time)
** Owner: Jeads
** Next steps: https://www.pivotaltracker.com/epic/show/107545

* ability on this production system to accept data from non-talos projects
** Owner: Jeads
** Next steps: https://www.pivotaltracker.com/epic/show/109709
** This goal might need some refinement. We will be able to accept data across multiple projects. Not sure if the UI will work out of the box for every project, and not sure if the webservice interface really meets all of the different project use cases. I'm currently working on the stone ridge data; we'll see how it goes.

* the UI we largely have now
** Owner: Jeads
** Next steps: https://www.pivotaltracker.com/epic/show/107671
** Putting UI work on hold until we finish goal number two completely. Would rather deploy 100% on one goal and get it done right than spread too thin. This task must be completed before release: https://www.pivotaltracker.com/story/show/28968601
 
* Metrics for determining page-centric "did my test pass or not" for at least one suite (dromaeo and ts_paint)
** Owner: Christina/Jeads
** Next steps: Iterate on the development data. We're going to go ahead with the ~26 "good" pages in tp5. It's going to be a while before we have representative data for dromaeo, and wading in on ts_paint could introduce a whole new set of variables that send us into the talos briar patch.
*** [jeads] Build SQL queries to pull out representative data for the ~26 "good" pages in tp5
*** [christina and jeads] Develop the input data set for analysis in R
*** [christina and jeads] Identify an outlier that is outside an allowable range
*** [christina] Develop an allowable range for a test result for each of the ~26 pages
*** [christina] Determine a score for a changeset across the entire test suite. This could be as simple as 0 or 1 for each page, added up across all pages and platforms; 24/26 would indicate that, for one platform, 24 pages were in the allowable range and 2 were not. We will not know the best way to do this until we wade into the data.
*** [everyone] Determine a way to convert the test suite score into a pass, fail, or undetermined result for a given changeset.
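The scoring scheme above (0/1 per page, summed, then mapped to a pass/fail/undetermined verdict) could be sketched as follows. The 10% undetermined band, the page names, and the ranges are made-up placeholders, not real tp5 data:

```python
# Sketch: score a changeset against per-page allowable ranges.
# Ranges, page names, and the undetermined band are hypothetical placeholders.

def score_suite(results, allowable):
    """Count pages whose result falls inside its allowable [low, high] range."""
    passed = sum(
        1 for page, value in results.items()
        if allowable[page][0] <= value <= allowable[page][1]
    )
    return passed, len(results)

def suite_verdict(passed, total, undetermined_band=0.1):
    """Convert a page score like 24/26 into PASS / FAIL / UNDETERMINED."""
    ratio = passed / total
    if ratio == 1.0:
        return "PASS"
    if ratio >= 1.0 - undetermined_band:
        return "UNDETERMINED"  # a few pages out of range: needs a human look
    return "FAIL"

allowable = {"page_a": (2.75, 3.21), "page_b": (88, 98)}
results = {"page_a": 2.9, "page_b": 99}
passed, total = score_suite(results, allowable)
print(passed, total, suite_verdict(passed, total))  # 1 2 FAIL
```

How pages that fit no model are handled (the "maybe" flag) would change where the undetermined band sits; the cutoff here is arbitrary.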
 
* a stage tbpl system mocking the Yes/No/Maybe interaction
** Owner: jmaher/jhammel
** Next steps: assert we can mock up tbpl to gather data after test completion; will Mozilla be OK with that?
*** tbpl can do this: POST {'changeset': 31415926535, 'testname': 'dromaeo', 'platform': 'win7'} http://10.8.73.29/views/api/test_results
*** and receive this: {'result': 'PASS', 'tests': {'passed': 98, 'failed': 0, 'maybe': 2}, 'details': [{'k0s.org': {'actual': 2.71, 'expected': [2.75, 3.21]}}, {'askjeads.com': {'actual': 99, 'expected': [88, 98]}}]}
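The mocked interaction could be prototyped offline along these lines. The endpoint URL and payload shapes come from the notes; `fake_post`, the canned response, and the helper names are hypothetical stand-ins for a real HTTP round trip:

```python
import json

# Sketch of the proposed tbpl <-> staging-server interaction. `fake_post`
# stands in for a real HTTP POST (e.g. via urllib.request) so the round
# trip can be exercised without the 10.8.73.29 staging host.

CANNED_RESPONSE = {
    "result": "PASS",
    "tests": {"passed": 98, "failed": 0, "maybe": 2},
    "details": [
        {"k0s.org": {"actual": 2.71, "expected": [2.75, 3.21]}},
        {"askjeads.com": {"actual": 99, "expected": [88, 98]}},
    ],
}

def fake_post(url, payload):
    """Stand-in for POSTing JSON to the staging endpoint."""
    assert url.endswith("/views/api/test_results")
    return json.dumps(CANNED_RESPONSE)

def query_result(changeset, testname, platform):
    """Build the request payload from the notes and parse the reply."""
    payload = {"changeset": changeset, "testname": testname, "platform": platform}
    raw = fake_post("http://10.8.73.29/views/api/test_results", payload)
    return json.loads(raw)

resp = query_result("31415926535", "dromaeo", "win7")
print(resp["result"], resp["tests"]["maybe"])  # PASS 2
```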
 
* We will have fixed the pages accessing the web in tp5 and will re-release the tp5 pageset with the new non-web-touching pages
** Owner: jmaher/jhammel/chmanchester
** Next steps: write a cleanup script, run webpages through a cleanup script, verify the pages are clean, create new .zip file, talk to releng about side by side staging
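A first cut at the cleanup-script step might look like the sketch below. The regex, the `localhost.invalid` placeholder, and the `scrub` helper are all assumptions for illustration; a real pass over the tp5 pageset would also need to handle CSS and script-generated requests:

```python
import re

# Rough sketch of the tp5 "cleanup script" idea: rewrite any absolute
# http(s) URL in a saved page to a dead local placeholder so the page
# cannot touch the network during a talos run. This only shows the basic
# substitution; it is not the actual cleanup script.

EXTERNAL_URL = re.compile(r"""(["'=(])\s*https?://[^"'\s)>]+""")

def scrub(html):
    """Replace external URLs with a dead local reference."""
    return EXTERNAL_URL.sub(r"\g<1>localhost.invalid/", html)

page = '<img src="http://ads.example.com/banner.png"> <a href="page2.html">ok</a>'
print(scrub(page))  # relative link untouched, external URL neutralised
```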
 
* Convert all page cycle extension based tests to row-major style.
** Owner: jmaher
** Next steps: run tdhtml, tsvg, tsvg_opacity, tsspider, tscroll, a11y in staging, create graph server definitions for new test types, start side by side staging.
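For illustration only: assuming "row-major" means each cycle visits every page once (rather than running all cycles of one page before moving to the next), the two orderings compare like this. That reading is an assumption, not something stated in the notes:

```python
# Sketch of the two page-cycling orders for the extension-based tests.
# "Row-major" is read here as cycle-outer; "column-major" as the older
# page-outer style. Page names and cycle counts are placeholders.

PAGES = ["layers1.html", "layers2.html"]
CYCLES = 3

def column_major(pages, cycles):
    # old style: page-outer, i.e. all cycles of a page, then the next page
    return [(p, c) for p in pages for c in range(cycles)]

def row_major(pages, cycles):
    # new style: cycle-outer, i.e. each cycle visits every page once
    return [(p, c) for c in range(cycles) for p in pages]

print(row_major(PAGES, CYCLES)[:3])
```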


= Action Items =

Latest revision as of 01:09, 17 May 2012