Auto-tools/Meetings/2015-01-26

= Notices, Highlights, Roundtable =
* Significant Contributions
* [jmaher] A-Team [https://etherpad.mozilla.org/ateam-community-day community day] - Tomorrow!
** hack a bit on our tasks
** expect to have folks help out with docs/etc.
** the bootcamp


= Newsgroup and Blog Posts =
* [ahal] http://ahal.ca/blog/2015/new-mercurial-workflow-part-2/


= Goal Updates =
* bugs: {{bug|906712}}, {{bug|1109183}}, {{bug|1107336}}
* '''progress since last update''':
** {{bug|906712}} - Largely busy with other tasks, but have incorporated feedback on the API and requested review on the patch.
** {{bug|1107336}} - WPTRunner patch created; did new try pushes after it landed and was backed out.


=== Support the conversion of targeted P1 mozmill tests to Marionette and get them running in CI [hskupin] ===
* stretch goal: support the conversion of Search tests
* '''progress since last update''':
** Still blocked by some bugs in Marionette, but the workarounds are working pretty well so far
** Currently working on chrome window handling code. WIP is up at https://github.com/mozilla/firefox-ui-tests/pull/50, and a reviewable patch should follow later today or early tomorrow.
** Next is the implementation of tab handling
** Barbara started investigating the default preferences for Marionette ({{bug|1123683}})
** Chris is working on getting the location bar UI module implemented
** Chris is preparing training materials for the training next week


=== Resolve P1 bugs blocking the release of Marionette 1.0 [AutomatedTester, ato, jgraham] ===
* '''progress since last update''':
** [ato] Bug 1107706 is nearly done, but spending much time and frustration on B2G.
** [jgraham] Marionette-independent code split out of proxy into its own library, but mostly working on other goals.
** [AutomatedTester] {{Bug|912715}} landed, fixing the inconsistency between findElement and findElements


===''Supporting Tasks''===
=== Ingest all Talos data with Treeherder and develop a UI that can be used to view current and historical data [wlach] ===
* details: We want to use Treeherder to store and display Talos performance data, because Datazilla is being deprecated and Graphserver doesn't support the kind of performance analysis we'd like to perform on the data.
* '''progress since last update''': Really starting to come together. Initial pull request filed (https://github.com/mozilla/treeherder-ui/pull/289), should be landing pending resolution of mdoglio's requests. Lots of polish / refinement required but we're getting there. Screenshot: http://people.mozilla.org/~wlachance/treeherder-graphs-2014-01-21.png


== Treeherder ==
* details: Tier 1 and Tier 2 jobs will have different sheriffing guidelines and different expectations.  Accordingly, we need to display them and allow users to interact with them differently.
* bug: {{bug|1113322}}
* '''progress since last update''': {{bug|1097090}} is now under review. Not much progress because I was working on MozReview last week.


=== Develop a prototype structured log viewer [camd] ===
* bug: {{bug|1113873}}
* '''progress since last update''':
** Didn't get a chance to work on this in this period.  No new progress.


=== Develop a minimal UI that sheriffs can use to file new intermittents [edmorley] ===
* details: develop a usable but minimal UI that sheriffs can use to quickly file new intermittent issues; the UI should automatically fill out some of the details that sheriffs normally have to find manually.  Future iterations can improve on this by auto-filling more fields.
* bug: {{bug|1117583}}
* '''progress since last update''': None - worked on supporting tasks below (and to a much lesser extent, my other personal deliverable). Infra/reliability & other issues still remain that are higher priority than new features.


===''Supporting Tasks''===
* Continue to improve the performance and operational aspects of the system
** Dealt with tree-closing issues due to DB locks
** Fixing the fact that we never used the DB read host (awaiting testing/deployment)
** During the tree closure debugging also discovered jobs could get stuck during ingestion in the 'loading' state, bug filed
** Fixed log parsing exceptions seen on production
** Limited max number of revisions ingested to avoid timeouts with merge day pushes
** Reduced data lifecycle from 5 to 4 months to alleviate shortage of DB disk space & discovered performance artefacts were not being expired (PR ready to land)
* Identify and resolve remaining issues blocking TBPL EOL - treeherder parts: {{bug|1059400}}, other parts: {{bug|1054977}}
** Survey emailed out: [https://groups.google.com/forum/#!topic/mozilla.dev.tree-management/CtkSFWbtAd0 newsgroup post]
** Several developer-reported issues fixed (correct timestamp in TBPLbot comments, missing TinderboxPrints, Persona login for emails longer than N characters, reduced target_blank usage, missing bug suggestions, talos panel for e10s talos, broken "push not processed" refresh, filtering by author on Try)


===''Backlog''===
* bug: {{bug|1051056}}
* '''progress since last update''':
** Made some progress last week after the upstream 5.0rc1 release went out but got called back to do another spin due to a missed error. Hopefully finishing the respin today/tomorrow and then back full time on the goal. Should hopefully have a working prototype by end of week.


=== GitHub authentication [dylan] ===
* bug: {{bug|1118365}}
* '''progress since last update''':
** I've been able to pass auth tokens between GitHub and my dev server.
** There are some questions about the mechanics of how this authentication ties into Bugzilla accounts that need to be discussed at the next BMO team meeting (Tuesday).


== DevTools Harness ==
* details: Take the prototype that was developed in 2014 Q4 (https://github.com/luser/luciddream) and get it running in continuous integration and visible in Treeherder. It's TBD whether this will be run in buildbot or TaskCluster, but we should get it running somewhere per-commit this quarter against linux desktop Firefox and a B2G emulator.
* '''progress since last update''':
** Met with James Lal to talk about TaskCluster, looks viable for running this harness in CI. Did some work on [https://gist.github.com/luser/c1e948eb16d43e6863bf a mozharness script] to run the Luciddream harness.
** This goal is going to get postponed for me to work on another project.


== CloudServices Automation ==
* details: Define and document all aspects of Tier 2 jobs: rationale, criteria, necessary enhancements to Treeherder and automation frameworks.
* bug: {{bug|1121655}}
* '''progress since last update''': None. I've completed my 'must-do' items for Autophone and will focus on this goal this week.


=== Prototype a retrigger-based bisection tool [armenzg, jmaher] ===
* details: Create a prototype of a command-line tool that can be used by sheriffs and others to automate retrigger-based bisection.  This could be used to help bisect new intermittent oranges, and to backfill jobs that have been skipped due to coalescing.  Integration with Treeherder or other service will be done later.
* Project repo: https://github.com/armenzg/mozilla_ci_tools
* '''progress since last update''':
** I've done the first release of the library that allows us to trigger jobs in automation.
** We will have a second release this week in preparation for the bisection tool
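
As a rough illustration of the retrigger-based bisection idea (this is not the mozilla_ci_tools API), the sketch below assumes an ordered list of revisions whose last entry is known to be bad, and a hypothetical <code>run_job</code> callable that triggers one job on a revision and reports whether it passed. The names <code>find_first_bad</code> and <code>retrigger_check</code> are made up for this example. For intermittent oranges, each revision is retriggered several times and counts as bad if any run fails.

```python
from typing import Callable, Sequence


def find_first_bad(revisions: Sequence[str],
                   is_bad: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_bad() is true.

    Assumes revisions are ordered oldest to newest, that the last one is
    bad, and that once a revision is bad every later one is bad too.
    """
    if not revisions:
        raise ValueError("need at least one revision")
    lo, hi = 0, len(revisions) - 1  # invariant: revisions[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid        # first bad revision is at mid or earlier
        else:
            lo = mid + 1    # first bad revision is after mid
    return revisions[lo]


def retrigger_check(run_job: Callable[[str], bool],
                    retriggers: int) -> Callable[[str], bool]:
    """Build an is_bad() check that runs a job up to `retriggers` times.

    run_job(revision) is a hypothetical hook returning True when the job
    passes; a revision is treated as bad if any of the runs fails.
    """
    def check(revision: str) -> bool:
        return any(not run_job(revision) for _ in range(retriggers))
    return check
```

A binary search like this needs roughly log2(N) checks for N revisions, so the number of retriggers per revision is the main cost knob when chasing an intermittent failure.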


===  Store high-resolution testcase data ("ActiveData") [ekyle, ahal] ===
* details: Create a Proof of Concept “big data” project which will store information about every test file we run:  test status, error details, test machine and test duration to begin with.  We will use this project to develop schemas and queries that work with data this large, and we will use this data to normalize chunk sizes and provide details about which tests never fail.
* '''progress since last update''':
** [https://github.com/klahnakoski/TestLog-ETL/tree/518fb3766f5de4ab6a4fecef28cdbb55066666e6 First version of the ETL] is working well enough.  It has some long-term stability issues, and will inevitably need improvements once we start charting the results.
** ahal's [https://github.com/mozilla/structured-catalog structured catalog] worker management code will help with the stability issues found in my ETL code
** Single-node 'cluster' is being filled with unittest data. The current concern is price (the [http://aws.amazon.com/ec2/pricing/ pricing chart] says $0.40/hour, or almost $300/month), so I'm looking into using spot instances for the other nodes.  The single node is weak: it will OOM under full load, and if it does not crash it slows to a crawl. My current guess is that the EBS volumes are too slow and are acting as a bottleneck.
* '''Next Steps'''
** Hopefully [http://www.fabfile.org/ Fabric] along with [https://github.com/boto/boto boto] will make setting up EC2 instances for both the ES cluster and the ETL daemons easy
** Integrate the ETL code into structured catalog
** Add a front end to the ES cluster, like [https://github.com/klahnakoski/esFrontLine esFrontLine], that protects the cluster from updates, provides logging, and (hopefully) provides a simpler query interface.


===  Implement the ability to normalize chunk durations in mochitest [ahal] ===
* details: For mochitest variants on desktop and B2G, modify manifestparser and the test harnesses to be able to specify which tests are run in specific chunks.
* stretch goal: Implement the same feature for Android mochitest, which still uses old-style JSON manifests.
* bug: {{bug|1124182}}
* '''progress since last update''':
** refactored manifestparser into different files (landed)
** implemented filtering system for manifestparser (up for review)
** started porting the chunking algorithms to Python (in progress)
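
To make "normalizing chunk durations" concrete, here is a minimal sketch of one possible approach. It is not the manifestparser implementation or the algorithm being ported. It assumes a mapping of test names to previously recorded runtimes in seconds (hypothetical data) and greedily assigns the longest tests first to whichever chunk currently has the smallest total; the function name <code>chunk_by_runtime</code> is invented for this example.

```python
import heapq
from typing import Dict, List


def chunk_by_runtime(runtimes: Dict[str, float],
                     total_chunks: int) -> List[List[str]]:
    """Split tests into chunks with roughly equal total runtime.

    Greedy heuristic: take tests from longest to shortest and always add
    the next test to the chunk with the smallest accumulated runtime.
    """
    if total_chunks < 1:
        raise ValueError("total_chunks must be at least 1")
    chunks: List[List[str]] = [[] for _ in range(total_chunks)]
    # Min-heap of (accumulated runtime, chunk index).
    heap = [(0.0, index) for index in range(total_chunks)]
    heapq.heapify(heap)
    # Sort by descending runtime; break ties by name for determinism.
    ordered = sorted(runtimes.items(), key=lambda item: (-item[1], item[0]))
    for test, seconds in ordered:
        total, index = heapq.heappop(heap)
        chunks[index].append(test)
        heapq.heappush(heap, (total + seconds, index))
    return chunks
```

The greedy heuristic does not guarantee an optimal split, but it is simple and keeps chunk totals close when there are many tests relative to chunks. A real harness would likely need to group tests by manifest or directory rather than by individual file.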


=== Create Android 4.4 emulator image for automated tests [gbrown] ===
** "greening" of tests
* '''progress since last update''':
** {{bug|1123443}} Allow android emulator tests to use adb devicemanager (reviewed)
** {{bug|1124913}} Allow android emulator tests to download emulator (reviewed)
** working on setting up a try push with adb + new emulator + new avds


=== Help Releng reduce test load [jmaher] ===
== Community ==


=== Increase 'contributor friendliness' of our projects [jmaher, all] ===
* details: Ensure that all ongoing projects have a friendliness rating of at least 6, as shown on https://wiki.mozilla.org/Auto-tools/Projects/Everything
* '''progress since last update''':


= Other Project Updates =
== charts.mozilla.org ==
'''Update'''
* Attempted a few versions of a dashboard to visualize/manage releases for FxOS devices (for the engineering project management and release management teams).
'''Next Step'''
* use a variation on the Platform dashboard
== Alerts ==
'''Update'''
* Updated dzAlerts to digest the options_collection_hash (pgo/debug) and e10s.  Currently running on a public dev server at home.
'''Next Step'''
* Put Talos data in the ActiveData ES cluster: dzAlerts has needed a home for its cluster for a while.


= Holidays and Trips =
* [mcote] on PTO starting Jan 28. Back Feb 8.


= Misc =