Auto-tools/Projects/OrangeFactor/2010-11-03: Difference between revisions

From MozillaWiki
m (Edmorley moved page Auto-tools/Projects/WarOnOrange/2010-11-03 to Auto-tools/Projects/OrangeFactor/2010-11-03: Renaming to the actual name of the webapp)
 
Latest revision as of 13:33, 23 October 2014

War on Orange Status Meeting, Nov 3, 2010

Anurag and Daniel joined us today, and we discussed the eventual setup w.r.t. logs and the database.

Log processing

A setup for handling the logs at different stages of processing was proposed as follows:

1 - A script saves new buildbot logs to a directory, say 'incoming_sendjson'.

2 - A Flume pipeline uses one of the exec sources (http://archive.cloudera.com/cdh/3/flume/UserGuide.html#_flume_source_catalog) to execute a script that invokes the log parser on batches of buildbot logs sitting in the 'incoming_sendjson' directory. For each of those files, the script will emit single-line JSON objects to stdout. The Flume agent that invokes this script will store each line of stdout in the Hive table for JSON results. The script should move each successfully processed buildbot log file from 'incoming_sendjson' to 'incoming_sendlog'.

3 - Another Flume pipeline will exec a separate script that processes batches of buildbot logs sitting in the 'incoming_sendlog' directory. This script should cat each log file, prepending tab-separated metadata to every line in the following format (' \t ' represents a tab in the output):

 repo \t platform \t debug-or-opt \t builddate \t testsuite \t line-num \t single-log-line

Example buildbot log input

 mozilla-central-macosx64-debug/1288731175/mozilla-central_snowleopard-debug_test-crashtest-build94.txt.gz: line 1
 mozilla-central-macosx64-debug/1288731175/mozilla-central_snowleopard-debug_test-crashtest-build94.txt.gz: line 2
 mozilla-central-macosx64-debug/1288731175/mozilla-central_snowleopard-debug_test-crashtest-build94.txt.gz: line 3
 mozilla-central-macosx64-release/1288731175/mozilla-central_snowleopard-debug_test-crashtest-build94.txt.gz: line 1
 mozilla-central-macosx64-release/1288731175/mozilla-central_snowleopard-debug_test-crashtest-build94.txt.gz: line 2
 mozilla-central-macosx64-release/1288731175/mozilla-central_snowleopard-debug_test-crashtest-build94.txt.gz: line 3

Example sendlog script stdout output

 mozilla-central \t macosx64 \t debug \t 1288731175 \t crashtest \t 1 \t line 1
 mozilla-central \t macosx64 \t debug \t 1288731175 \t crashtest \t 2 \t line 2
 mozilla-central \t macosx64 \t debug \t 1288731175 \t crashtest \t 3 \t line 3
 mozilla-central \t macosx64 \t release \t 1288731175 \t crashtest \t 1 \t line 1
 mozilla-central \t macosx64 \t release \t 1288731175 \t crashtest \t 2 \t line 2
 mozilla-central \t macosx64 \t release \t 1288731175 \t crashtest \t 3 \t line 3

After catting, each log is moved to a 'processed' directory.
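
As a rough illustration of step 3, the sendlog script could look something like the sketch below. Nothing here was agreed in the meeting: the regular expression, the function names (parse_log_path, prefix_lines, send_logs) and the assumption that logs keep their relative path under 'incoming_sendlog' are all hypothetical, chosen only so that the example input above produces the example output.

```python
import gzip
import os
import re
import shutil
import sys

# Hypothetical pattern for relative paths such as
# mozilla-central-macosx64-debug/1288731175/mozilla-central_snowleopard-debug_test-crashtest-build94.txt.gz
# It assumes platform and debug-or-opt never contain a hyphen.
PATH_RE = re.compile(
    r"^(?P<repo>[^/]+)-(?P<platform>[^-/]+)-(?P<buildtype>[^-/]+)"
    r"/(?P<builddate>\d+)"
    r"/[^/]*_test-(?P<testsuite>.+)-build\d+\.txt\.gz$"
)


def parse_log_path(path):
    """Split a relative log path into the five metadata fields."""
    m = PATH_RE.match(path)
    if m is None:
        raise ValueError("unrecognised log path: %r" % path)
    return (m["repo"], m["platform"], m["buildtype"],
            m["builddate"], m["testsuite"])


def prefix_lines(path, lines):
    """Yield one tab-separated row per log line, numbered from 1."""
    meta = parse_log_path(path)
    for line_num, line in enumerate(lines, start=1):
        yield "\t".join(meta + (str(line_num), line.rstrip("\n")))


def send_logs(incoming_dir, processed_dir, out=sys.stdout):
    """Cat every log under incoming_dir to out, then move it to processed_dir."""
    for root, _dirs, files in os.walk(incoming_dir):
        for name in sorted(files):
            full = os.path.join(root, name)
            rel = os.path.relpath(full, incoming_dir)
            # Logs are gzipped; undecodable bytes are replaced, not fatal.
            with gzip.open(full, "rt", errors="replace") as fh:
                for row in prefix_lines(rel.replace(os.sep, "/"), fh):
                    out.write(row + "\n")
            dest = os.path.join(processed_dir, rel)
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.move(full, dest)
```

A Flume exec source would then run something like send_logs('incoming_sendlog', 'processed') and read the rows from stdout. Moving the file only after all of its lines have been written means a failed run leaves the log in place for a retry, at the cost of possibly emitting some lines twice.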

Each of these properties will wind up being columns in the logdata database, so we should be able to associate these logs with the parsed log data. If the parsed log data includes indexes into these logs, then we'll be able to retrieve a portion of the log that represents the execution of any individual test.
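
To make the idea of indexes into the logs concrete, here is a minimal sketch in plain Python rather than a Hive query. It assumes, hypothetically, that the parsed data records a start and end line number for each test, and that rows look like the sendlog output above; the function name log_slice and its parameters are invented for illustration.

```python
def log_slice(rows, key, start_line, end_line):
    """Return the log lines numbered start_line..end_line (inclusive)
    for the log identified by key.

    rows: iterable of tab-separated strings in the sendlog format.
    key:  (repo, platform, debug-or-opt, builddate, testsuite) tuple.
    """
    selected = []
    for row in rows:
        # Split on the first six tabs only, so tabs inside the log
        # line itself are preserved in the last field.
        fields = row.rstrip("\n").split("\t", 6)
        if len(fields) != 7:
            continue  # skip malformed rows
        *meta, line_num, text = fields
        if tuple(meta) == key and start_line <= int(line_num) <= end_line:
            selected.append((int(line_num), text))
    # Rows may arrive out of order, so sort by line number.
    return [text for _, text in sorted(selected)]
```

For example, with the key ('mozilla-central', 'macosx64', 'debug', '1288731175', 'crashtest') and the range 2 to 3, this would return the second and third lines of the debug crashtest log.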

Orange Factor

Jeff has added a 'Testing: Orange Factor' component in Bugzilla, and plans on filing a lot of bugs with ideas for improving the site.

Joel has moved the website to brasstacks, and it has increased in speed mightily! He's changed the main display to show a daily orange factor calculation, rather than a cumulative average. Jeff has some ideas of how to apply some smoothing algorithms to this so we can reduce the spikiness of the graph. Mark plans to update the page titles to be distinct.
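
The notes do not say which smoothing algorithms Jeff has in mind. Purely as an illustration of how the spikiness could be reduced, the sketch below applies a trailing moving average to a list of daily values; the function name, the 7-day default window and the choice of a moving average are all assumptions.

```python
def moving_average(values, window=7):
    """Smooth a daily series with a trailing moving average.

    Each output value is the mean of the current day and up to
    window - 1 preceding days; early days use a shorter window.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed
```

A larger window gives a flatter graph but reacts more slowly to a real change in the failure rate, so the window size would need tuning against actual data.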

We discussed the fact that the new db will give us data on test failures, while the current website shows data on bugs. Once we switch to the new db, how will we associate failures with bugs?

We'll investigate adding a little script to the pushlog webpage which sends our db an XHR each time a failure is starred. Another option is to use the pushlog algorithm which guesses which bug a failure is related to.