ReleaseEngineering/Applications/Bumper

Revision as of 04:30, 29 January 2016 by Hwine (more debugging notes)

Bumper

B2G Manifests

B2G manifests define all the git projects required to build B2G for a given device. An example manifest is https://github.com/mozilla-b2g/b2g-manifest/blob/master/dolphin.xml.

B2G Bumper

B2G bumper takes the original manifests from https://github.com/mozilla-b2g/b2g-manifest/ and processes them for our build system. In general, this means changing references to remote git repositories (such as those on GitHub) into references to our mirrors on git.mozilla.org. It also inlines include files and pins every project to a specific commit id.
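
As a rough illustration of the processing described above, the following sketch rewrites remote URLs and pins project revisions in a repo-style manifest. It is not bumper's actual code: the mirror mapping, the commit id, and the function names are all hypothetical, and include inlining is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from upstream fetch URLs to mirror URLs (illustrative only).
MIRROR_MAP = {
    "https://github.com/mozilla-b2g/": "https://git.mozilla.org/b2g",
}

# Hypothetical commit ids keyed by project name (a made-up placeholder value).
COMMIT_IDS = {
    "gaia": "0123456789abcdef0123456789abcdef01234567",
}


def process_manifest(manifest_xml, mirror_map, commit_ids):
    """Point known remotes at their mirrors and pin known projects to commit ids."""
    root = ET.fromstring(manifest_xml)
    for remote in root.findall("remote"):
        fetch = remote.get("fetch")
        if fetch in mirror_map:
            remote.set("fetch", mirror_map[fetch])
    for project in root.findall("project"):
        name = project.get("name")
        if name in commit_ids:
            project.set("revision", commit_ids[name])
    return ET.tostring(root, encoding="unicode")


SAMPLE = """<manifest>
  <remote name="b2g" fetch="https://github.com/mozilla-b2g/"/>
  <project name="gaia" path="gaia" remote="b2g"/>
</manifest>"""

processed = process_manifest(SAMPLE, MIRROR_MAP, COMMIT_IDS)
print(processed)
```

Remotes or projects that are missing from the mappings are left untouched here; the real bumper treats an unknown remote as an error, as described under "Missing Configuration for Manifest Remote".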

The bumper runs periodically (as a cron job on the host) and keeps all the in-tree manifests up to date. When one of the upstream repositories changes, the change is reflected in a commit to gecko, e.g. https://hg.mozilla.org/integration/b2g-inbound/rev/76635be3bc3f (here the third-party repository 'apitrace' changed; b2g_bumper detected the change and updated the commit ids in the in-tree manifests).

Where are bumper's code & configs?

Where and when does bumper run?

b2g_bumper runs on buildbot-master66.bb.releng.usw2.mozilla.com (see /etc/cron.d/run_b2g_bumper)

How is bumper deployed?

Deployment is semi-automated:

  • Code should first be landed on the default branch, and pushed.
  • Monitor #releng for a success message from travis-ci.
    • If you don't see a success or failure message within 10 minutes, something is very wrong (e.g. a syntax error).
      • Be sure each of the application's scripts and config files can be imported successfully:
     cd scripts
     PYTHONPATH=.. python -c 'import b2g_bumper'
     cd ../configs/b2g_bumper
     for f in *.py; do python "$f"; done
  • After the success message is seen for default branch, merge to production and push:
     hg checkout production
     hg merge default
     hg commit -m 'merged default->production; tests passing'
     hg push
  • The new version will be used the next time b2g_bumper executes. That may happen before travis-ci reports on the production branch.
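
The config check in the steps above can also be sketched in Python, which collects every failure instead of stopping at the first one. This is only an illustration: check_configs is a hypothetical helper, not part of bumper, and the demo runs against throwaway files rather than the real configs/b2g_bumper directory.

```python
import runpy
import tempfile
from pathlib import Path


def check_configs(config_dir):
    """Execute every *.py file in config_dir; return {filename: error} for failures."""
    failures = {}
    for path in sorted(Path(config_dir).glob("*.py")):
        try:
            runpy.run_path(str(path))
        except Exception as exc:  # SyntaxError is also an Exception subclass
            failures[path.name] = f"{type(exc).__name__}: {exc}"
    return failures


# Demo on throwaway files: one valid config and one with a syntax error.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "good.py").write_text("config = {'a': 1}\n")
    Path(tmp, "bad.py").write_text("config = {\n")
    result = check_configs(tmp)

for name, error in result.items():
    print(name, "->", error)
```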

Troubleshooting

Logs for bumper runs are located in /builds/b2g_bumper/${version}/logs, and should be the first resource to check for troubleshooting.

Nagios alert re "stale stamp"

If the only hint you have is a Nagios alert in #buildduty about a stale stamp, check /builds/b2g_bumper/b2g_bumper.log first. Look for a line similar to:

 2016-01-11 12:20:02 pid-9193 Failures! Not updating /builds/b2g_bumper/b2g_bumper.stamp

Directly above that, look for lines with "Failed on", similar to:

 2016-01-11 12:19:00 pid-9193 Failed on master: check /builds/b2g_bumper/master.log (exit code 0)

In the indicated log file, look for line(s) containing 'fatal', similar to:

 11:55:39  WARNING - https://git.mozilla.org/b2g/device-sony-seagull:refs/heads/sony-aosp-l - got output: fatal: repository 'https://git.mozilla.org/b2g/device-sony-seagull/' not found
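
The triage steps above can be sketched as a small script. The regular expression is inferred only from the sample log lines quoted on this page, so treat the exact log format as an assumption; the helper names are hypothetical.

```python
import re

# Pattern inferred from the sample "Failed on" line above (an assumption).
FAILED_ON = re.compile(r"Failed on (?P<branch>\S+): check (?P<log>\S+)")


def failed_branches(log_text):
    """Return (branch, log_path) pairs for every 'Failed on' line."""
    return [(m.group("branch"), m.group("log")) for m in FAILED_ON.finditer(log_text)]


def fatal_lines(log_text):
    """Return the lines of a per-branch log that contain 'fatal'."""
    return [line for line in log_text.splitlines() if "fatal" in line]


# Sample text copied from the log excerpts quoted above.
SAMPLE_MAIN_LOG = """\
2016-01-11 12:19:00 pid-9193 Failed on master: check /builds/b2g_bumper/master.log (exit code 0)
2016-01-11 12:20:02 pid-9193 Failures! Not updating /builds/b2g_bumper/b2g_bumper.stamp
"""
SAMPLE_BRANCH_LOG = """\
11:55:39  WARNING - https://git.mozilla.org/b2g/device-sony-seagull:refs/heads/sony-aosp-l - got output: fatal: repository 'https://git.mozilla.org/b2g/device-sony-seagull/' not found
"""

failures = failed_branches(SAMPLE_MAIN_LOG)
fatals = fatal_lines(SAMPLE_BRANCH_LOG)
print(failures)
print(fatals)
```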

Stopping b2g_bumper runs

You can't just edit the cron job, or tweak the run_b2g_bumper.sh script -- both will be updated by puppet. A way that works is:

 cd /builds/b2g_bumper
 touch YourName.lock
 while ! ln YourName.lock b2g_bumper.lock ; do sleep 30; done
 # you now hold the lock and can do what you need to do

To restart, simply rm YourName.lock, and the next cron cycle will fire.
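
The shell loop above relies on hard-link creation being atomic: ln fails if b2g_bumper.lock already exists, so only one holder can create it. A minimal Python sketch of the same technique follows. It assumes the cron script takes b2g_bumper.lock in a compatible way, which this page does not spell out; acquire_lock is a hypothetical helper, and the demo uses a temporary directory instead of /builds/b2g_bumper.

```python
import os
import tempfile
import time


def acquire_lock(own_lock, shared_lock, retries=1, delay=30):
    """Hard-link own_lock to shared_lock; return True once the link is created."""
    open(own_lock, "a").close()  # same effect as `touch`
    for attempt in range(retries):
        try:
            os.link(own_lock, shared_lock)  # fails if shared_lock already exists
            return True
        except FileExistsError:
            if attempt + 1 < retries:
                time.sleep(delay)
    return False


# Demo in a throwaway directory: the second contender cannot take the lock.
with tempfile.TemporaryDirectory() as tmp:
    mine = os.path.join(tmp, "YourName.lock")
    other = os.path.join(tmp, "Other.lock")
    shared = os.path.join(tmp, "b2g_bumper.lock")
    first = acquire_lock(mine, shared)
    second = acquire_lock(other, shared)

print(first, second)
```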

Missing Configuration for Manifest Remote

If there has been a recent commit to b2g-manifest, check to see if a new (or misspelled) remote exists. All remotes in the manifest need to be described in the branch's bumper config file. (E.g. this config change for this manifest change.)
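
One way to spot such a remote is to compare the remote names declared in the manifest with the names the config knows about. The sketch below assumes the standard repo manifest layout (remote elements with a name attribute); the layout of bumper's config file is not shown on this page, so the configured names are passed in as a plain set, and all sample names and URLs are made up.

```python
import xml.etree.ElementTree as ET


def missing_remotes(manifest_xml, configured_remotes):
    """Return manifest remote names that are absent from configured_remotes."""
    root = ET.fromstring(manifest_xml)
    declared = {remote.get("name") for remote in root.findall("remote")}
    return sorted(declared - set(configured_remotes))


# Made-up manifest: "b2gg" stands in for a misspelled remote.
SAMPLE_MANIFEST = """<manifest>
  <remote name="aosp" fetch="https://example.org/aosp/"/>
  <remote name="b2gg" fetch="https://example.org/b2g/"/>
</manifest>"""

# Hypothetical set of remote names taken from the branch's bumper config.
CONFIGURED = {"aosp", "b2g"}

unknown = missing_remotes(SAMPLE_MANIFEST, CONFIGURED)
print(unknown)
```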

Stalled Fetches from git.mozilla.org

NOTE: please log all occurrences in bug bumper_hang_from_git so we can track and fix.

A (relatively) common issue is hung processes running git commands. In this case, a "kill -hup" of the hanging process is enough to resolve the issue, and the next scheduled bumper runs execute successfully (e.g. bug 1040062). To see if this condition exists:

 ps flwww -s $(pgrep bumper)

If it does, note the PID of the leaf process and run:

 kill -hup $PID_OF_GIT_PROCESS

Assuming that works, update bug bumper_hang_from_git with copy/paste of all output from above.
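
Finding the leaf process can also be scripted. The sketch below walks a process table in the column layout printed by `ps -eo pid,ppid,args` and returns the processes below a given root that have no children. The sample table is invented for illustration, and leaf_pids is a hypothetical helper, not part of bumper.

```python
def leaf_pids(ps_output, root_pid):
    """Return PIDs under root_pid (inclusive) that have no child processes."""
    children = {}
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        pid, ppid = (int(field) for field in line.split()[:2])
        children.setdefault(ppid, []).append(pid)
    leaves, stack = [], [root_pid]
    while stack:
        pid = stack.pop()
        kids = children.get(pid, [])
        if kids:
            stack.extend(kids)
        else:
            leaves.append(pid)
    return sorted(leaves)


# Invented sample in `ps -eo pid,ppid,args` layout; PIDs and commands are made up.
SAMPLE_PS = """\
  PID  PPID COMMAND
 9193     1 /bin/bash run_b2g_bumper.sh
 9200  9193 python b2g_bumper.py
 9250  9200 git ls-remote https://git.mozilla.org/b2g/example
 4000     1 sshd
"""

hung = leaf_pids(SAMPLE_PS, 9193)
print(hung)
```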

How to play catchup

Sometimes the state of a gecko branch managed by bumper gets wonky (bad commit, changing branches, bustage...). In those cases, update the b2g/config/gaia.json file on the gecko branch (value of gecko_push_url in the config file) to the correct value from the HG version of the git repository (value of gaia_repo_url in the config file).