ReleaseEngineering/How To/Restart Buildbot Masters
We occasionally need to restart buildbot masters for various reasons:
- upgrades to the underlying OS
- gradual increase in memory usage over time, leading to reduced master performance
Manually
If you need to restart a single master by hand, here's the sequence you should follow:
- disable the master in slavealloc. This prevents the master from taking more slave connections while you're waiting for it to shut down.
- click the "Clean Shutdown" button on the web interface for the given master, e.g. http://buildbot-master82.bb.releng.scl3.mozilla.com:8001/
- wait for the jobs currently running on that master to complete. You can track progress by searching in-page for "Running" on the master's buildslaves page, e.g. http://buildbot-master82.bb.releng.scl3.mozilla.com:8001/buildslaves?no_builders=1
- once the master has shut down, perform whatever upgrades are required, etc.
- restart the master. NOTE: buildbot masters are configured to start buildbot automatically on boot, so if you reboot the master, buildbot will restart itself. To restart manually:
    xebec:buildduty ccooper$ ssh cltbld@buildbot-master82
    Unauthorized access prohibited
    [cltbld@buildbot-master82.bb.releng.scl3.mozilla.com ~]$ cd /builds/buildbot/build1/
    [cltbld@buildbot-master82.bb.releng.scl3.mozilla.com build1]$ source bin/activate
    (build1)[cltbld@buildbot-master82.bb.releng.scl3.mozilla.com build1]$ make start
- re-enable the master in slavealloc.
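The waiting step in the sequence above can be partly scripted. What follows is only a minimal sketch, assuming the buildslaves page contains the literal text "Running" once per busy slave (the same string the in-page search relies on); the page's actual HTML is not confirmed here, and `count_running` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count occurrences of the string "Running" in page
# HTML read from stdin. Assumes each running job shows up as one "Running".
count_running() {
    # grep -o prints every match on its own line; wc -l counts those lines.
    # tr strips the padding some wc implementations add.
    grep -o 'Running' | wc -l | tr -d ' '
}

# Example usage against the example master from the steps above:
# curl -s 'http://buildbot-master82.bb.releng.scl3.mozilla.com:8001/buildslaves?no_builders=1' | count_running
```

Wrapping the commented-out `curl` call in a loop with a `sleep` between iterations would give a crude progress monitor; a count of 0 suggests the master has no running jobs left, but it is worth confirming on the web interface before proceeding.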
By script
The above actions have been encapsulated into a script: https://hg.mozilla.org/build/tools/file/default/buildfarm/maintenance/restart_masters.py
The script requires a bash-format config file like the one used by the end_to_end_reconfig.sh script. At the very least the config file must define values for LDAP_USERNAME, LDAP_PASSWORD, and CLTBLD_PASSWORD.
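As a sanity check before running the script, the config file can be tested for the three required variables. This is a sketch under stated assumptions, not part of restart_masters.py: `check_restart_config` is a hypothetical helper, and the only facts taken from this page are the variable names and the bash format of the file.

```shell
#!/bin/sh
# Hypothetical helper: verify that a bash-format config file defines
# LDAP_USERNAME, LDAP_PASSWORD, and CLTBLD_PASSWORD with non-empty values.
# A conforming file would contain lines such as (placeholder values):
#   LDAP_USERNAME="someone@example.com"
#   LDAP_PASSWORD="..."
#   CLTBLD_PASSWORD="..."
check_restart_config() {
    config_file=$1
    [ -r "$config_file" ] || { echo "cannot read: $config_file" >&2; return 2; }
    (
        # Run in a subshell so the secrets never leak into the caller's shell.
        # Clear inherited values first so only the file's definitions count.
        unset LDAP_USERNAME LDAP_PASSWORD CLTBLD_PASSWORD
        . "$config_file" || exit 2
        for name in LDAP_USERNAME LDAP_PASSWORD CLTBLD_PASSWORD; do
            eval "value=\${$name:-}"
            if [ -z "$value" ]; then
                echo "missing: $name" >&2
                exit 1
            fi
        done
    )
}
```

The function returns 0 when all three values are present, 1 when one is missing, and 2 when the file cannot be read or sourced. Pass a path containing a slash (for example `./restart.cfg`, a hypothetical file name) so that `.` does not search `PATH`.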
The script is set up to run on buildduty-tools.srv.releng.usw2.mozilla.com in a venv under buildduty's account. The venv lives at /home/buildduty/restart_masters, and the script restart_masters.sh is located in /home/buildduty (bug 1299421).
Important notes:
Before restarting the masters with the script, make sure that:
- a patch is applied to the "tools" repository so that logs can be seen in /home/buildduty/restart_masters/tools/buildfarm/maintenance/3.txt and some libraries are imported.
- the entries in /home/buildduty/.ssh/config that are set up by puppet are commented out. They prevent all buildduty users from connecting to other servers with a password (Host *mozilla.com *mozilla.org BatchMode yes).
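Whether the BatchMode setting is still active can be checked before starting a run. The sketch below assumes the setting appears on a line of its own in the ssh config; `batchmode_active` is a hypothetical helper name, not something provided by the tools repository.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the given ssh config file still contains an
# active (uncommented) "BatchMode yes" line, non-zero otherwise.
batchmode_active() {
    # Allow leading whitespace and either a space or "=" between keyword and
    # value; ssh config keywords are case-insensitive, hence -i. Lines that
    # start with "#" do not match because "#" is not whitespace.
    grep -Eiq '^[[:space:]]*BatchMode[[:space:]=]+yes' "$1"
}

# Example usage:
# if batchmode_active /home/buildduty/.ssh/config; then
#     echo "BatchMode is still enabled; comment it out before running the script" >&2
# fi
```

Because puppet manages these entries, they may reappear after a puppet run, so repeating the check right before each restart is a reasonable precaution.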
Here is an example invocation:
    # buildduty-tools
    $ screen -R restart_masters
    $ cd ~buildduty/restart_masters
    $ source bin/activate
    $ cd tools/buildfarm/maintenance/
    $ ./restart_masters.py -v -m production-masters.json 2>&1
Automated
The above script requires sensitive credentials that shouldn't be stored on disk. For now, we're still running this script by hand.