ReleaseEngineering/How To/Restart Buildbot Masters

From MozillaWiki

Revision as of 12:49, 27 September 2016

We occasionally need to restart buildbot masters for various reasons:

  • upgrades to the underlying OS
  • gradual increase in memory usage over time, leading to reduced master performance

Manually

If you need to restart a single master by hand, here's the sequence you should follow:

  • disable the master in slavealloc. This prevents the master from taking more slave connections while you're waiting for it to shut down.
  • click the "Clean Shutdown" button on the web interface for the given master, e.g. http://buildbot-master82.bb.releng.scl3.mozilla.com:8001/
  • wait for the jobs currently running on that master to complete. You can track progress by searching in-page for "Running" on the master's buildslaves page, e.g. http://buildbot-master82.bb.releng.scl3.mozilla.com:8001/buildslaves?no_builders=1
  • once the master is shut down, perform whatever upgrades are required, etc.
  • restart the master. NOTE: buildbot masters are configured to start buildbot automatically on boot, so if you reboot the master, buildbot will restart itself. To restart manually:
xebec:buildduty ccooper$ ssh cltbld@buildbot-master82
Unauthorized access prohibited
[cltbld@buildbot-master82.bb.releng.scl3.mozilla.com ~]$ cd /builds/buildbot/build1/
[cltbld@buildbot-master82.bb.releng.scl3.mozilla.com build1]$ source bin/activate
(build1)[cltbld@buildbot-master82.bb.releng.scl3.mozilla.com build1]$ make start
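
The "wait for the running jobs to complete" step above can be checked from a script instead of by searching in-page. The following is a minimal sketch, not part of the tools repository: the helper names count_running and fetch_page are hypothetical, and it assumes that the word "Running" appears in the buildslaves page only for slaves with a job in progress, which mirrors the in-page search described above.

```python
import re
import urllib.request


def count_running(page_html: str) -> int:
    """Count whole-word occurrences of "Running" in the page text.

    Assumption: each occurrence corresponds to one job still in progress.
    """
    return len(re.findall(r"\bRunning\b", page_html))


def fetch_page(url: str, timeout: float = 30.0) -> str:
    """Download a page and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Example use (needs network access to the master, so it is left commented out):
# url = "http://buildbot-master82.bb.releng.scl3.mozilla.com:8001/buildslaves?no_builders=1"
# print(count_running(fetch_page(url)))
```

When the count reaches zero after a clean shutdown has been requested, the master should be idle, but it is worth confirming on the web interface before upgrading.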

By script

The above actions have been encapsulated into a script: https://hg.mozilla.org/build/tools/file/default/buildfarm/maintenance/restart_masters.py

The script requires a bash-format config file like the one used by the end_to_end_reconfig.sh script. At the very least the config file must define values for LDAP_USERNAME, LDAP_PASSWORD, and CLTBLD_PASSWORD.
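
Before starting a long run, it can help to confirm that the config file defines the three required values. The sketch below is an illustration only and is not how restart_masters.py reads its config: parse_bash_config and missing_keys are hypothetical helpers, and the parser assumes simple KEY=VALUE lines (optionally prefixed with "export") rather than full shell syntax.

```python
import shlex

# The three values the page says the config file must define.
REQUIRED_KEYS = ("LDAP_USERNAME", "LDAP_PASSWORD", "CLTBLD_PASSWORD")


def parse_bash_config(text):
    """Parse simple KEY=VALUE lines from a bash-format config file.

    This is a simplified reading: no variable expansion, no multi-line
    values, and anything after an unquoted '#' is treated as a comment.
    """
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, separator, value = line.partition("=")
        if not separator or not key.isidentifier():
            continue
        # shlex removes surrounding quotes the way a POSIX shell would.
        tokens = shlex.split(value, comments=True)
        values[key] = tokens[0] if tokens else ""
    return values


def missing_keys(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

An empty list from missing_keys means all three values are present; it says nothing about whether the credentials themselves are valid.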

The script is set up to run on buildduty-tools.srv.releng.usw2.mozilla.com in a venv under buildduty's account. The venv is located at /home/buildduty/restart_masters, and the script restart_masters.sh is located in /home/buildduty (bug 1299421).

Important notes:

Before restarting the masters with the script, make sure that:

  • a patch is applied to the "tools" repository so that logs can be seen in /home/buildduty/restart_masters/tools/buildfarm/maintenance/3.txt and some libraries are imported.
  • the settings in /home/buildduty/.ssh/config that are set up by puppet and prevent buildduty users from connecting to other servers with a password are commented out (Host *mozilla.com *mozilla.org BatchMode yes).
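
To check whether the BatchMode setting is still active in the ssh config, something like the following could be used. This is a rough sketch under assumptions: active_batchmode_lines is a hypothetical helper, and it assumes the puppet-managed entry puts "BatchMode yes" on its own line under the Host line, as is usual in ssh config files.

```python
import re

# Matches an uncommented "BatchMode yes" line; ssh config keywords are
# case-insensitive and may use either whitespace or "=" as the separator.
_BATCHMODE_YES = re.compile(r"^\s*BatchMode\s*(?:=\s*|\s+)yes\s*$", re.IGNORECASE)


def active_batchmode_lines(config_text):
    """Return the 1-based line numbers where "BatchMode yes" is still active."""
    return [
        number
        for number, line in enumerate(config_text.splitlines(), start=1)
        if _BATCHMODE_YES.match(line)
    ]


# Example use (path taken from the note above; left commented out):
# with open("/home/buildduty/.ssh/config") as handle:
#     print(active_batchmode_lines(handle.read()))
```

An empty result suggests the setting has been commented out. Since the file is managed by puppet, the setting may come back after the next puppet run.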

Here is an example invocation:

# buildduty-tools
$ screen -R restart_masters
$ cd ~buildduty/restart_masters
$ source bin/activate
$ cd tools/buildfarm/maintenance/
$ ./restart_masters.py -v -m production-masters.json 2>&1

Automated

The above script requires sensitive credentials that shouldn't be stored on disk. For now, we're still running this script by hand.