ReleaseEngineering/Buildslave Startup Process

Revision as of 23:12, 19 April 2011 by Djmitche (talk | contribs) (→‎Windows XP: Trac, you rot my brain.)

Buildslave Source

Note that the version of Buildbot installed on the slaves is not the same as the version on the master. See ReleaseEngineering/Buildslave Versions for details. The remainder of this page describes how the slave's version is started.

General (runslave)

In general, the goal on every platform is to end up running runslave.py, which takes care of contacting the slave allocator, setting up buildbot.tac, and starting the buildslave process.
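The three steps above can be sketched roughly as follows. This is only an illustration of the flow, not the actual runslave.py: the function name `run_slave`, the `fetch_tac` callable (standing in for the request to the slave allocator), and the use of `buildbot start <basedir>` as the start command are all assumptions made for the example.

```python
import subprocess
from pathlib import Path


def run_slave(fetch_tac, basedir, start_cmd=("buildbot", "start"),
              runner=subprocess.call):
    """Fetch a tac file, write it into basedir, then start the slave there.

    fetch_tac is a hypothetical callable that returns the tac file contents
    as a string; in a real setup it would talk to the slave allocator.
    runner is injectable so the sketch can be exercised without buildbot.
    """
    basedir = Path(basedir)
    basedir.mkdir(parents=True, exist_ok=True)

    # Step 1 and 2: obtain the tac contents and write buildbot.tac.
    tac_path = basedir / "buildbot.tac"
    tac_path.write_text(fetch_tac())

    # Step 3: start the slave process in the directory holding the tac file.
    return runner(list(start_cmd) + [str(basedir)])
```

Passing the allocator request and the process runner in as parameters keeps the sketch free of any guessed URL or network protocol, since this page does not describe them.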

Linux

Build (CentOS 5)

/etc/init.d/buildbot depends on /etc/init.d/puppet, which blocks until puppet has run to completion. /etc/init.d/buildbot then runs runslave.py as the cltbld user.

Builds run in /builds/slave.

Test (Fedora 12)

On startup, test boxes log in automatically and run /home/cltbld/.config/autostart/gnome-terminal.desktop. This file runs /home/cltbld/run-puppet-and-buildbot.sh, which runs puppet in a loop and, once puppet completes, runs runslave.py.

Darwin

Build

The /Library/LaunchDaemons/com.reductivelabs.puppet.plist runs /usr/local/bin/sleep-and-run-puppet.sh as root, which is presumably installed as part of the base image. This script sleeps for 60 seconds, then runs puppet in the foreground every 60 seconds until it succeeds.
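The sleep-then-retry behaviour described above can be sketched as follows. This is not the contents of sleep-and-run-puppet.sh (which is a shell script); it is a minimal Python illustration of the same pattern. The function name `run_until_success` and its parameters are hypothetical, and the puppet command line itself is left to the caller because this page does not give it.

```python
import subprocess
import time


def run_until_success(cmd, initial_delay=60, retry_delay=60,
                      run=subprocess.call, sleep=time.sleep):
    """Wait, then run cmd repeatedly until it exits with status 0.

    Returns the number of attempts that were needed. run and sleep are
    injectable so the loop can be exercised without real delays.
    """
    # Initial pause before the first attempt.
    sleep(initial_delay)
    attempts = 0
    while True:
        attempts += 1
        if run(cmd) == 0:
            return attempts
        # The command failed: wait before trying again.
        sleep(retry_delay)
```

Note that the loop has no upper bound on attempts, matching the "until it succeeds" description; a machine whose puppet run never succeeds would never proceed to start buildbot.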

The buildbot launchd script, /Library/LaunchAgents/org.mozilla.build.buildslave.plist, waits until puppet has run, and then invokes 'buildbot start' directly. It runs as whichever user logs in on the GUI console, which should be cltbld.

Windows

Build

buildbot.bat is in cltbld's "Startup Items". It is installed via OPSI, but it's not in the OPSI hg repo, because it contains passwords. buildbot.bat runs buildbot-tac.py from a checkout of http://hg.mozilla.org/build/tools at d:\tools. That checkout is done only once (in the refimage, I think).

Once buildbot-tac.py has created the tac file, buildbot.bat runs start-buildbot.bat, which sets up some VC++ variables (using guess-msvc.bat, which is part of MozillaBuild) and runs start-buildbot.sh, which finally runs 'buildbot start' in the appropriate directory.

Both start-buildbot.bat and start-buildbot.sh have loops in them to try to start buildbot multiple times. These were introduced in bug 550815 because (by my read) a race condition with some OPSI startup stuff was occasionally killing the buildslave during the startup process.

Test

Windows XP

startTalos.bat is in the Startup Items subfolder of the Programs folder, so at login by cltbld (which happens automatically), this batch file runs. It skips tac file generation for refimages, or if the tac file or a control file (c:\buildbot-tac.control) already exists. Otherwise it invokes c:\tools\buildbot-helpers\buildbot-tac.py to generate the tac file. Once that is done, it sets the screen size (using dc.exe), stops and starts Apache, empties the twistd.log, waits 60 seconds, and starts the slave. There's no loop here, as there is for build slaves.
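The conditions under which startTalos.bat generates the tac file can be written out as a small predicate. This is a sketch of the logic as described above, not a translation of the batch file: the function name `needs_tac_generation`, the `is_refimage` flag, and the `tac_path` argument are assumptions for illustration (the tac file's location is not stated on this page); only the control file path comes from the text.

```python
import os


def needs_tac_generation(is_refimage, tac_path,
                         control_path=r"c:\buildbot-tac.control",
                         exists=os.path.exists):
    """Return True only when the tac file should be generated.

    Generation is skipped on refimages, and also when either the tac file
    or the control file is already present.
    """
    if is_refimage:
        return False
    if exists(tac_path) or exists(control_path):
        return False
    return True
```

Only when this predicate is true would buildbot-tac.py be invoked; the remaining steps (screen size, Apache restart, log truncation, 60 second wait, slave start) happen either way.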

Windows 7

Talos Windows 7 systems are not administered by OPSI, so the various scripts are installed by hand. I don't know how they work yet.