ReleaseEngineering/Applications/Slavealloc

From MozillaWiki

Application Description

Slavealloc is a client-server application. The client is runslave.py. Communication is via a very basic HTTP request to http://slavealloc.build.mozilla.org/gettac/$slavename, where the response is expected to be a buildbot.tac file suitable for use to start buildslave.
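The request described above can be sketched in a few lines of Python. This is not the real runslave.py, only a minimal illustration of the protocol under stated assumptions: the helper names (gettac_url, fetch_tac, write_tac) and the slave name in the usage note are hypothetical, and only the URL pattern comes from the text above.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Production allocator URL as given in the text above.
ALLOCATOR_BASE = "http://slavealloc.build.mozilla.org/gettac"


def gettac_url(slavename, base=ALLOCATOR_BASE):
    """Build the gettac URL for one slave name (hypothetical helper)."""
    return "%s/%s" % (base.rstrip("/"), quote(slavename, safe=""))


def fetch_tac(slavename, base=ALLOCATOR_BASE, timeout=30):
    """Fetch the buildbot.tac contents for `slavename` as bytes."""
    with urlopen(gettac_url(slavename, base), timeout=timeout) as response:
        return response.read()


def write_tac(slavename, path="buildbot.tac", base=ALLOCATOR_BASE):
    """Save the response to disk so that buildslave can be started from it."""
    tac = fetch_tac(slavename, base)
    with open(path, "wb") as f:
        f.write(tac)
    return path
```

For example, write_tac("linux-ix-slave01") would request http://slavealloc.build.mozilla.org/gettac/linux-ix-slave01 and save the result as buildbot.tac in the current directory. Passing a different base points the same code at another allocator instance.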

The slavealloc server is implemented as a small Twisted application (source) which serves the tac generator, a JSON REST interface, and a client-side JavaScript interface.

The same source code also implements a command-line frontend to the REST interface.

Requirements

The server depends on

  • MySQL database
  • Python
    • Twisted
    • SQLAlchemy

External Resources

The slavealloc server uses the following external resources:

  • two MySQL databases (production and staging)
  • VM host for the slaveallocator VM

Security

The slave allocator hands out low-security slave passwords in the .tac files, which are stored in cleartext in the database. It does not do any sort of authentication either for read or modify operations, and relies on the Build VLAN firewalls to prevent external access.

Monitoring

The slavealloc host has the basic host monitoring from nagios (ping, filesystems, etc.), plus an HTTP GET to /api/pools, just to make sure the daemon is still responding.
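The HTTP check could look roughly like the sketch below. It is not the actual nagios check used on the host. It assumes that /api/pools returns a JSON document (the text only says the server exposes a JSON REST interface), and the function name check_pools and the injectable fetch argument are hypothetical. The exit codes follow the usual nagios plugin convention of 0 for OK and 2 for CRITICAL.

```python
import json
from urllib.request import urlopen

# Nagios plugin exit codes (standard plugin convention).
OK, CRITICAL = 0, 2


def check_pools(url, fetch=None, timeout=10):
    """Return (exit_code, message) for a GET of the pools endpoint."""
    if fetch is None:
        # Default fetcher: a plain HTTP GET returning (status, body).
        def fetch(u):
            with urlopen(u, timeout=timeout) as response:
                return response.getcode(), response.read()
    try:
        status, body = fetch(url)
    except OSError as exc:  # URLError and socket errors subclass OSError
        return CRITICAL, "CRITICAL: %s" % exc
    if status != 200:
        return CRITICAL, "CRITICAL: HTTP %d" % status
    try:
        # Assumption: the endpoint answers with JSON.
        json.loads(body.decode("utf-8"))
    except ValueError:
        return CRITICAL, "CRITICAL: response is not JSON"
    return OK, "OK: /api/pools responded"
```

A wrapper script would call check_pools("http://slavealloc.build.mozilla.org/api/pools"), print the message, and exit with the returned code.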

Deployment

The slave allocator server is deployed on a single host, slavealloc.build.mozilla.org.

Server Setup

IT installed RHEL6 along with MySQL client libraries, and set up the proper firewall rules to allow database access.

As root, virtualenv-1.5.2 was installed into the system Python library. The following system packages were installed via yum:

Nginx

Nginx serves as the frontend for both the staging and production instances. The virtualhost files are available in hg.

Note that on the x86_64 system slavealloc is currently installed on, the following must be added to the http section of nginx.conf:

   # required to use virtualhosts - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=547722
   server_names_hash_bucket_size 33;

Twisted Daemon

The 'slavealloc' user runs the twisted daemon on this host. The user account is locked and accessed only via su from root.

The daemon is installed in a virtualenv at /tools/slavealloc-$rev, using the pre-checked python packages on the puppet server. Note that --no-site-packages is not used here, because we need access to the (binary) MySQL-python package which is installed systemwide:

cd /tools
virtualenv slavealloc-8fe4dbc09d03
./slavealloc-8fe4dbc09d03/bin/pip install -e hg+http://hg.mozilla.org/build/tools@8fe4dbc09d03#egg=tools \
   --no-index --find-links=http://staging-puppet.build.mozilla.org/staging/python-packages/
ln -s slavealloc-8fe4dbc09d03 slavealloc
ln -s slavealloc-8fe4dbc09d03 slavealloc-staging

There is a make-slavealloc-virtualenv.sh script available in /tools to make this process automatic.

Once this was set up, the 'slavealloc dbinit' command was used to initialize the database.

The production and staging tac files are in /build/slavealloc. Staging runs on port 1079, and production runs on port 1080. The files are similar to those in hg, and include the commented pool_recycle line, with the timeout set to 400 seconds, so that connections are expired automatically before the MySQL server itself closes them. If connections still time out on the server side, we can look to the Buildbot source for an example of a better solution.

Startup is done via initscripts.

Slave Side

All slaves run runslave.py during startup. This file is distributed via puppet. The larger slave-startup process is described in ReleaseEngineering/Buildslave Startup Process.

Backups

The slavealloc server has

11 4 * * * slavealloc /tools/slavealloc/bin/slavealloc dbdump -D mysql://mumblemumblemumble > /builds/slavealloc/production-1080/dbdump.pkl

in /etc/cron.d/slavealloc-bkup, as a basic protection against someone accidentally running dbinit or doing something equally catastrophic.

Staging

As described above, http://staging-slavealloc.build.mozilla.org/ points to a staging implementation of the slave allocator. This implementation has its own database and runs from a distinct daemon, although it is served from the same nginx instance.

NOTE: all slaves are configured to use the production allocator. Allocations from the staging allocator will need to be simulated by hand (runslave.py has a command-line option to set the allocator URL). This is done to allow us, as a group, to move slaves between staging and production using a single slave allocator.

The prod-db-to-staging.sh script will copy the production db to staging for use when staging new changes.

Development

First, install build/tools in a virtualenv:

cd tools
virtualenv sandbox
sandbox/bin/pip install -e .

Then you can run the slavealloc daemon locally from the root of the tools repository with a simple:

sandbox/bin/twistd -noy lib/python/slavealloc/contrib/slavealloc-combined.tac

Note that due to what I believe to be a bug in pip, you may need to explicitly install Twisted to get the twistd executable installed:

pip install -U twisted

To set up a fresh database, use

sandbox/bin/slavealloc dbinit -D sqlite:///slavealloc.db

This configuration will use SQLite to access slavealloc.db in the current directory. You can hack on the static web content while the daemon is running.
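To confirm that dbinit actually created a schema, the SQLite file can be inspected with Python's standard sqlite3 module. The sketch below only lists whatever tables exist. It makes no assumption about slavealloc's table names, and list_tables is a hypothetical helper that is not part of the tools repository.

```python
import sqlite3


def list_tables(db_path="slavealloc.db"):
    """Return the names of all tables in the SQLite file, sorted."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling list_tables() from the directory containing slavealloc.db should return a non-empty list once dbinit has run. An empty list suggests the database was never initialized, or that the path is wrong (sqlite3 silently creates a new empty file in that case).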

See Also

See User:Djmitche/Slave Allocator Proposal