ReleaseEngineering:BuildSlaveSetup
After imaging
Once a slave has been re-imaged, IT should complete the PostImage steps, and releng should complete the remaining post-imaging steps (ReleaseEngineering/How To/Set Up a Freshly Imaged Slave). Attach the slave to a staging master and run it there to make sure there are no issues with it. Staging slaves use a special set of staging keys. Before letting the machine bake for a long time, confirm that the slave is connected to the staging master and that it can ssh into staging-stage:
ssh -i ~/.ssh/ffxbld_dsa ffxbld@staging-stage.build.mozilla.org
Once this is done, wait until several different types of jobs have completed green. If the slave is in good working order, it can be moved into production. To do this, stop the slave, clobber it, switch to the production ssh keys, and then modify the buildbot.tac file. For details, see ReleaseEngineering/How To/Move a Slave Between Production and Staging.
Stopping slave
You can stop the slaves by running:
buildbot stop /builds/slave
on Linux and Mac. On Windows, run:
buildbot stop /e/builds/moz2_slave
in an Msys command prompt.
Clobbering
Windows needs to have everything other than buildbot.tac removed from E:\builds\moz2_slave\ (or /e/builds/moz2_slave in msys). One way to accomplish this is to launch an Msys terminal and run:
cd /e/builds/moz2_slave
mv buildbot.tac ~
rm -rf *
mv ~/buildbot.tac .
Linux also has scratchbox. Scratchbox builds aren't done under the standard slave directory (yet; see bug 576830), so the scratchbox build directory has to be removed separately. To clobber a Linux slave, you could run:
cd /builds/slave
mv buildbot.tac ~
rm -rf *
rm -rf /builds/scratchbox/users/cltbld/home/cltbld/build
mv ~/buildbot.tac .
Mac has the same slave build dir as Linux, but does not have scratchbox. Mac slaves can be clobbered by running:
cd /builds/slave
mv buildbot.tac ~
rm -rf *
mv ~/buildbot.tac .
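The clobber commands above all follow the same pattern: keep buildbot.tac and remove everything else in the slave directory. As a rough illustration only, the same idea can be sketched in Python. The function name clobber_slave_dir and its keep parameter are hypothetical and not part of any releng tooling; like `rm -rf *`, the sketch leaves dotfiles alone, and it does not touch the Linux scratchbox directory.

```python
import shutil
from pathlib import Path


def clobber_slave_dir(slave_dir, keep=("buildbot.tac",)):
    """Remove everything in slave_dir except the names listed in keep.

    Hypothetical helper: mirrors `rm -rf *` run after moving buildbot.tac
    out of the way, so hidden entries (names starting with '.') are skipped.
    Returns the sorted names of the entries that were removed.
    """
    removed = []
    # Snapshot the listing first so deleting entries does not disturb iteration.
    for entry in list(Path(slave_dir).iterdir()):
        if entry.name in keep or entry.name.startswith("."):
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # real directory: remove recursively
        else:
            entry.unlink()  # regular file or symlink
        removed.append(entry.name)
    return sorted(removed)
```

For example, clobber_slave_dir("/builds/slave") would cover the Mac case; on Linux the scratchbox build directory would still need to be removed separately.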
SSH Keys
Before modifying the buildbot.tac file, change over the ssh keys. It is recommended that you do this before the buildbot.tac changes because if the slave reboots after the buildbot.tac change but before the ssh keys are set up, it will go into production and kill builds because of failed uploads.
There are three important sets of keys: staging, production and try. Aside from a strange permissions problem on Linux (.ssh is owned by root:root), the process is roughly the same on all three platforms. Missing info: the procedure for getting the keys onto the slave is not documented here yet, so be creative for now.
To test that you have the staging keys and they are set up properly, try:
ssh -i ~/.ssh/ffxbld_dsa ffxbld@staging-stage.build.mozilla.org
To test that a production slave is set up properly, you must be able to run the following commands:
ssh -i ~/.ssh/ffxbld_dsa ffxbld@aus2-staging.mozilla.org hostname
ssh -i ~/.ssh/ffxbld_dsa ffxbld@dm-symbolpush01.mozilla.org hostname
ssh -i ~/.ssh/ffxbld_dsa ffxbld@stage.mozilla.org hostname
ssh -i ~/.ssh/ffxbld_dsa ffxbld@stage-old.mozilla.org hostname
ssh -i ~/.ssh/ffxbld_dsa ffxbld@hg.mozilla.org hostname
ssh -i ~/.ssh/ffxbld_dsa ffxbld@cvs.mozilla.org hostname
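Running the six production checks by hand is error-prone, so a small script can loop over the hosts and report which ones fail. The following is only a sketch: the function names are hypothetical, and the `-o BatchMode=yes` option is an addition not present in the commands above (it makes ssh fail instead of prompting for a password when the key is rejected).

```python
import os
import subprocess

# Hosts taken from the production commands listed above.
PRODUCTION_HOSTS = [
    "aus2-staging.mozilla.org",
    "dm-symbolpush01.mozilla.org",
    "stage.mozilla.org",
    "stage-old.mozilla.org",
    "hg.mozilla.org",
    "cvs.mozilla.org",
]


def ssh_check_command(host, user="ffxbld", key="~/.ssh/ffxbld_dsa"):
    """Build the argument list for one `ssh ... hostname` check."""
    return [
        "ssh",
        "-i", os.path.expanduser(key),
        "-o", "BatchMode=yes",  # assumption: fail rather than prompt
        "%s@%s" % (user, host),
        "hostname",
    ]


def failed_hosts(hosts=PRODUCTION_HOSTS, run=subprocess.call):
    """Return the hosts whose ssh check exits with a non-zero status."""
    return [host for host in hosts if run(ssh_check_command(host)) != 0]
```

Calling failed_hosts() on a correctly configured production slave should return an empty list. The run parameter exists only so the logic can be exercised without network access.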
Try builders use different keys!
You must wipe any ssh keys that are not trybld keys from a newly imaged slave, and copy in the trybld keys from another try builder (staging trybld keys are on the staging slaves).
To test that a try slave is set up properly, you must be able to run the following command:
ssh -i ~/.ssh/trybld_dsa trybld@stage.mozilla.org hostname
buildbot.tac changes
Settings that are important to change in the buildbot.tac file are:
- buildmaster_host -- this is the buildbot master the slave reports to
- port -- this is the port on the master that the slave will attempt to connect to
- passwd -- this is the buildbot password. Each running master has a BuildSlave.py file in the masterdir that has this password
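As an illustration of the change being described, the three settings could be rewritten programmatically. This is a sketch under a loud assumption: it expects buildbot.tac to contain each setting exactly once as a simple `name = value` line, which may not match every slave's file. The function name update_tac is hypothetical.

```python
import re


def update_tac(text, buildmaster_host, port, passwd):
    """Return buildbot.tac contents with the three settings replaced.

    Assumes each setting appears exactly once as a `name = value` line;
    raises ValueError otherwise so a surprising file is never half-edited.
    """
    new_values = {
        "buildmaster_host": repr(buildmaster_host),
        "port": repr(int(port)),
        "passwd": repr(passwd),
    }
    for name, value in new_values.items():
        # Match the assignment at the start of a line; keep its left-hand side.
        pattern = re.compile(
            r"^([ \t]*%s[ \t]*=[ \t]*)[^\r\n]*" % re.escape(name),
            re.MULTILINE,
        )
        text, count = pattern.subn(lambda m, v=value: m.group(1) + v, text)
        if count != 1:
            raise ValueError("expected exactly one %r line, found %d" % (name, count))
    return text
```

Editing the file by hand is just as valid; the point of the sketch is only that nothing else in buildbot.tac (such as the slave name) should change when moving between environments.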
Environments that you are likely going to put the slaves into are:
- production-master01.build.mozilla.org:9010 -- standard per-checkin and nightly builds
- production-master02.build.mozilla.org:9010 -- subset of mobile builds and releases
- production-master02.build.mozilla.org:9011 -- all try builds (special buildbot.tac passwd)
- production-master03.build.mozilla.org:9010 -- similar to pm01:9010
Once all of these steps are completed, reboot the slave and check the appropriate buildslaves page on the master. We generally have our HTTP interface for the masters running on a port that is 1000 less than the slave port (a slave port of 9010 uses http 8010).
The URL for the page to check can be derived using:
'http://%s:%s/buildslaves/%s' % (master_fqdn, master_http_port, slave_short_name)
Examples:
http://production-master01.build.mozilla.org:8010/buildslaves/linux-ix-slave01
http://production-master02.build.mozilla.org:8011/buildslaves/w32-ix-slave01
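Putting the format string together with the port convention, the lookup can be sketched as a small function. The name buildslaves_url is hypothetical, and the "HTTP port is 1000 less than the slave port" rule is only the general convention described above, so a master that deviates from it would need its HTTP port supplied by hand.

```python
def buildslaves_url(master_fqdn, slave_port, slave_short_name):
    """Derive the buildslaves page URL from the master's slave port.

    Assumes the usual convention that the HTTP interface runs on a port
    1000 lower than the slave port (for example 9010 -> 8010).
    """
    master_http_port = slave_port - 1000
    return "http://%s:%s/buildslaves/%s" % (
        master_fqdn,
        master_http_port,
        slave_short_name,
    )
```

For instance, buildslaves_url("production-master02.build.mozilla.org", 9011, "w32-ix-slave01") yields the second example URL above.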