Build:Release Automation
Intro
Firefox and Thunderbird releases are currently done using the Bootstrap automation scripts, which call into Tinderbox client to do the actual build.
Bootstrap
Bootstrap is a simple Perl framework intended to take the formerly manual release process and automate it, with as little change to the process as possible.
Bootstrap is invoked using the "release" command, and supports a set of high-level "steps":
Tag - tag, branch, apply version bumps to all relevant files.
TinderConfig - generate tinderbox config files (mozconfig/tinder-config.pl)
Build - invoke Tinderbox client to create an en-US build and publish it to FTP
Source - create a source tarball and push it to FTP
Repack - invoke Tinderbox client to create localized versions of en-US build and publish to FTP
PatcherConfig - create a Patcher config file for generating updates
Updates - invoke Patcher to create partial updates and AUS configuration
Stage - create a staging area and rename files for release
Sign - not implemented
Buildbot
We have a vendor branch in mozilla/tools/buildbot, based on Buildbot's 0.7.5 release.
Mozilla-specific Buildbot install instructions
Buildbot-driven release
For the Firefox 2.0.0.7 release, we used Buildbot to drive the release. Both the staging and production configs are checked into CVS.
There are still several manual processes, which we are working on:
- Tag - had to manually tag based on GECKO181_20070712_RELBRANCH
- should work now, tested on staging bug 396290
- Source
- must be run on stage, need to rewrite source step bug 394034
- manually sync build-console and stage bug 396438
- Build
- manually sync build-console and stage bug 396438
- Repack
- had to fall back to cerberus-vm due to EOL problems bug 397842
- manually sync build-console and stage bug 396438
- Sign
- still manual
- Updates
- had to change stagingServer to "stage" and re-run configs bug 396438
- manually sync build-console and stage bug 396438
- had to correct permissions for both snippets and MARs
- update verification config is still manual bug 373995
- Stage
- had to correct permissions
- created "latest" and "latest-2.0" symlinks manually after final release
- created bouncer links manually bug 372746
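The manual "latest" and "latest-2.0" symlink creation noted under Stage above could be scripted along these lines. This is only a sketch: the function name `link_latest`, the directory layout (one subdirectory per version inside a releases directory), and the rule that the branch name is the first two version components are all assumptions, not part of Bootstrap.

```python
import os


def link_latest(releases_dir, version):
    """Point "latest" and "latest-<branch>" at the given version directory.

    Assumption: the branch is the first two version components,
    so "2.0.0.7" gives "latest-2.0".
    """
    branch = ".".join(version.split(".")[:2])
    created = []
    for name in ("latest", "latest-" + branch):
        link = os.path.join(releases_dir, name)
        # Replace a symlink left over from the previous release.
        if os.path.islink(link):
            os.remove(link)
        # Relative target, so the link survives moving the whole tree.
        os.symlink(version, link)
        created.append(name)
    return created
```

For example, `link_latest("/path/to/releases", "2.0.0.7")` would create both links inside that (hypothetical) directory.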
Bootstrap Steps
A Bootstrap "step" must implement 2 required methods:
Execute - carry out the actual function of the step, e.g. Build
Verify - run an automated test
Additionally, there are 2 optional methods:
Push - upload the appropriate changes for testing, e.g. upload build to FTP
Announce - send an email announcing that the step has finished.
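To illustrate the split between required and optional methods, here is a minimal sketch in Python. Bootstrap itself is written in Perl, so the class names and the no-op defaults below are illustrative assumptions rather than Bootstrap's actual code.

```python
from abc import ABC, abstractmethod


class Step(ABC):
    """Hypothetical stand-in for a Bootstrap step."""

    @abstractmethod
    def execute(self):
        """Required: carry out the actual function of the step."""

    @abstractmethod
    def verify(self):
        """Required: run an automated test of the result."""

    def push(self):
        """Optional: upload results for testing. Does nothing by default."""

    def announce(self):
        """Optional: send a completion email. Does nothing by default."""


class Source(Step):
    """Toy step that only records which methods did real work."""

    def __init__(self):
        self.calls = []

    def execute(self):
        self.calls.append("Execute")

    def verify(self):
        self.calls.append("Verify")


class Incomplete(Step):
    """Omits verify(), so it cannot be instantiated."""

    def execute(self):
        pass
```

A step that leaves out one of the two required methods fails at construction time (`Incomplete()` raises `TypeError`), while the optional methods can simply be inherited.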
Using Bootstrap
If the "release" command is invoked with no parameters, it will attempt to start at the first step and call the methods in this order:
- Execute
- Verify
- Push
- Announce
As each step completes successfully, the next will be invoked.
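The ordering described above can be sketched as a simple nested loop. This is an illustration in Python, not Bootstrap's Perl implementation; the `plan` and `run` helpers and the idea of passing actions in as a dictionary are assumptions made for the example.

```python
# Step and method names as listed on this page. "Sign" is left out
# because the page lists it as not implemented.
STEPS = ["Tag", "TinderConfig", "Build", "Source", "Repack",
         "PatcherConfig", "Updates", "Stage"]
METHODS = ["Execute", "Verify", "Push", "Announce"]


def plan(steps=STEPS, methods=METHODS):
    """Return (step, method) pairs in the order they would be called."""
    return [(step, method) for step in steps for method in methods]


def run(plan_items, actions):
    """Call each planned action in order, stopping at the first failure.

    actions maps (step, method) to a callable returning True on success.
    Pairs with no action are treated as succeeding (hypothetical choice).
    Returns (completed_pairs, failed_pair_or_None).
    """
    completed = []
    for item in plan_items:
        action = actions.get(item)
        if action is not None and not action():
            return completed, item
        completed.append(item)
    return completed, None
```

Stopping at the first failure mirrors the statement that the next step is invoked only once the current one completes successfully.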
There are several command-line options, shown by calling "release -h":
Usage: release [-l] [-s Step] [-o Step] [-e | -v | -p | -a] [-h]
    -l          list all Steps
    -s          start at Step
    -o          only run one Step
    -e          only run Execute
    -v          only run Verify
    -p          only run Push
    -a          only run Announce
    -h          this usage message
For example, to only run the Push method on the Build step:
./release -o Build -p
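For readers more familiar with Python than with the Perl "release" script, the option set above could be modelled with `argparse` roughly as follows. This is a sketch of the interface only; the destination names (`list_steps`, `start`, `only`, `method`) are invented for the example, and treating -e, -v, -p and -a as mutually exclusive is an inference from the "[-e | -v | -p | -a]" usage line.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options shown by "release -h"."""
    parser = argparse.ArgumentParser(prog="release", add_help=False)
    parser.add_argument("-l", action="store_true", dest="list_steps",
                        help="list all Steps")
    parser.add_argument("-s", metavar="Step", dest="start",
                        help="start at Step")
    parser.add_argument("-o", metavar="Step", dest="only",
                        help="only run one Step")
    # At most one of the four method selectors may be given.
    group = parser.add_mutually_exclusive_group()
    for flag, method in (("-e", "Execute"), ("-v", "Verify"),
                         ("-p", "Push"), ("-a", "Announce")):
        group.add_argument(flag, action="store_const", const=method,
                           dest="method", help="only run " + method)
    parser.add_argument("-h", action="help", help="this usage message")
    return parser
```

Parsing `["-o", "Build", "-p"]` with this parser yields `only == "Build"` and `method == "Push"`, matching the example command above.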
Roles and resource requirements
- buildbot master
- keeps logs, manages overall process
- ftp/stage.m.o
- fileserver, both public and private areas
- FTP candidates - 20GB storage
- e.g. stage:/home/ftp/pub/firefox/nightly/2.0.0.4-candidates/
- FTP private staging - 20GB storage
- e.g. stage:firefox-2.0.0.4/
- FTP release - 6GB storage
- e.g. stage:/home/ftp/pub/firefox/releases/2.0.0.4/
- "tagging" builder
- checks out source and applies tag
- 2GB storage
- e.g. karma:/builds/tags/FIREFOX_2_0_0_4_RELEASE/
- "source archive" builder
- builds source archive and pushes for QA
- "linux/mac/win32 firefox builders"
- builds firefox and pushes for QA
- needs 2GB memory, 6GB storage (each)
- e.g. prometheus-vm:/builds/tinderbox/Fx-Mozilla1.8-Release/
- "updates builder"
- downloads and inventories a set of complete firefox updates, generates partial updates, creates AUS configuration ("snippets")
- updates - 1GB memory, 5GB storage
- e.g. prometheus-vm:/builds/updates/firefox-2.0.0.4/
- "stage builder"
- creates private staging area on FTP, renames files for release
- see "fileserver" requirements, above
- Automatic Update Server (AUS), aus2.m.o
- 10GB for config files, backups and staging area
- e.g. /opt/aus2/incoming/3/Firefox/2.0.0.4/, /opt/aus2/snippets/staging/20070523-Fx-2.0.0.4/, /opt/aus2/snippets/backup/20070611-1-pre-20070611-Fx-2.0.0.4.tar.bz2
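As a rough cross-check, the storage figures listed above can be added up. The tally below only covers roles with a stated figure (the buildbot master and the source archive builder have none), and it assumes exactly one firefox builder per platform; the dictionary keys are descriptive labels, not real host names.

```python
# Storage figures in GB, copied from the list above.
STORAGE_GB = {
    "FTP candidates": 20,
    "FTP private staging": 20,
    "FTP release": 6,
    "tagging builder": 2,
    # 6GB each; assumes one builder each for linux, mac and win32.
    "firefox builders": 3 * 6,
    "updates builder": 5,
    "AUS (aus2.m.o)": 10,
}


def total_gb(requirements):
    """Sum the per-role storage requirements."""
    return sum(requirements.values())
```

Under those assumptions the total comes to 81GB per release, of which 46GB is on the FTP/stage fileserver.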
Notes on staging setup
Buildbot master basedir is ~buildmaster/TestBot
The bootstrap.cfg is pulled from the master dir.
Slaves basedirs are in cltbld's home directory on the appropriate machine, e.g. ~cltbld/linux-slave1
Changes can be inserted with "buildbot sendchange" on the master e.g.:
buildbot sendchange --master=localhost:9989 -u rhelmer -m"latest bootstrap from CVS" test
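If the sendchange call is ever scripted, building the argument list in one place avoids shell quoting mistakes in the comment text. The sketch below only assembles the command and does not run it; the helper names are invented, and the trailing arguments (`test` in the example above) are taken here to be the changed file names.

```python
import shlex


def sendchange_argv(master, user, comment, files):
    """Assemble a "buildbot sendchange" argument list using the flags shown above."""
    return ["buildbot", "sendchange",
            "--master=" + master,
            "-u", user,
            "-m", comment] + list(files)


def as_shell(argv):
    """Render the argument list as a shell-safe command line for logging."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

The list returned by `sendchange_argv("localhost:9989", "rhelmer", "latest bootstrap from CVS", ["test"])` can be passed directly to `subprocess.call` on the master.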
Bootstrap uses a local CVS mirror, and the "tag", "source", "updates", and "stage" builders are run by a local buildslave.
The bootstrap Makefile has the following targets:
- stage/clean_stage
- create/remove basic fileserver/tag/source/updates/stage environment
- cvsmirror/clean_cvsmirror
- create/remove cvsmirror in /builds/cvsmirror
These targets are hard-coded to prepare for a 2.0.0.4 release.
There must be "cltbld" and "symbols" accounts on the staging FTP server that the build machines' cltbld accounts can connect to via SSH without a password.
- must accept staging-build-console's hostkey via this SSH tunnel:
- set up staging FTP server
mkdir /home/ftp /builds /data/cltbld
chown cltbld /home/ftp /builds/ /data/cltbld
# check out /mofo/release/stage/ from CVS to /data/cltbld/bin
groupadd firefox
- set up staging AUS server
# TODO - auto-update
mkdir -p /opt/aus2/snippets/staging/backup /opt/aus2/incoming /opt/aus2/app
# check out aus2
cd /opt/aus2/
cvs -d /builds/cvsmirror/cvsroot/ co -d app/ -r AUS2_PRODUCTION mozilla/webtools/aus/xml
cd app && ln -s ../incoming ./data
# install apache
yum install httpd
Updating release version (mirror refresh, etc.)
- Bump config versions in mozilla/tools/release/Makefile, mozilla/tools/configs/fx-moz18-staging-bootstrap.cfg, mozilla/tools/buildbot-configs/automation/staging/master.cfg e.g. bug 397425
- Disable cltbld's nightly cronjob
- Refresh cvsmirror
As cltbld@staging-build-console:
cd /home/cltbld/mozilla/tools/release
cvs up
export CVS_RSH="/home/cltbld/ssh_prod.sh"
make cvsmirror
- Update bootstrap and buildbot configs. These are symlinked from bootstrap-configs and buildbot-configs checkouts (of mozilla/tools/release/configs/ and mozilla/tools/buildbot-configs/automation/staging/, respectively).
As buildmaster@staging-build-console:
cd /home/buildmaster/TestBot
buildbot stop `pwd`
cd bootstrap-configs && cvs up && cd ../
cd buildbot-configs && cvs up && cd ../
buildbot start `pwd`
NOTE - the Talkback symbol server is hardcoded in /builds/cvsmirror.clean/mofo/talkback/fullsoft/Makefile.in; it should be changed like so:
FC_TUNNEL = ssh -$(FC_SSH_VERSION) -f -L 8080:hal:80 $(LSSH_USER)staging-build-console.build.mozilla.org sleep 20
SYM_TUNNEL = ssh -$(SYM_SSH_VERSION) -f -L 2222:localhost:22 $(LSSH_USER)staging-build-console.build.mozilla.org sleep 20
Production setup HOWTO for linux/mac/win32
- build-console setup
- check out /mofo/release/stage to /data/cltbld/bin
- NOTE - this is for the firefox-src-tarball-nobuild script, which checks out a tag from CVS and creates a source archive. This should be reimplemented in the bootstrap Source step
- (Win32/Mac only) install Config::General
cd /tools/dist
wget http://search.cpan.org/CPAN/authors/id/T/TL/TLINDEN/Config-General-2.33.tar.gz
tar xfvz Config-General-2.33.tar.gz
cd Config-General-2.33
perl Makefile.PL
It is OK to ignore this warning from "perl Makefile.PL": Warning: the following files are missing in your kit: t/test.rc.out
sudo make install
- (Linux only) prepend custom GCC to the path in ~/.bash_profile
export PATH="/usr/gcc-3.3.2rh/bin:/opt/local/bin:/tools/buildbot/bin:/tools/twisted/bin:/tools/twisted-core/bin:$PYTHONHOME/bin:$PATH"
- create logs dir
$ mkdir -p /tools/dist/logs
$ mkdir -p /builds/logs
- (Mac only) Install 7z. Either download it or copy it from bm-xserve01, which is what we did here. Putting the file in /usr/bin means it is automatically on the PATH set by cltbld's .profile.
$ cd /usr/bin
$ sudo rsync -av cltbld@bm-xserve01.build.mozilla.org:/usr/local/bin/7z .
- look for Tinderbox directory
# linux: if the tinderbox directory is not named exactly "Fx-Mozilla1.8-Release", symlink it
ln -s /builds/tinderbox/Fx-Mozilla1.8-release /builds/tinderbox/Fx-Mozilla1.8-Release
Check out tinderbox configs:
# win32
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/win32
# linux
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/linux
# macosx
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/macosx
- set up Tinderbox l10n build directory
# linux
cd /builds/tinderbox/
# win32
cd /cygdrive/c/builds/tinderbox/
mkdir Fx-Mozilla-1.8-l10n-Release
cd Fx-Mozilla-1.8-l10n-Release
../mozilla/tools/tinderbox/install-links
rm build-seamonkey.pl
ln -s ../mozilla/tools/tinderbox/build-firefox.pl .
ln -s build-firefox.pl build-seamonkey.pl
rm post-mozilla.pl
ln -s post-mozilla-release.pl post-mozilla.pl
Check out tinderbox configs:
# win32
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_l10n_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/win32
# linux
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_l10n_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/linux
# macosx
cvs -d cltbld@cvs.mozilla.org:/cvsroot co -r MOZILLA_1_8_BRANCH_l10n_release -d tinderbox-configs mozilla/tools/tinderbox-configs/firefox/macosx
ln -s tinderbox-configs/mozconfig .
ln -s tinderbox-configs/tinder-config.pl .
- Install buildbot
- running as "cltbld", install slave
# linux
$ cd ~
$ buildbot create-slave linux-slave1 build-console.build.mozilla.org:9989 linux-slave1 password
# win32
c:\\buildtools\\python24\\scripts\\buildbot create-slave c:\\win32-slave1 build-console.build.mozilla.org:9989 win32-slave1 password
- edit the admin and host pages in ~/linux-slave1/info/
- start slave
# linux
buildbot start /home/cltbld/linux-slave1
# win32
c:\\buildtools\\python24\\scripts\\buildbot start c:\\win32-slave1
Just for testing
- build-console
- use "stage" target in bootstrap's Makefile
- Move prod ssh keys out of the way, and copy in "staging" keys:
cd ~
mv ~/.ssh ~/ssh.prod
# recreate the directory so the scp below has somewhere to put the key
mkdir -m 700 ~/.ssh
scp cltbld@staging-prometheus-vm:~/.ssh/id_rsa .ssh/
- Move prod tinderbox-configs and put staging-build-console in Root:
# win32
cd /cygdrive/c/builds/tinderbox/Fx-Mozilla-1.8-Release
# linux
cd /builds/tinderbox/Fx-Mozilla-1.8-Release
cp -rp tinderbox-configs tinderbox-configs.prod
# change root to cltbld@staging-build-console.build.mozilla.org:/builds/cvsmirror/cvsroot
vi tinderbox-configs/CVS/Root
Same for l10n tinderbox build directories:
# win32
cd /cygdrive/c/builds/tinderbox/Fx-Mozilla-1.8-l10n-Release
# linux
cd /builds/tinderbox/Fx-Mozilla-1.8-l10n-Release
cp -rp tinderbox-configs tinderbox-configs.prod
# change root to cltbld@staging-build-console.build.mozilla.org:/builds/cvsmirror/cvsroot
vi tinderbox-configs/CVS/Root
- /data/cltbld/bin/firefox-src-tarball-nobuild has a hardcoded CVSROOT; change it to cltbld@staging-build-console.build.mozilla.org:/builds/cvsmirror/cvsroot
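Editing CVS/Root by hand is easy to get wrong when a checkout has subdirectories, because every directory in a CVS working copy carries its own CVS/Root file. A small script could repoint all of them at once. The sketch below assumes the .prod backup described above has already been made; the function name is invented for the example.

```python
import os


def repoint_cvs_root(checkout, new_root):
    """Rewrite every CVS/Root file under checkout to contain new_root.

    Returns the sorted list of files that were rewritten.
    """
    changed = []
    for dirpath, dirnames, filenames in os.walk(checkout):
        # Only touch "Root" files that live inside a CVS admin directory.
        if os.path.basename(dirpath) == "CVS" and "Root" in filenames:
            path = os.path.join(dirpath, "Root")
            with open(path, "w") as handle:
                handle.write(new_root + "\n")
            changed.append(path)
    return sorted(changed)
```

For the staging setup it would be called with the tinderbox-configs directory and `cltbld@staging-build-console.build.mozilla.org:/builds/cvsmirror/cvsroot`.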
Production changes
Changing roles
- move to dedicated machines, e.g. production-prometheus-vm
- CVS tag on linux slave or on build-console? Decision: build-console
- l10nverify on mac slave ok, need to fix "unpack all xpis bug"
- Mac - is identical hardware req'd? What happens if prod hardware dies? fireball still worked on, scarce PPC hardware options.
Available PPCs:
- 01 - head node
- 02 - production
- 03 - given to community
- 04 - 1.8.0
- 05 - dead
- 06 - given to community
- fireball - unknown
discussed: planned switch to Intel.
Later, more PPC hardware brought online, so decided to not switch to Intel as part of the automation rollout.
Staging/Production Buildbot master differences
- Signing - prod waits for signed bits, stage fakes w/ symlink ok
- Bootstrap - prod pulls tag e.g. RELEASE_AUTOMATION_M5, staging pulls tip ok
Outstanding issues
- How to handle bootstrap logs: remove them between runs? We don't want them accumulating on slaves. Resolution: remove at start
- How to do a mock release: a fake version (e.g. 1.2.3.4)? An early 2.0.0.7 that we know we won't release? Resolution: 2007 rc1
- "Source" and "Staging" steps: install a buildslave on stage, or stage everything on build-console? Resolution: use build-console
- Make sure QA checks e.g. top 5 extensions after Mac Intel switch
Caveats
Manual steps
NOTE - manual steps should be done in this order
- bootstrap configuration
- kicking off buildbot ("buildbot sendchange ...")
- update verification config (working on this in bug 373995). For now, the appropriate update configs need to be modified and checked in after all en-US builds but before updates
- win32 signing, after win32 l10n repack but before updates
- final installer signing
Bugs
- (bug 394963) Need to use cvs (in master.cfg) from ShellCommand to make sure that we always use the proper bootstrap tag
- (bug 373995) update verification should not need configuration
- (needs bug filed) "scp -r" does not work on pacifica-vm; needed for l10n
- Seems that "scp -r" would copy a directory tree to a new directory tree, but would not create all the files underneath the destination subdirectories.
- Note: emails refer to prometheus-vm, but this note refers to pacifica-vm. I tested both.
- I just tried "scp -r" on both pacifica-vm.b.m.o and also staging-pacifica-vm.b.m.o, staging-prometheus-vm.b.m.o and prometheus-vm. On all systems, it worked-for-me. Each time, I copied a multi-level-sub-dir tree, and confirmed that all files were copied correctly in each of the multi-level dirs. joduinn 30aug2007.
- Turns out to be win32 specific, and also specific to copying between machines (works fine locally, which my experiments were). Debating if we care about this on 1.8, as Tinderbox has workaround in place.
- (bug 394500) FTP area keeps getting set read-only; could be a bug in the rsync from the build machines, or maybe in the initial staging FTP area setup?
- (build) permissions on all ${os}_info.txt files were read-only user
- (updates) permissions on AUS config (snippets) and partial MARs were wrong
- (stage) permissions problems for non-stage-merged dirs; did 775 for dirs and 664 for files. Group perms seem ok, and batch-skel/stage-merged ok as well.
- bug 396438 bootstrap needs to automatically sync with stage
- build, repack, sign, updates, stage
- (bug 394034) the script /mofo/release/stage/firefox-src-tarball-nobuild in the private repo is called in the "source" step. It has a hardcoded CVSROOT, which needs to be updated to use ":ext:cltbld@staging-build-console.build.mozilla.org:/builds/cvsmirror/cvsroot"
- might be better to just delete this script and create a makefile target to tar up files. This is the only file we use from the private repo (we think), so if this file is deleted, we can stop using the private repo.
- also this is currently a problem because build-console cannot access anonymous CVS
- (bug 394507) should set buildbot up to mail based on any failures, currently just depend on bootstrap
- (bug 372746) Automatically configure bouncer
- (bug 373995) l10n needs the URL it downloads builds from to be configurable as well
- (bug 397554) Automatically check out, set up, and keep Tinderbox installs up to date
- buildbot bug#68 buildbot default timeout is too short. 5 seconds isn't always enough, and you can get a "timed out" message in the slave logs even though the slave started "normally".
- buildbot bug#85 sometimes buildmaster sees buildslave correctly, confirms ping ok, but never assigns pending work to the slave. Doing "buildmaster refresh" is not enough, you need to do "buildmaster stop/start". Restarting the slave does not help.
- buildbot bug#92 on win32, console output is not logged (goes to the DOS console running buildbot :( )
- buildbot bug#77 file buildbot bug to handle kill on win32. Add details linking to bsmedberg fix.
- buildbot bug#67 link to history for old builds at bottom of page (ala tinderbox server).
- buildbot bug#69 meta-refresh tag for waterfall page
- buildbot bug#78 buildbot UI to contain way to force build dependent steps instead of just doing current step.
- buildbot bug#91 When using the CVS Source step on a Mac OSX slave, if a CVS directory is found on the path, buildbot will attempt to use it as if it were a CVS binary.
- buildbot bug#88 steps which start within a few seconds of each other show as same start time on waterfall page
- (needs bug filed) tinderbox symbol server should be configurable
- temp workaround: tinderbox Makefile.in needs to be hacked:
FC_TUNNEL = ssh -$(FC_SSH_VERSION) -f -L 8080:hal:80 $(LSSH_USER)staging-build-console.build.mozilla.org sleep 20
SYM_TUNNEL = ssh -$(SYM_SSH_VERSION) -f -L 2222:localhost:22 $(LSSH_USER)staging-build-console.build.mozilla.org sleep 20
- (bug 394498) should report on mirror saturation after release
- staging Talkback symbol area (/data/symbols) is owned by "symbols" user, should be cleaned by the "make clean_stage" target but is not currently
- maybe have symbols have a crontab?
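Until bug 394500 is resolved, the manual permission corrections mentioned in the list above (775 for directories, 664 for files) could be applied with a short script. This is a sketch under stated assumptions: the function name is invented, the tree is assumed to be readable by the user running it, and symlinks are skipped so that link targets outside the tree are not changed.

```python
import os


def fix_permissions(top, dir_mode=0o775, file_mode=0o664):
    """Set dir_mode on every directory and file_mode on every file under top."""
    for dirpath, dirnames, filenames in os.walk(top):
        os.chmod(dirpath, dir_mode)
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Leave symlinks alone; chmod would follow them to the target.
            if not os.path.islink(path):
                os.chmod(path, file_mode)
```

It would be run against the affected candidates or release directory on the FTP server after each sync.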