ReleaseEngineering/Puppet/Usage
This document is intended to serve as a guide to interacting with our Puppet servers and manifests.
Definitions
- Type - Puppet documentation talks a lot about this. Each different "type" deals with a different aspect of the system. For example, the "user" type can do most things related to user management (passwords, UID/GID, homedirs, shells, etc). The "package" type deals with package management (e.g., apt, rpm, fink, etc). And so on.
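A minimal sketch of what a resource using the "user" type looks like (the name and values here are illustrative only, not taken from our manifests):

user {
    "exampleuser":
        # illustrative values only
        ensure => present,
        uid    => 1001,
        shell  => "/bin/bash",
        home   => "/home/exampleuser";
}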
Masters
We currently have four masters, with no real rhyme or reason to their hostnames:
- staging-puppet.build.mozilla.org (staging, in MPT)
- production-puppet.build.mozilla.org (MPT)
- mv-production-puppet.build.mozilla.org (MV)
- scl-production-puppet.build.scl1.mozilla.com (SCL)
Note that staging-puppet and production-puppet share an NFS-mounted /N, while the other machines use local storage.
The Slave-Master Link
You can find out which puppet master a slave connects to by checking the contents of these files:
# for linux testers (fedora)
~cltbld/.config/autostart/gnome-terminal.desktop
# for linux builders (centos)
/etc/sysconfig/puppet
# for osx
/Library/LaunchDaemons/com.reductivelabs.puppet.plist
If slaves have to be moved between masters, be sure to remove the certs after you modify the relevant file and before their next reboot. You may also need to run 'puppetca --clean <FQDN>' on the new puppet master.
# for linux
rm -rf /var/lib/puppet/ssl
# for mac
rm -rf /etc/puppet/ssl
Our Puppet Manifests
Our puppet manifests are organized into a few different parts:
- Site files
- Basic includes
- Packages that make changes
Site Files & Basic Includes
Each Puppet master has its own site file which contains a few things:
- Variable definitions specific to that master
- Import statements which load other parts of the manifests
- Node (slave) definitions
The basic includes are located in the 'base' directory. These files set variables which are referenced in the packages, and also define base nodes for slaves.
The most important variables to take note of are:
- ${platform_fileroot} -- Used wherever the puppet:// protocol is supported, most notably with the File type.
- ${platform_httproot} -- Used with the Package type and other places that don't support puppet://
There are also ${local_[file,http]} variables which point to the 'local' directory inside of each platform's root. See the following section for more on that.
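As a rough sketch of how these variables are used (the path below is hypothetical, not copied from our manifests), a File resource in a package class typically references ${platform_fileroot} like so:

file {
    "/etc/example.conf":
        # hypothetical file, served from the master's /N tree
        source => "${platform_fileroot}/etc/example.conf",
        mode   => 644;
}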
We have a few base nodes shared by multiple pools of slaves as well as a base node for each concrete slave type. The shared ones are:
- "slave" -- For things common to ALL slaves managed by Puppet
- "build" -- For things common to all build slaves
- "test" -- For things common to all test slaves
There are two different types of concrete nodes. Firstly, we have "$platform-$arch-$type" nodes, which are used on all Puppet masters for slaves which are local to them. Two examples are: "centos5-i686-build" (32-bit, CentOS 5 build slaves) and "darwin10-i386-test" (32-bit, Mac 10.6 test slaves). Secondly, there are "$location-$type-node" nodes, which only apply to the MPT master. All nodes which are not local to MPT production are listed in its configuration file as this type of node. These nodes ensure that new slaves get redirected to their local master when they first come up. Examples include "mv-build-node" and "staging-test-node".
See base/nodes.pp for the full listing of nodes.
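As an illustrative sketch (the hostname here is hypothetical), an individual slave entry in a site file inherits one of these concrete nodes:

node "linux-ix-slave99.build.mozilla.org" inherits "centos5-i686-build" {
    # per-slave overrides, if any, go here
}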
Packages
- The site-{staging,production}.pp files declare the list of slaves and each slave has defined which classes to include.
- The classes buildslave.pp and staging-buildslave.pp include most of the packages (devtools, nagios, mercurial, buildbot, extras, etc) we want.
- The packages can have different sections or "types", such as "exec", "user", "package", "file", and "service" (see the sketch below).
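A hypothetical sketch of such a class, with names and paths that are illustrative rather than copied from our manifests:

class nrpe {
    package {
        "nrpe":
            ensure => installed;
    }
    file {
        "/etc/nagios/nrpe.cfg":
            source  => "${platform_fileroot}/etc/nagios/nrpe.cfg",
            require => Package["nrpe"];
    }
    service {
        "nrpe":
            ensure    => running,
            subscribe => File["/etc/nagios/nrpe.cfg"];
    }
}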
Puppet Files
The files that Puppet serves up (using File) are in /N on each puppet master. The MPT masters share this via an NFS mount, so it's easy to sync files from staging to MPT production. The other servers have a local copy of this data.
The first three levels of the tree are laid out as follows:
$level/$os-$hardwaremodel/$slaveType
- $level is support level (production, staging, pre-production)
- $os is generally one of 'centos5', 'fedora12', 'darwin9', or 'darwin10'.
- $hardwaremodel is whatever 'facter' identifies the machine's CPU as (x86_64, i686, i386, etc).
- $slaveType is the "type" of the slave's node: 'build', 'test', 'stage', 'master', etc.
Below '$slaveType' are all of the files served by Puppet. They are organized according to where they'll end up on the slave. For example, if /etc/X11/fonts.conf is to be synced to the slave, it should live in:
etc/X11/fonts.conf
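On a master's disk, assuming a production CentOS 5 32-bit build slave, that works out to something like:

/N/production/centos5-i686/build/etc/X11/fonts.conf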
There are two special directories for each level/os/hardwaremodel/type combination, too:
- local -- This directory contains files which should NOT be synced between staging <-> production or between different locations. Files such as the Puppet configs which have different contents depending on location and support level live here.
- DMGs (Mac) / RPMs (Fedora/CentOS) -- These directories contain platform specific packages which Puppet installs.
Common Use Cases
Updating a password
Passwords are stored in a hashed format alongside other user information. We do not put the hashes in a public location for hopefully obvious reasons - please make sure you don't do this by accident.
Let's say you want to update cltbld's password. First, you need to generate the new hash. You can do that by running the following:
makepasswd --clearfrom=- --crypt-md5 # now type the password and hit ^D a couple times
Now, copy and paste that password into /etc/puppet/manifests/build/cltbld.pp as the 'password' for the cltbld user. Do this on all active puppet masters.
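The relevant part of cltbld.pp looks roughly like this (the hash shown is a placeholder, not a real one):

user {
    "cltbld":
        # placeholder hash -- paste the real makepasswd output here
        password => '$1$xxxxxxxx$yyyyyyyyyyyyyyyyyyyyyy';
}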
Installing a Package
After pushing file deployment over NFS to its limit we replaced it with native package formats for software deployment. This switch was made around June, 2010.
RPM (CentOS, Fedora)
We use a combination of 3rd party and in-house RPMs to deploy to our Linux machines. On the manifest side we use the 'rpm' package provider wrapped in a custom type to ensure installation. For packages for which 3rd party RPMs are available, skip down to the Manifests section.
Spec Files
- Guess what? We have an rpm-sources repo
This section will not go over all the ins and outs of RPM spec file creation, but will talk about which machines it should be done on and show a couple of examples. In general, there are two different ways we package things in an RPM. Firstly, there's the more "normal" build & installation in a spec file. Whenever possible, it's best to use this method of deployment. It makes RPM rebuilding simpler and often allows the same spec file to be used across multiple platforms. A simple example of this is the zope.interface package. Its spec file is as follows:
Name: zope.interface
Summary: Zope.Interface
Version: 3.3.0
Release: 0moz1
License: ???
Group: Python
Source: %{name}-%{version}.tar.gz
BuildRoot: %{_tmppath}/zope-interface-%{version}-root
Requires: python25

%define _python /tools/python-2.5.1/bin/python
%define _prefix /tools/zope-interface-%{version}
%define _localstatedir %{_prefix}/var
%define _mandir %{_prefix}/man
%define _infodir %{_prefix}/info

%description
%{name}

%prep
%setup -q

%build
%{_python} setup.py build_ext

%install
rm -rf $RPM_BUILD_ROOT
%{_python} setup.py install --root=$RPM_BUILD_ROOT --prefix=%{_prefix}

%clean
rm -fr $RPM_BUILD_ROOT

%files
%defattr(-, root, root)
%{_prefix}
Note that the package installs itself into $RPM_BUILD_ROOT and still sets an installation prefix. This ends up translating to $tmpdir/tools/zope-interface-3.3.0, in this case.
When creating a spec file for packages like this you can generally put whatever you would use to install it manually into the appropriate sections (%prep, %build, %install) with the caveat of making sure they install into $RPM_BUILD_ROOT. The %files section instructs rpmbuild to add any files listed in it to the RPM.
For packages that cannot be created through these means, a more brute force technique is used. Rather than installing the package from its original source in the spec file, the install is done outside of the spec file and packaged up. The spec file then takes that package and puts it into an RPM. Some things (most notably Scratchbox and Android SDKs) make it difficult or impossible to reliably reproduce the installations. Here's an example of this technique:
Name: gcc433
Summary: GCC 4.3.3
Version: 4.3.3
Release: 0moz1
License: ???
Group: C
# This isn't the original source package but rather a Mozilla built tarball.
# The original source package requires downloading additional pieces from
# the Internet, which is difficult to do in RPM, and makes reproducibility
# impossible.
Source0: gcc433-4.3.3.tgz
BuildRoot: %{_tmppath}/%{name}-%{version}-root
AutoReqProv: no
%define __strip /bin/true
%define __os_install_post %{nil}
%define toplevel_dir gcc-4.3.3
%define install_dir %{toplevel_dir}

%description
%{name}

%prep
rm -rf $RPM_BUILD_DIR/%{toplevel_dir}
tar -zxf %{SOURCE0} >/dev/null

%build
# none

%install
install -d -m 755 $RPM_BUILD_ROOT
mkdir -p $RPM_BUILD_ROOT/tools/%{install_dir}
rsync -av %{toplevel_dir}/ $RPM_BUILD_ROOT/tools/%{install_dir}

%clean
rm -fr $RPM_BUILD_ROOT

%files
%defattr(-, root, root)
/tools/%{install_dir}
Note that the %build section is empty here, and the %install section simply creates the destination directory and uses rsync to populate it. Also of note is the disabling of some RPM features: AutoReqProv: no, %define __strip /bin/true, and %define __os_install_post %{nil} disable the built-in dependency checking and some optimizations that rpmbuild normally performs. These are usually required when deploying packages this way.
Building RPMs
Once you have a spec file ready and the source in hand, building RPMs is pretty simple. Generally, we use staging-master (32-bit CentOS), moz2-linux64-slave07 (64-bit CentOS), and talos-r3-fed{64,}-slave01 (Fedora 32/64) as build machines for RPMs. Each of them has an 'rpmbuild' directory inside of cltbld's home directory. The directory structure inside looks as follows:
BUILD RPMS SOURCES SPECS SRPMS
Source files referenced in spec files go in SOURCES and end-result RPMs go in RPMS/$arch.
To build an RPM, head to the SPECS directory and type the following:
rpmbuild -ba --target=$arch specfile
Where $arch is 'i686' for 32-bit machines and 'x86_64' for 64-bit ones. If all goes well, you'll soon have an RPM.
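For example, building the zope.interface package from earlier on a 32-bit machine might look like this (the spec filename is assumed):

cd ~cltbld/rpmbuild/SPECS
rpmbuild -ba --target=i686 zope.interface.spec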
Manifests
The manifests are pretty simple once you have an RPM. We use a wrapper type called 'install_rpm' to perform installation. You can use it as follows:
install_rpm {
    "gcc433-4.3.3-0moz1":
        creates => "/tools/gcc-4.3.3/installed/bin/gcc",
        pkgname => "gcc433";
}
The name needs to match the package name + version. Note that RPM requires a 'vendor' version, which is where the 0moz1 comes from. 'creates' needs to be a file that the package creates, preferably the last one to get installed.
DMG+pkg (Mac)
Mac machines use pkg installers wrapped in a DMG file as a package format. On the manifest side, we use the 'pkgdmg' package provider wrapped in a custom type to deploy them. For things which are distributed in a DMG+pkg (such as Xcode) you can skip down to the manifests.
When an upstream DMG file is not available it needs to be created by hand. To do this, we do a manual installation once and then use a script to create the DMG+pkg. Here's an example, which creates a Python 2.5.2 DMG:
# The installation
tar jxvf Python-2.5.2.tar.bz2
cd Python-2.5.2
./configure --prefix=/tools/python-2.5.2
make
make install
cd ..
# DMG creation
hg clone http://hg.mozilla.org/build/puppet-manifests
./puppet-manifests/create-dmg.sh /tools/python-2.5.2 python-2.5.2 python /tools
The first argument to create-dmg.sh is the directory to package (the directory itself is included). The second argument is the name to use in the DMG/pkg filenames. The third is the string to use in the package identifier; it must be alphanumeric only. The last argument is the directory the package will install to.
On the manifests side of things a simple use of the install_dmg type will ensure a package gets installed:
install_dmg {
    "python-2.5.2.dmg":
        creates => "/tools/python-2.5.2/share";
}
The argument to "creates" should be one of the last files that will be created by the package. Internally, install_dmg checks for this file and marks the package as installed if it exists, skipping installation.
If you intend to use a package on multiple platforms, always test on each of them before rolling out any manifest changes. When in doubt, create a package on each target platform.
Testing
Before you test on the Puppet server it's good to run the 'test-manifests.sh' script locally. This script will test the syntax of the manifest files and catch very basic issues. It will not catch any issues with run-time code such as Execs.
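Assuming the script lives at the top of your local manifests checkout (this location is an assumption), running it looks like:

cd /path/to/your/manifests-checkout
sh test-manifests.sh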
Testing of updates is done with staging-puppet.build.mozilla.org and staging slaves. You should book staging-puppet as well as any slaves you intend to test on before making any changes to the manifests on the Puppet server. All Puppet server work is done as the root user.
Setting up the server
If you've never used the Puppet server before you'll want to start a clone of the manifests for yourself. You can clone the main manifests repo or your own user repo to a directory under /etc/puppet. Once you have your clone, two edits are necessary:
- Copy the password hash into your clone's build/cltbld.pp. This can be done with the following command, run from the root of your clone:
hg -R /etc/puppet/manifests.real diff /etc/puppet/manifests.real/build/cltbld.pp | patch -p1
or more easily
patch -p1 < /etc/puppet/password
- Comment out all of the "node" entries in staging.pp, except for those which you have booked.
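To get the initial clone mentioned above, something like the following works (the destination directory name here is just an example):

cd /etc/puppet
hg clone http://hg.mozilla.org/build/puppet-manifests manifests-testing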
If you have a patch to apply to the repository, now is the time to do it.
Finally, if your changes involve edits to any files served by Puppet, apply those changes in the appropriate places under /N/staging.
Staging environments do not have the site.pp manifest. When testing in a staging environment, symlink site.pp to staging.pp with the following command:
ln -s staging.pp site.pp
Once all of that is done you can swap your manifests in by adjusting the symlink on /etc/puppet/manifests. If you've added new files or changed staging-fileserver.conf you'll need to restart the Puppetmaster process with:
service puppetmaster restart
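The symlink adjustment itself is just the reverse of the cleanup step at the end of this section; assuming your clone lives at /etc/puppet/manifests-testing (a hypothetical name):

cd /etc/puppet
rm manifests
ln -s manifests-testing manifests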
Now, you're ready to test.
Testing a slave
Puppet needs to run as root on the slaves, so equip yourself thusly and run the following command:
puppetd --test --server staging-puppet.build.mozilla.org --logdest console --noop
This will pull updated manifests from the server, see what needs to be done, and output that. The --noop argument tells Puppet to not make any changes to the slave. Once you're satisfied with the output of that, you can run it without the --noop to have Puppet make the changes. The output should be coloured, and indicate success/fail/exception.
If you're encountering errors or weird behaviour and the normal output isn't sufficient for debugging you can enhance it with --evaltrace and --debug. Together, they will print out every command that Puppet runs, including things which are used to determine whether a file or package needs updating.
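For example, adding the debugging flags to the earlier invocation:

puppetd --test --server staging-puppet.build.mozilla.org --logdest console --noop --evaltrace --debug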
Forcing a package re-install
Especially when testing, you may have to iterate on a single package install to get it right. If you need to re-install an existing package, you'll need to remove the package contents and/or the marker file that flags that package as installed.
- Linux: packages installed as RPMs should be removed as one normally would for an RPM, i.e. 'rpm -e rpmname', which will delete all of the files and remove the package from the db, or 'rpm -e --justdb rpmname', which will leave the files in place and remove the package from the db.
- Mac: manually clean up the installed files, and remove the marker file for your package. The marker file lives under /var/db/ and will be named .puppet_pkgdmg_installed_pkgname.dmg.
You can now re-test your package install with the command above, i.e. 'puppetd --test ...'.
Cleaning up
Once you're finished testing, the manifests symlink needs to be re-adjusted with:
cd /etc/puppet
rm manifests
ln -s manifests.real manifests
Moving file updates to production
Production Puppet Masters:
- production-puppet.build.mozilla.org (aka mpt-production-puppet.build.mozilla.org)
- mv-production-puppet.build.mozilla.org
- scl-production-puppet.build.scl1.mozilla.com
NOTE: there are a lot of files that differ between the various directories, so using rsync involves a lot of whack-a-mole to avoid syncing files that aren't part of your change. It may be easier to simply use 'cp' for this step.
When you're ready to land in production it's important to sync your files from staging to ensure you don't end up with a different result in production. Here's the process to do that. On production-puppet as root, run:
rsync -n --delete -av --include="**usr/local" --exclude=local /N/staging/ /N/production/
After verifying that only the things you want are being synced, run it without -n to push them for real:
rsync --delete -av --include="**usr/local" --exclude=local /N/staging/ /N/production/
If there are things that shouldn't be synced, carefully adjust the rsync command with --exclude or more specific paths.
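For example, to additionally skip a platform directory you aren't touching (the darwin9-i386 exclusion here is purely illustrative), a dry run might look like:

rsync -n --delete -av --include="**usr/local" --exclude=local --exclude=darwin9-i386 /N/staging/ /N/production/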
Once you've landed into /N/production on production-puppet, the other production puppet masters need to be updated. In theory, this is done as 'filesync', but that user does not have permission to update the relevant directories, so in practice I suspect it's done as root. Anyway, here's the example:
sudo su - filesync
rsync -av --exclude=**/local/etc/sysconfig/puppet* \
      --exclude=**/local/Library/LaunchDaemons/com.reductivelabs.puppet.plist* \
      --exclude=**/local/home/cltbld/.config/autostart/gnome-terminal.desktop* \
      --delete filesync@production-puppet.build.mozilla.org:/N/production/ /N/production/
Again, rsync is finicky, so scp may be your friend here:
scp {root@production-puppet.build.mozilla.org:/N/production,/N/production}/darwin9-i386/build/Library/Preferences/com.apple.Bluetooth.plist
When you're ready, update the manifests on the masters with:
hg -R /etc/puppet/manifests pull
hg -R /etc/puppet/manifests update
Note that some changes may require manifest updates first - think carefully about the intermediate state and what it will do to slaves!
Be sure to do this on all Puppet masters.
Moving slaves between staging/production
If you need to move slaves between staging and production, you'll need to delete the existing ssl certs on the slave so it can properly sync with the new puppet master. These certs can be found under /etc/puppet/ssl on mac, or /var/lib/puppet/ssl on linux.
Documentation/Links
Puppet has reasonably complete documentation, although navigating it can be a challenge.