ReleaseEngineering/Puppet/Usage

Warning: This page documents (mostly) the old release engineering Puppet deployment. See ReleaseEngineering/PuppetAgain for documentation of the current deployment.
Puppet: Usage | Server Setup | Client Setup | Links | Troubleshooting

This document is intended to serve as a guide to interacting with our Puppet servers and manifests.

Definitions

  • Type - Puppet documentation talks a lot about this. Each different "type" deals with a different aspect of the system. For example, the "user" type handles most things related to user management (passwords, UID/GID, home directories, shells, etc.), and the "package" type deals with package management (e.g. apt, rpm, fink). And so on.

Masters

  • staging-puppet.build.mozilla.org (staging, in SCL3)
  • mv-production-puppet.build.mozilla.org (MV)
  • scl-production-puppet.build.scl1.mozilla.com (SCL1)
  • scl3-production-puppet.srv.releng.scl3.mozilla.com (SCL3)
  • master-puppet1.build.mozilla.org (for buildbot-masters, in SCL1)

The Slave-Master Link

You can find out which puppet master a slave connects to by checking the contents of the appropriate file below:

# for linux testers (fedora)
~cltbld/.config/autostart/gnome-terminal.desktop
# for linux builders (centos)
/etc/sysconfig/puppet
# for osx
/Library/LaunchDaemons/com.reductivelabs.puppet.plist
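
The master's hostname appears as a literal string in each of these files, so a quick sanity check is simply to grep for it (the exact file format differs per platform):

# run on the slave; pick the file that matches its platform
grep mozilla /etc/sysconfig/puppet
grep mozilla /Library/LaunchDaemons/com.reductivelabs.puppet.plist
grep mozilla ~cltbld/.config/autostart/gnome-terminal.desktop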

If slaves have to be moved between masters, be sure to remove their certs after you modify this file and before their next reboot. You may also need to run 'puppetca --clean <FQDN>' on the new puppet master.

# for linux
find /var/lib/puppet/ssl -type f -delete
# for mac
find /etc/puppet/ssl -type f -delete

Our Puppet Manifests

Our puppet manifests are organized into a few different parts:

  • Site files
  • Basic includes
  • Packages that make changes
  • Modules

We are pushing toward organizing everything into modules, although this is not a particularly rapid process at the moment. Talk to Dustin.

Site Files & Basic Includes

Each Puppet master has its own site file which contains a few things:

  • Variable definitions specific to that master
  • Import statements which load other parts of the manifests
  • Node (slave) definitions

The basic includes are located in the 'base' directory. These files set variables which are referenced in the packages, and also define base nodes for slaves.

The most important variables to take note of are:

  • ${platform_fileroot} -- Used wherever the puppet:// protocol is supported, most notably with the File type.
  • ${platform_httproot} -- Used with the Package type and other places that don't support puppet://

There are also ${local_[file,http]} variables which point to the 'local' directory inside of each platform's root. See the following section for more on that.

We have a few base nodes shared by multiple pools of slaves as well as a base node for each concrete slave type. The shared ones are:

  • "slave" -- For things common to ALL slaves managed by Puppet
  • "build" -- For things common to all build slaves
  • "test" -- For things common to all test slaves

There are two different types of concrete nodes. Firstly, we have "$platform-$arch-$type" nodes, which are used on all Puppet masters for slaves which are local to them. Two examples are "centos5-i686-build" (32-bit, CentOS 5, build slaves) and "darwin10-i386-test" (32-bit, Mac 10.6, test slaves). Secondly, there are "$location-$type-node" nodes, which only apply to the MPT master. All nodes which are not local to MPT production are listed in its configuration file as this type of node. These nodes ensure that new slaves get redirected to their local master when they first come up. Examples include "mv-build-node" and "staging-test-node".

See base/nodes.pp for the full listing of nodes.

Packages

  • The site-{staging,production}.pp files declare the list of slaves and each slave has defined which classes to include.
  • The classes buildslave.pp and staging-buildslave.pp include most of the packages (devtools, nagios, mercurial, buildbot, extras, etc) we want.
  • The packages can use different resource "types", such as "exec", "user", "package", "file", and "service".

Modules

Going forward, puppet functionality should be encapsulated into modules. Modules include the relevant manifests, as well as files, templates, and (with some minor changes to our puppet client configs) even custom facts or types!

Modules should be generic in their purpose, and well-encapsulated. They should not be specific to one operating system or distro by design, although it's OK to omit implementations we do not need (for example, it's OK for a module providing resources only used by build slaves to error out if it's used on a Fedora system - if and when we start building on Fedora, we'll need to extend the implementation).

A module should be self-contained and have a well-documented and commented interface. If it depends on any other modules, that should also be highlighted in the comments.
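
As a rough illustration, a module in this layout typically looks like the following (the 'nrpe' module name and its files are hypothetical examples, not necessarily anything that exists in our repository):

modules/nrpe/
    manifests/init.pp       # class nrpe { ... } -- the module's resources
    files/nrpe.cfg          # static files, served as puppet:///modules/nrpe/nrpe.cfg
    templates/nrpe.cfg.erb  # ERB templates, rendered on the master via template()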

Puppet Files

The files that Puppet serves up (using File) are in /N on each puppet master. The MPT masters share this via an NFS mount, so it's easy to sync files from staging to MPT production. The other servers have a local copy of this data.

The first three levels of the directory tree are laid out as follows:

$level/$os-$hardwaremodel/$slaveType
  • $level is support level (production, staging, pre-production)
  • $os is generally one of 'centos5', 'fedora12', 'darwin9', or 'darwin10'.
  • $hardwaremodel is whatever 'facter' identifies the machine's CPU as (x86_64, i686, i386, etc).
  • $slaveType is the "type" of the slave: 'build', 'test', 'stage', 'master', etc.

Below '$slaveType' are all of the files served by Puppet. They are organized according to where they'll end up on the slave. For example, if /usr/lib/libsomethinghuge.so is to be synced to the slave, it should live in usr/lib/libsomethinghuge.so. Note that as much as possible, text files should not be kept in puppet-files -- use a module and its files/ subdirectory instead.
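
For example, using the layout above, /usr/lib/libsomethinghuge.so destined for a 64-bit CentOS 5 production build slave would be served from:

/N/production/centos5-x86_64/build/usr/lib/libsomethinghuge.so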

There are two special directories for each level/os/hardwaremodel/type combination, too:

  • local -- This directory contains files which should NOT be synced between staging <-> production or between different locations. Files such as the Puppet configs which have different contents depending on location and support level live here. Try not to use this.
  • DMGs (Mac) / RPMs (Fedora/CentOS) -- These directories contain platform specific packages which Puppet installs.

Common Use Cases

Testing

Before you test on the Puppet server it's good to run the 'test-manifests.sh' script locally. This script tests the syntax of the manifest files and catches very basic issues. It will not catch any issues with run-time code such as execs. This should really be a Makefile - bug 635067.
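
If test-manifests.sh is not handy, a rough equivalent (a sketch only -- the script may do more than this) is to run puppet's own parser over every manifest:

# run from the root of your manifests clone; catches parse errors only
for f in $(find . -name '*.pp'); do
    puppet --parseonly "$f" || echo "parse error in $f"
done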

Staging of updates is done with staging-puppet.build.mozilla.org and staging slaves. You should book staging-puppet as well as any slaves you intend to test on before making any changes to the manifests on the Puppet server. All Puppet server work is done as the root user.

Setting up the server

If you've never used the Puppet server before, you'll want to make your own clone of the manifests. You can clone the main manifests repo or your own user repo to a directory under /etc/puppet. Once you have your clone, two edits are necessary:

  • Copy the password hash into your clone's build/cltbld.pp. This can be done with the following command, run from the root of your clone:
hg -R /etc/puppet/manifests.real diff /etc/puppet/manifests.real/build/cltbld.pp | patch -p1

or more easily

patch -p1 < /etc/puppet/password

  • Copy staging.pp to site.pp and comment out all of the "node" entries except for those which you have booked.

It's easiest to use the mq extension to make these changes in a patch on your queue. Then, when you want to change revisions, just pop the patch, use 'hg pull -u', and re-push your patch.
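
For example, a typical update cycle with mq looks something like this (the clone path is hypothetical; use wherever your clone lives under /etc/puppet):

cd /etc/puppet/manifests-yourname
hg qpop -a      # pop your local patch(es)
hg pull -u      # update to the latest upstream revision
hg qpush        # re-apply your patch on top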

If you have a patch to apply to the repository, now is the time to do it.

Finally, if your changes involve edits to any files served by Puppet, apply those changes in the appropriate places under /N/staging. It's usually easiest to keep a text file tracking these changes - then you can post the contents of that file to the bug for review, so that it's clear to reviewers what changes are being made here. Because puppet-files are unversioned, try to minimize the amount of change you must make here.

Once all of that is done you can swap your manifests in with /etc/puppet/set-manifests.sh YOURNAME. Omit the name to reset them to the default ("real") manifests. If you've added new files or changed staging-fileserver.conf you'll need to restart the Puppetmaster process with:

service puppetmaster restart

although note that the daemon will pick up the changes after some short delay if you do not restart.

Now, you're ready to test.

Testing a slave

Puppet needs to run as root on the slaves, so equip yourself thusly and run the following command:

puppetd --test --logdest console --noop --server staging-puppet.build.mozilla.org

This will pull updated manifests from the server, see what needs to be done, and output that. The --noop argument tells Puppet to not make any changes to the slave. Once you're satisfied with the output of that, you can run it without the --noop to have Puppet make the changes. The output should be coloured, and indicate success/fail/exception.

If you're encountering errors or weird behaviour and the normal output isn't sufficient for debugging you can enhance it with --evaltrace and --debug. Together, they will print out every command that Puppet runs, including things which are used to determine whether a file or package needs updating.
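
For example:

puppetd --test --noop --logdest console --evaltrace --debug --server staging-puppet.build.mozilla.org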

Forcing a package re-install

Especially when testing, you may have to iterate on a single package install to get it right. If you need to re-install an existing package, you'll need to remove the package contents and/or the marker file that flags that package as installed.

  • Linux: packages installed as RPMs should be removed as you normally would: 'rpm -e rpmname' deletes the files and removes the package from the RPM database, while 'rpm -e --justdb rpmname' removes the package from the database but leaves the files in place (see the example after this list).
  • Mac: manually clean up the installed files, and remove the marker file for your package. The marker file lives under /var/db/ and will be named .puppet_pkgdmg_installed_pkgname.dmg.
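
For example, to force a re-install of a hypothetical 'sometool' package:

# Linux
rpm -e sometool
# Mac: clean up the installed files by hand, then remove the marker file
rm /var/db/.puppet_pkgdmg_installed_sometool.dmg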

You can now re-test your package install with the command above, i.e. puppetd --test ....

Cleaning up

Once you're finished testing, the manifests symlink needs to be reset to the default with:

cd /etc/puppet
./set-manifests.sh

Moving file updates to production

Production Puppet Masters:

  • mv-production-puppet.build.mozilla.org
  • scl-production-puppet.build.scl1.mozilla.com
  • scl3-production-puppet.srv.releng.scl3.mozilla.com
  • master-puppet1.build.scl1.mozilla.com

NOTE: there are a lot of files that differ between the various directories, so using rsync involves a lot of whack-a-mole to avoid syncing files that aren't part of your change. It may be easier to simply use 'cp' for this step.

When you're ready to land in production it's important to sync your files from staging to ensure you don't end up with a different result in production. Here's the process to do that. On scl3-production-puppet as root, run:

rsync -n --delete -av --include="**usr/local" --exclude=local /N/staging/ /N/production/

After verifying that only the things you want are being synced, run it without -n to push them for real:

rsync --delete -av --include="**usr/local" --exclude=local /N/staging/ /N/production/

If there are things that shouldn't be synced, carefully adjust the rsync command with --exclude or more specific paths.
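
For example, to leave an unrelated tree out of the sync (the path here is purely illustrative), add another exclude and dry-run it again before pushing for real:

rsync -n --delete -av --include="**usr/local" --exclude=local --exclude='darwin9-i386/test/**' /N/staging/ /N/production/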

Once you've landed into /N/production on scl3-production-puppet, the other production puppet masters need to be updated. In theory, this is done as 'filesync', but that user does not have permission to update the relevant directories, so in practice I suspect it's done as root. Anyway, here's the example:

sudo su - filesync
rsync -av --exclude=**/local/etc/sysconfig/puppet* --exclude=**/local/Library/LaunchDaemons/com.reductivelabs.puppet.plist* --exclude=**/local/home/cltbld/.config/autostart/gnome-terminal.desktop* --delete  filesync@scl3-production-puppet.build.mozilla.org:/N/production/ /N/production/

Again, rsync is finicky, so scp may be your friend here:

 # mv-production-puppet  
 scp -p {root@scl3-production-puppet.build.mozilla.org:/N/production,/N/production}/darwin9-i386/build/Library/Preferences/com.apple.Bluetooth.plist

 # scl-production-puppet (bug 615313)
 scp -p {root@scl3-production-puppet.build.mozilla.org:/N/production,/builds/production}/darwin9-i386/build/Library/Preferences/com.apple.Bluetooth.plist

When you're ready, update the manifests on the masters with:

hg -R /etc/puppet/manifests pull
hg -R /etc/puppet/manifests update

Note that some changes may require manifest updates first - think carefully about the intermediate state and what it will do to slaves!

Be sure to do this on all Puppet masters.
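
A sketch of doing this in one pass (hostnames as listed above under Production Puppet Masters; adjust to the current list):

for master in mv-production-puppet.build.mozilla.org \
              scl-production-puppet.build.scl1.mozilla.com \
              scl3-production-puppet.srv.releng.scl3.mozilla.com \
              master-puppet1.build.mozilla.org; do
    ssh root@$master 'hg -R /etc/puppet/manifests pull -u'
done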

Staging changes (environments)

armenzg: if you know of a script or a command that could catch stupid things like this
dustin: I used to use environments for this purpose
armenzg: what do you mean?
armenzg: what are environments?
dustin: you can specify a different envrionment on the client:
dustin: puppetd --test --environment=dustin
dustin: and then that can be configured to point to a different directory on the master
dustin: so I would push my mq'd repo there
dustin: and test with it, confident that only the slave I was messing with would be affected
catlee: armenzg: we have that set up on master-puppet1 if you want to look
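
Based on the conversation above, the client-side invocation looks like this (the environment name is just an example, and the matching directory has to be configured on the master first, e.g. on master-puppet1):

puppetd --test --noop --logdest console --environment=dustin --server master-puppet1.build.mozilla.org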

Deploy changes

csshX --login root {mv-production-puppet,scl3-production-puppet,scl-production-puppet}.build.mozilla.org
    • be sure that the files are in place across all masters, or the whole set of slaves will go down
  • make sure you deploy the changes to all puppet masters (ssh as root, or use csshX as above; see the consolidated sketch after this list)
    • scl-production-puppet
    • scl3-production-puppet
    • mv-production-puppet
    • master-puppet1
  • cd /etc/puppet/manifests/
  • hg pull -u
  • watch for a few minutes to make sure there are no errors
    • tail -F /var/log/messages
    • once you see a slave listed, go and check that it got the changes
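
A consolidated sketch of the steps above, run on each master (directly as root or via csshX):

cd /etc/puppet/manifests/
hg pull -u
# then watch the log for a few minutes to confirm slaves pick up the change cleanly
tail -F /var/log/messages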

Current Puppet Servers

An accurate list of puppet servers needs to be referenced by various procedures. Please keep the following list up to date.

Role          Data Center   Slave Puppet Master
build master  all           master-puppet1.build.mozilla.org
build slave   scl1          scl-production-puppet.build.scl1.mozilla.com
build slave   scl3          scl3-production-puppet.srv.releng.scl3.mozilla.com
build slave   mtv1          mv-production-puppet.build.mtv1.mozilla.com