ReferencePlatforms/How To/Setup a New Reference Platform

Congratulations. You have been chosen to set up a new reference platform. Armen summarized this journey as "It will be difficult". In addition to testing the new image on the test or build machines, several other steps must be taken to ensure that our build infrastructure is ready to work with the new platform and the associated slaves. Several tasks, noted in the checklist below, can be done ahead of time to make this easier. Many of these involve IT, so open bugs accordingly.

Do you need a new master?

  • Do you have enough capacity on your master to accommodate the new slaves?
    • As a rough guide, a testing master should have no more than 80 slaves these days. (Todo: how to check which slaves are assigned to each master.)
  • Does your master reside in the same data center as your new slaves?
    • If not, you should set up a new test master so both slaves and master reside in the same data center.
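
The two questions above can be sketched as a small check. This is only an illustration, not part of any releng tool: the function name `needs_new_master`, its parameters, the data center label `other-dc`, and the sample numbers are all hypothetical. The limit of 80 is the rough figure quoted above.

```python
# Rough capacity limit for a testing master, taken from the checklist above.
MAX_SLAVES_PER_TEST_MASTER = 80


def needs_new_master(current_slaves, new_slaves, master_dc, slave_dc,
                     limit=MAX_SLAVES_PER_TEST_MASTER):
    """Return the reasons a new master is needed (empty list if none).

    All arguments are supplied by the caller; this sketch does not query
    slavealloc or any other service.
    """
    reasons = []
    total = current_slaves + new_slaves
    if total > limit:
        reasons.append(
            "capacity: %d slaves would exceed the limit of %d" % (total, limit))
    if master_dc != slave_dc:
        reasons.append(
            "data center: master is in %s but slaves are in %s"
            % (master_dc, slave_dc))
    return reasons


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    for reason in needs_new_master(70, 20, "scl3", "other-dc"):
        print(reason)
```

An empty result means the existing master can take the new slaves. Any entry in the list is a reason to continue with the next section.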

If so, open bugs for the new master

If the answer is no to any of the questions above, you'll need to set up a new master, unless there are unused masters that are already provisioned. Open a bug with IT to bring up some VMs where you can install a new master (example: bug 782870). Read ReleaseEngineering/Master_Setup to understand the steps to set up a new master. That document also describes bugs that need to be opened with various teams when setting up the new master, so read it now.

Open a bug to establish network flows to the sql server from the new master

  • Open a bug (Server Operations::ACL Request) for network flows from your new master(s) to the sql server. Example: bug 783055

Open bugs so you can puppetize the new master(s)

This is a releng bug. Example: bug 783455

Are you able to send mail to the tinderbox server from the new master?

Bug 717808 is an example. This was tested from bm37 and appears to work.

Aug 19 16:47:06 buildbot-master38 sendmail[17861]: q7JNl5wW017859: to=<dm-mail01@tinderbox.mozilla.org>, ctladdr=<root@buildbot-master38.srv.releng.scl3.mozilla.com> (0/0), delay=00:00:01, xdelay=00:00:01, mailer=esmtp, pri=120434, relay=mx1.corp.phx1.mozilla.com. [63.245.216.69], dsn=5.1.1, stat=User unknown
Aug 19 16:47:06 buildbot-master38 sendmail[17861]: q7JNl5wW017859: q7JNl6wW017861: DSN: User unknown
Aug 19 16:47:06 buildbot-master38 sendmail[17861]: q7JNl6wW017861: to=<root@buildbot-master38.srv.releng.scl3.mozilla.com>, delay=00:00:00, xdelay=00:00:00, mailer=local, pri=31748, dsn=2.0.0, stat=Sent
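
When reading log output like the excerpt above, the `dsn=` and `stat=` fields show whether each delivery succeeded. A status code starting with 2 indicates success and one starting with 5 indicates a permanent failure. The sketch below pulls those fields out of the three lines above. The helper name `parse_delivery` and the regular expression are illustrative assumptions and only cover the line shape shown here, not every sendmail log format.

```python
import re

# Matches a delivery line of the shape shown above:
#   ... to=<address>, ... dsn=X.Y.Z, stat=Some text
DELIVERY_RE = re.compile(
    r"to=<(?P<to>[^>]*)>.*\bdsn=(?P<dsn>\d\.\d+\.\d+), stat=(?P<stat>.*)$")


def parse_delivery(line):
    """Return recipient, DSN code and status for a delivery line, else None."""
    match = DELIVERY_RE.search(line)
    if match is None:
        return None
    dsn = match.group("dsn")
    return {
        "to": match.group("to"),
        "dsn": dsn,
        "stat": match.group("stat").strip(),
        # Class 2 status codes mean the message was delivered.
        "delivered": dsn.startswith("2."),
    }


# The three log lines from the excerpt above, copied verbatim.
LOG_LINES = [
    "Aug 19 16:47:06 buildbot-master38 sendmail[17861]: q7JNl5wW017859: "
    "to=<dm-mail01@tinderbox.mozilla.org>, "
    "ctladdr=<root@buildbot-master38.srv.releng.scl3.mozilla.com> (0/0), "
    "delay=00:00:01, xdelay=00:00:01, mailer=esmtp, pri=120434, "
    "relay=mx1.corp.phx1.mozilla.com. [63.245.216.69], dsn=5.1.1, "
    "stat=User unknown",
    "Aug 19 16:47:06 buildbot-master38 sendmail[17861]: q7JNl5wW017859: "
    "q7JNl6wW017861: DSN: User unknown",
    "Aug 19 16:47:06 buildbot-master38 sendmail[17861]: q7JNl6wW017861: "
    "to=<root@buildbot-master38.srv.releng.scl3.mozilla.com>, "
    "delay=00:00:00, xdelay=00:00:00, mailer=local, pri=31748, "
    "dsn=2.0.0, stat=Sent",
]

if __name__ == "__main__":
    for log_line in LOG_LINES:
        result = parse_delivery(log_line)
        if result is not None:
            print(result["to"], result["dsn"], result["stat"])
```

In this excerpt the message addressed to the tinderbox server carries `dsn=5.1.1` (User unknown), while the local bounce notice to root carries `dsn=2.0.0` (Sent).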


Open a bug for buildbot and puppet changes

This should be against releng. Example: bug 777759. These will be changes to buildbotcustom, buildbot-configs and puppet-manifests to support the new platform. The buildbot-configs patch cannot be released into a reconfig until all your testing is complete and the platform is ready to go. The buildbotcustom and puppet-manifests changes can be released at any time.

Open a bug for graph server changes

  • Testing machines and each type of build need graph server changes
    • Graph server work needs to be run on both the staging and production graph servers
    • Changes need to land in 'sql/data.sql' on the 1.0 branch of http://hg.mozilla.org/graphs. Example: bug 783660. This bug should be opened in the webdev bucket, but the patch should be approved by a releng person.
    • If this is a new build platform, make sure that graph server knows about the build platform
    • You'll also need to open a bug with Server Operations: Database to add the new machines to graphs.mozilla.org and graphs.allizom.org. Example: bug 784330

Open a bug for tbpl changes

    • tbpl needs to be patched to reflect the new platform. Bug 782826 is an example of this change.

Slavealloc changes and cnames for slaves

  • Open a bug with IT for the cnames for the slaves. Example bug 782870
  • Add the new slaves to slavealloc
  • Add the slave password to the slave_passwords table for the appropriate poolid and distro
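
The last step above amounts to inserting one row per pool and distro. The sketch below shows the shape of that insert against an in-memory SQLite database that stands in for the real slavealloc database. The table name `slave_passwords` and the `poolid` and `distro` columns come from the step above. The `password` column name, the column types, the helper `add_slave_password`, and the sample values are assumptions, so check the actual schema before running anything against production.

```python
import sqlite3


def add_slave_password(conn, poolid, distro, password):
    """Insert one password row for the given pool and distro."""
    conn.execute(
        "INSERT INTO slave_passwords (poolid, distro, password) "
        "VALUES (?, ?, ?)",
        (poolid, distro, password))
    conn.commit()


# Stand-in database with an assumed schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE slave_passwords ("
    "poolid INTEGER, distro TEXT, password TEXT)")

# Hypothetical pool id, distro name and password.
add_slave_password(conn, 1, "example-distro", "not-a-real-password")

rows = conn.execute(
    "SELECT poolid, distro FROM slave_passwords").fetchall()
print(rows)
```

Using query parameters rather than string formatting keeps the password out of the SQL text itself.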

On the client

(perhaps this should be moved to another section)
  • disable screensaver
  • disable power savings
  • test that resolution meets requirements set forth by devs. This may require a dongle.

Testing environment

  • Once your slaves are in slavealloc, tie them to your development master. Patch your development master so it includes the changes to accommodate the new platform. Set up your development master to test the new slaves and ensure that the tests run successfully for that platform.

Things to check for when running the tests:

    • That the machines reboot after the tests complete
    • That they can submit results to the graph server (more to add)

Moving new platform into production

  • After testing is complete, you should be ready to move your new platform into production.
    • Enable slaves and masters in slavealloc
    • Reconfig to enable changes to buildbot-configs and make the platform available for tests