ReleaseEngineering:Sheriffing:HowTo

This page collects how-to information for the various duties associated with buildduty.

Try Server

How do I trigger a talos run for a given try build?

  • Someone will ping you in #build with their try run dir name (format: $email-$changeset, e.g. lsblkk@mozilla.org-4asf23fsd251d).
  • Either:
    • ssh into one of production-master{01,02,03}
    • OR run tools/buildfarm/maintenance/try_sendchange.py from your own machine
      • on the production-masters, there is a ~/try_sendchange.sh wrapper script which uses argparse in /tools/buildbot-0.8.0/bin/python

Then run:

./try_sendchange.sh $email-$changeset
# OR to do custom set of talos suites
./try_sendchange.sh $email-$changeset --t scroll,svg,nochrome
# NOTICE no spaces between comma-separated suite names!
  • The script prints every sendchange it performs.
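
Because stray spaces in the suite list are an easy mistake to make, a small helper can normalize the arguments before you call the script. The following is a minimal sketch and not part of the tools repo: the function names normalize_suites and build_sendchange_cmd are hypothetical, and the helper only prints the command line so you can check it before running it by hand.

```shell
#!/bin/sh
# Hypothetical helpers; not part of tools/buildfarm.

# Strip all whitespace from a comma-separated suite list,
# e.g. "scroll, svg, nochrome" -> "scroll,svg,nochrome".
normalize_suites() {
    printf '%s' "$1" | tr -d '[:space:]'
}

# Print (do not run) the try_sendchange.sh command line for a
# try run dir and an optional suite list.
build_sendchange_cmd() {
    run_dir=$1
    suites=$2
    if [ -z "$run_dir" ]; then
        echo "usage: build_sendchange_cmd EMAIL-CHANGESET [SUITES]" >&2
        return 1
    fi
    if [ -n "$suites" ]; then
        printf './try_sendchange.sh %s --t %s\n' \
            "$run_dir" "$(normalize_suites "$suites")"
    else
        printf './try_sendchange.sh %s\n' "$run_dir"
    fi
}
```

For example, build_sendchange_cmd lsblkk@mozilla.org-4asf23fsd251d 'scroll, svg, nochrome' prints ./try_sendchange.sh lsblkk@mozilla.org-4asf23fsd251d --t scroll,svg,nochrome.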

How do I cancel existing jobs?

On pm02, run 'history | grep cancell' to see sample usage.

To loan slaves

  • change the cltbld and root passwords (passwd)
  • change the VNC password (Linux: vncpasswd / Windows: UltraVNC server "admin properties" in the task bar at the bottom right / OS X: System Preferences -> Sharing)
  • disable buildbot from running after reboot (rename buildbot.tac / rename startTalos.bat for Windows)
  • [only for build slaves] remember to remove all .ssh keys
  • give the developer the IP address, the cltbld password and the VNC password
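
For a Linux slave, the checklist above can be sketched as a dry-run helper that prints the commands instead of running them, so each one can be reviewed and executed by hand as root. This is only an illustration under stated assumptions: the function name print_loan_steps, the buildbot.tac path, the ".disabled" suffix and the location of the ssh keys are all hypothetical and need to be checked against the actual machine.

```shell
#!/bin/sh
# Sketch for a Linux slave only. Prints the loan-prep commands;
# nothing is executed.
# ASSUMPTIONS: the buildbot.tac path, the ".disabled" suffix and the
# ssh key location are hypothetical placeholders.

print_loan_steps() {
    tac_path=${1:-/path/to/buildbot.tac}   # placeholder; pass the real path
    is_build_slave=${2:-no}                # "yes" for build slaves

    echo "passwd cltbld"
    echo "passwd root"
    echo "vncpasswd"
    # Renaming buildbot.tac keeps buildbot from starting after a reboot.
    echo "mv ${tac_path} ${tac_path}.disabled"
    if [ "$is_build_slave" = "yes" ]; then
        # Build slaves only: remove the ssh keys (verify the location first).
        echo "rm -f ~cltbld/.ssh/*"
    fi
}
```

Calling print_loan_steps /some/dir/buildbot.tac yes lists all five steps; omit the second argument for a non-build slave to leave out the ssh key removal.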

To change autologin

  • start -> run -> control userpasswords2
    • (on w7, start -> Search programs and files -> netplwiz)
  • check the option “Users must enter a user name and password to use this computer”
  • apply
  • uncheck the option “Users must enter a user name and password to use this computer”
  • apply
  • account: cltbld, enter new password twice

Dealing with machines

  • The current downtime bug should always be aliased as "releng-downtime": http://is.gd/cQO7I
  • The current machine reboots bug should always be aliased as "reboots": http://is.gd/dqSV0

Mobile

n810s

Once a device hits a hard state (100% of retries), it is dead. Please use this template to file a new bug with the device names.

  • 8 devices per bug max
  • if the newest open reimage bug has fewer than 8 devices, please add to it until it has 8
  • once the newest bug has 8 devices in it, open a new bug
  • any bug that is resolved should not have any devices added to it
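
When many devices die at once, splitting the list into groups of 8 by hand is error-prone. A minimal sketch, assuming the dead device names are available one per line on stdin (the function name batch_devices is hypothetical): it relies on xargs with no command, which echoes its arguments, to print at most 8 names per output line. It only covers fresh batches; topping up the newest open bug still has to be done by hand first.

```shell
#!/bin/sh
# Read device names (one per line) on stdin and print them in
# groups of at most 8, one group per output line.
# Each output line corresponds to the device list for one bug.
batch_devices() {
    xargs -n 8
}
```

Each printed line can then be pasted into one reimage bug.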

Nagios

Coordinate downtime with IT

  • Some IT maintenance requires tree closure. Details here: ReleaseEngineering:RelEngITSharedDowntime
  • If possible, consolidate RelEng and IT downtimes that need tree closures, to avoid having two tree closures soon after each other. This is "nice to do", not a "requirement"; if doing two separate downtimes reduces risk, that's fine!