ReleaseEngineering/NoReboots: Difference between revisions

Fill in links to source code.
   summary:    Bug 1103123 - Turn off rebooting of talos machines; r=catlee


==How no-reboot mode is enabled (idleizer, post_flight)==


Buildbot is now started and managed by [[ReleaseEngineering/Applications/Runner|runner]], which runs tasks in an infinite loop in a configured order; each task blocks until it completes. Because of this, buildbot initiates a graceful shutdown immediately after accepting a job, so that once the job has finished the remaining runner tasks can run and the loop can start again. A single runner loop looks like this:


    <tasks before buildbot> -> buildbot.py [graceful shutdown] -> <tasks after buildbot> -> post_flight.py


The graceful shutdown is initiated by [http://hg.mozilla.org/build/buildbot/file/09de3a58d602/slave/buildslave/idleizer.py#l176 idleizer.py]; [http://hg.mozilla.org/build/puppet/file/1185781bb6c1/modules/runner/files/post_flight.py post_flight.py] then decides whether to shut down the machine or go forward with another loop.
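
As a rough illustration of the loop described above, here is a minimal Python sketch. It is not runner's actual code: the function name <code>run_loops</code>, the <code>(name, callable)</code> task representation, and the <code>should_halt</code> callback standing in for the post_flight decision are all assumptions made for this example.

```python
def run_loops(tasks, should_halt, max_loops=None):
    """Run every task in order, over and over, until told to stop.

    tasks       -- list of (name, callable) pairs; hypothetical representation
    should_halt -- callable taking the number of completed loops and returning
                   True to stop; stands in for the post_flight decision
    max_loops   -- optional safety cap, only useful for demonstrations
    """
    history = []
    loops = 0
    while max_loops is None or loops < max_loops:
        for name, task in tasks:
            task()                # each task blocks until it has finished
            history.append(name)
        loops += 1
        if should_halt(loops):    # last step of a loop: stop or go around again
            break
    return history


if __name__ == "__main__":
    demo_tasks = [
        ("pre-task", lambda: None),
        ("buildbot.py", lambda: None),   # would block until graceful shutdown
        ("post-task", lambda: None),
    ]
    print(run_loops(demo_tasks, should_halt=lambda n: n >= 2))
```

The point of the sketch is only that buildbot is one blocking step among several, so the loop cannot continue until buildbot has exited.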


===post_flight checks:===


=====hostname blacklist=====


Any machine whose hostname matches a regular expression in this list is rebooted by post_flight. For example, ["^tst-", "^t-"] would reboot all test machines after every job.
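
The matching rule can be sketched in a few lines of Python. This is an illustration under assumptions, not the code in post_flight.py: the function name <code>hostname_blacklisted</code> and the sample hostnames are hypothetical, and the use of <code>re.search</code> (rather than, say, <code>re.match</code>) is a guess at the intended semantics.

```python
import re


def hostname_blacklisted(hostname, blacklist):
    """Return True if the hostname matches any regular expression in the list.

    re.search is used, so a pattern only anchors to the start of the
    hostname when it begins with '^' (as the examples in the text do).
    """
    return any(re.search(pattern, hostname) for pattern in blacklist)


if __name__ == "__main__":
    blacklist = ["^tst-", "^t-"]
    # Hypothetical hostnames, chosen only to exercise the patterns.
    for host in ("tst-linux64-001", "t-w732-001", "bld-linux64-001"):
        print(host, hostname_blacklisted(host, blacklist))
```

With the example list, only hostnames that begin with <code>tst-</code> or <code>t-</code> match; an empty list matches nothing, so no machine would be rebooted by this check.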


=====build api=====


BuildAPI is used to fetch data about the most recent job; if the job failed, the slave is rebooted.
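
A hedged sketch of that decision follows. It deliberately leaves out the BuildAPI request itself and assumes the most recent job has already been fetched into a dictionary. The <code>status</code> field name, the choice to treat warnings as non-failing, and the choice to reboot when no job data is available are all assumptions for illustration; the numeric codes follow Buildbot's standard result constants.

```python
# Buildbot's standard result codes.
SUCCESS, WARNINGS, FAILURE, SKIPPED, EXCEPTION, RETRY = range(6)

# Assumption: only these results count as "the job did not fail".
NON_FAILING = {SUCCESS, WARNINGS}


def should_reboot_after(job):
    """Decide whether to reboot based on the most recent job's record.

    job -- dict with a hypothetical "status" key holding a result code,
           or None / empty if no data could be fetched.
    """
    if not job:
        # Assumption: with no information, reboot to be on the safe side.
        return True
    return job.get("status") not in NON_FAILING


if __name__ == "__main__":
    print(should_reboot_after({"status": SUCCESS}))
    print(should_reboot_after({"status": FAILURE}))
```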


=====todo: jobname blacklist=====


Works like the hostname blacklist, but matches against the name of the most recently run job (which BuildAPI knows about).