Sheriffing/How To/Escalate: Difference between revisions

From MozillaWiki
Line 1: Line 1:
{{Sheriffing How To|Escalating issues to release engineering and the MOC}}
= Guidelines on escalating build farm events to Release Engineering and the MOC =
= Guidelines on escalating outages =


If an issue appears "global" -- outside of what RelEng can control like ftp.m.o or hg.m.o being down -- then going directly to the MOC can save a lot of time -- especially during off hours. The MOC can be reached in irc #moc. The MOC knows how to contact and escalate any issue.
== MOC ==
If you cannot pinpoint the affected system or the responsible team, going directly to the MOC can save a lot of time, especially during off-hours. The MOC can be reached on IRC in #moc and knows how to contact and escalate any issue.


Other signs of a global issue include:
* you see Nagios alerts for the issue in #sysadmins
* a status is already posted at https://status.mozilla.org/ (but still let the MOC know it's tree-impacting)
== Releng ==
If it's something RelEng can handle (or you also want them to be aware), here's the escalation path:
* ping *|buildduty in #releng
* use the !squirrel stalk word in #releng
* Follow the escalation route at https://wiki.mozilla.org/ReleaseEngineering#Contacting_Release_Engineering
== TaskCluster ==
* ping in #taskcluster (see https://wiki.mozilla.org/TaskCluster for coverage in any timezone)
Other signs of a global issue include:
* you see nagios alerts for the issue in #sysadmins
* there's already a status (https://status.mozilla.org/) (but still let MOC know it's tree impacting)

Revision as of 17:29, 27 November 2017

Guidelines on escalating outages

MOC

If you cannot pinpoint the affected system or the responsible team, going directly to the MOC can save a lot of time, especially during off-hours. The MOC can be reached on IRC in #moc and knows how to contact and escalate any issue.

Other signs of a global issue include:

  • you see Nagios alerts for the issue in #sysadmins
  • a status is already posted at https://status.mozilla.org/ (but still let the MOC know it's tree-impacting)

Releng

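If it's something RelEng can handle (or you also want them to be aware), here's the escalation path:

  • ping *|buildduty in #releng
  • use the !squirrel stalk word in #releng
  • Follow the escalation route at https://wiki.mozilla.org/ReleaseEngineering#Contacting_Release_Engineering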

TaskCluster

  • ping in #taskcluster (see https://wiki.mozilla.org/TaskCluster for coverage in any timezone)
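
The routing described in the sections above can be summarised as a small decision function. This is only an illustrative sketch, not an existing sheriffing tool: the `Outage` class, its field names, and `escalation_channels` are hypothetical names introduced here, and the logic assumes that an unknown owner, a Nagios alert in #sysadmins, or an existing status page entry each mean the MOC should be contacted first.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Outage:
    """Hypothetical summary of what the sheriff has observed."""
    owner_known: bool = True                 # can you pinpoint the system or team?
    owner: str = ""                          # "releng" or "taskcluster" when known
    nagios_alert_in_sysadmins: bool = False  # alert seen in #sysadmins
    status_page_entry: bool = False          # already on status.mozilla.org


def escalation_channels(outage: Outage) -> List[str]:
    """Return the IRC channels to contact, in order (assumed routing)."""
    channels = []
    looks_global = (
        not outage.owner_known
        or outage.nagios_alert_in_sysadmins
        or outage.status_page_entry
    )
    if looks_global:
        # The MOC knows how to contact and escalate any issue.
        channels.append("#moc")
    if outage.owner_known and outage.owner == "releng":
        channels.append("#releng")       # ping *|buildduty there
    elif outage.owner_known and outage.owner == "taskcluster":
        channels.append("#taskcluster")
    return channels
```

For example, `escalation_channels(Outage(owner_known=False))` yields only `#moc`, while a TaskCluster problem that already has a status page entry yields `#moc` followed by `#taskcluster`, matching the advice to still tell the MOC that the issue is tree-impacting.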