Sheriffing/How To/Escalate: Difference between revisions

From MozillaWiki
No edit summary
m (removed google doc link because there were too many random requests for access.)
 
(5 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{Sheriffing How To|Escalating issues to release engineering and the MOC}}
= Guidelines on escalating build farm events to Release Engineering and the MOC =
= Guidelines on escalating outages =


If an issue appears "global" (outside of what RelEng can control, such as ftp.m.o or hg.m.o being down), going directly to the MOC can save a lot of time, especially during off hours. The MOC can be reached either on IRC in #moc or at moc@mozilla.com. The MOC knows whom to contact and how to escalate any issue.
== RelOps ==
Escalate infrastructure issues and issues with continuous integration (CI) in #ci on IRC. Ask Aryx for a list of the best-suited contacts.


Other signs of a global issue include:
* you see Nagios alerts for the issue in #sysadmins
* a status is already posted at https://status.mozilla.org/ (but still let the MOC know it's tree-impacting)
== Releng and Ciduty ==
If it's something RelEng can handle (or you also want them to be aware), here's the escalation path:
* ping *|buildduty in #releng
* use the !squirrel stalk word in #releng
* Follow the escalation route at https://wiki.mozilla.org/ReleaseEngineering#Contacting_Release_Engineering
== TaskCluster ==
* ping in #taskcluster (see https://wiki.mozilla.org/TaskCluster for coverage in any timezone)
Other signs of a global issue include:
* you see nagios alerts for the issue in #sysadmins
* there's already a whistle pig notification (https://whistlepig.mozilla.org/) (but still let MOC know it's tree impacting)

Latest revision as of 13:38, 7 July 2019

Guidelines on escalating outages

RelOps

Escalate infrastructure issues and issues with continuous integration (CI) in #ci on IRC. Ask Aryx for a list of the best-suited contacts.

Other signs of a global issue include:

  • you see Nagios alerts for the issue in #sysadmins
  • a status is already posted at https://status.mozilla.org/ (but still let the MOC know it's tree-impacting)

Releng and Ciduty

If it's something RelEng can handle (or you also want them to be aware), here's the escalation path:

  • Follow the escalation route at https://wiki.mozilla.org/ReleaseEngineering#Contacting_Release_Engineering

TaskCluster

  • ping in #taskcluster (see https://wiki.mozilla.org/TaskCluster for coverage in any timezone)