Releases/Firefox 3.0.6/Post Mortem
Revision as of 04:17, 6 February 2009
The following are the meeting notes from the Firefox 3.0.6 post-mortem held on Friday, February 6, 2009 at 12:00pm PST.
Development
- [issue here]
QA
- [issue here]
Build
- [issue here]
IT
- Bouncer slave database was disabled, causing updates to fail when the master was under high load (bug 476753). Had to pull/push updates a few times while debugging.
  - Root Cause: tm-bouncer01-slave02 was disabled in the load balancer during a previous maintenance window and never re-enabled.
  - Actions:
    - Need to monitor backend service status on the load balancer (bug 476764) in the same way we monitor origin web servers. Nagios would have alerted after the maintenance window that the second slave was missing.
    - Bouncer needs three databases to withstand a failure of one during release (bug 477183).
- Had to throttle bits after release (bug 476875) because mirrors couldn't handle load. Why? How do we ensure this doesn't happen in the future?
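
The monitoring action for bug 476764 could look roughly like the sketch below: a check that compares the backends expected in a load balancer pool against the ones currently enabled, and reports a status using the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). This is only an illustration, not the check that was deployed. The function name `check_pool`, its inputs, and the host name `other-slave` are hypothetical, and how the enabled list is fetched from the load balancer is left out because the notes do not say.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_pool(expected, enabled):
    """Return (exit_code, message) for one load balancer pool.

    expected: backend names that should be in rotation (hypothetical input).
    enabled:  backend names the load balancer reports as enabled
              (fetching this list is outside the scope of this sketch).
    """
    if not expected:
        return UNKNOWN, "UNKNOWN: no expected backends configured"
    # Any expected backend that is not enabled is out of rotation.
    missing = sorted(set(expected) - set(enabled))
    if missing:
        return CRITICAL, "CRITICAL: not in rotation: " + ", ".join(missing)
    return OK, "OK: all %d backends in rotation" % len(expected)


# Example: the situation described in the notes, with one slave left disabled.
# "other-slave" is a made-up name; only tm-bouncer01-slave02 appears in the notes.
code, message = check_pool(
    ["other-slave", "tm-bouncer01-slave02"],
    ["other-slave"],
)
print(code, message)
```

A real plugin would print the message and exit with the returned code so that Nagios can alert on it; the sketch returns the pair instead so the logic is easy to test in isolation.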
Websites
- [issue here]