Services/Sync/Meetings/2010-09-29
Latest revision as of 16:23, 29 September 2010
- Time: Wednesday at 9:15AM PDT / 12:15PM EDT / 16:15 UTC.
- Place: Mozilla HQ, 3V (Very Good, Very Mighty)
- Phone (US/Intl): 650 903 0800 x92 Conf: 8616#
- Phone (Toronto): 416 848 3114 x92 Conf: 8616#
- Phone (US): 800 707 2533 (pin 369) Conf: 8616#
Technical
Ops
- push went well last night
- Unexpected LDAP outage for MPT node
  - crashed at 6:59pm, remediated by opening firewall from mradm02 to wp-slave01/2.phx.weave.m.c
- ec2 has 4 steps to complete
  - vpn to phx
  - ldap slave
  - webserver
  - database
- followup from 2010-09-28 incidents:
  - bug 600462 production ldap approaching RAM limit, determine pre/post launch requirements and upgrade
  - bug 600192 automated weekly ldap checkpoint/cleanup
  - bug 600182 hourly rotation of ldap logs
  - bug 600571 ldap log verbosity investigation
  - bug 600562 alert when sync db queries take more than 10s
  - bug 600216 migration.php strands user when db connection fails
Fx Sync
- l10n started Monday on 1.5, 11 locales green + fr is missing one string
- one small tweak coming today, Philipp/Mike will follow up on thread
- no blockers for Fx4 b7 (Desktop)
- a couple of blockers for Fennec b1, will have fixes this week
- followup from db29 issues:
  - No smoking gun found from log analysis, though some issues were found; will dig into some different users today
  - bug 591126 root cause found, fix coming today
  - bug 600427 filed to address redundant DELETE calls in intentional server wipe cases
Fx Home
- Ragavan working on product priorities
- mconnor working with Stefan on consistency with FxSync behaviour for new backend
Server
- Push last night rolled up stable branch (cull quotas, fix meta/global-in-memcache, other fixes)
- Work ongoing on node config DB for easier management of individual nodes
- followup from incidents on 2010-09-28:
  - bug 600208 filed to send helpful errors and backoff to client when we take too long to query db/ldap
  - bug 600197 update refresh_accounts to skip certain nodes as needed