Services/Sync/Meetings/2010-09-29: Difference between revisions

Latest revision as of 16:23, 29 September 2010

  • Time: Wednesday at 9:15AM PDT / 12:15PM EDT / 16:15 UTC.
  • Place: Mozilla HQ, 3V (Very Good, Very Mighty)
  • Phone (US/Intl): 650 903 0800 x92 Conf: 8616#
  • Phone (Toronto): 416 848 3114 x92 Conf: 8616#
  • Phone (US): 800 707 2533 (pin 369) Conf: 8616#

Technical

Ops

  • push went well last night
  • Unexpected LDAP outage for MPT node
    • crashed at 6:59pm; mitigated by opening the firewall from mradm02 to wp-slave01/2.phx.weave.m.c
  • ec2 has 4 steps to complete
    • vpn to phx
    • ldap slave
    • webserver
    • database
  • followup from 2010-09-28 incidents:
    • bug 600462 production ldap approaching RAM limit, determine pre/post launch requirements and upgrade
    • bug 600192 automated weekly ldap checkpoint/cleanup
    • bug 600182 hourly rotation of ldap logs
    • bug 600571 ldap log verbosity investigation
    • bug 600562 alert when sync db queries take more than 10s
    • bug 600216 migration.php strands user when db connection fails
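
The 10s query alert in bug 600562 could look roughly like the sketch below. It is only an illustration of the idea, not what the bug implements: the names timed_query and SLOW_QUERY_THRESHOLD_S are hypothetical, and "alerting" here just appends to a list where a real setup would notify monitoring.

```python
import time
from contextlib import contextmanager

# Threshold taken from the bug 600562 item; everything else here is assumed.
SLOW_QUERY_THRESHOLD_S = 10.0


@contextmanager
def timed_query(label, alerts, threshold=SLOW_QUERY_THRESHOLD_S,
                clock=time.monotonic):
    """Time the wrapped block and record an alert if it exceeds threshold.

    `alerts` is a plain list standing in for a real alerting hook.
    `clock` is injectable so the behaviour can be checked without sleeping.
    """
    start = clock()
    try:
        yield
    finally:
        elapsed = clock() - start
        if elapsed > threshold:
            alerts.append((label, elapsed))
```

Usage would be to wrap each sync db call, e.g. `with timed_query("fetch collection", alerts): ...`, and have a monitoring job read whatever the alert hook collects.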

Fx Sync

  • l10n for 1.5 started Monday; 11 locales are green, and fr is missing one string
  • one small tweak coming today, Philipp/Mike will follow up on thread
  • no blockers for Fx4 b7 (Desktop)
  • a couple of blockers for Fennec b1, will have fixes this week
  • followup from db29 issues:
    • Log analysis found no smoking gun, though it did turn up some issues; will dig into some different users today
    • bug 591126 root cause found, fix coming today
    • bug 600427 filed to address redundant DELETE calls in intentional server wipe cases

Fx Home

  • Ragavan working on product priorities
  • mconnor working with Stefan on consistency with FxSync behaviour for new backend

Server

  • Push last night rolled up stable branch (cull quotas, fix meta/global-in-memcache, other fixes)
  • Work ongoing on node config DB for easier management of individual nodes
  • followup from incidents on 2010-09-28:
    • bug 600208 filed to send helpful errors and backoff to client when we take too long to query db/ldap
    • bug 600197 update refresh_accounts to skip certain nodes as needed
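
For the bug 600208 item, one way a server could answer when a db/ldap lookup runs long is sketched below. This assumes a plain HTTP 503 with the standard Retry-After header; the notes do not say what the bug actually implements, and the function name, the 10s deadline, and the 60s retry interval are all hypothetical.

```python
def backend_response(elapsed_s, deadline_s=10.0, retry_after_s=60):
    """Return (status, headers, body) for a request whose db/ldap
    lookup took `elapsed_s` seconds.

    If the lookup exceeded the (assumed) deadline, answer with 503 and
    a Retry-After header so the client knows to back off instead of
    retrying immediately; otherwise answer normally.
    """
    if elapsed_s > deadline_s:
        headers = {"Retry-After": str(retry_after_s)}
        return (503, headers, "server busy, please retry later")
    return (200, {}, "ok")
```

The point of the design is that the client gets an explicit, readable error plus a wait time, instead of a timeout it cannot interpret.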

Metrics

Product

Roundtable

Notes and actions

Fx 4 client

Fx 4 server

Other issues