CloudServices/Sync/FxSync/DeathToUnknownError: Difference between revisions
Rsoderberg (talk | contribs) (→Goal: no error bars for ops maintenance either.) |
Rsoderberg (talk | contribs) (→Proposal: more about ops) |
Revision as of 23:19, 11 May 2011
Goal
- No annoying error bars for recoverable errors (e.g. network errors); this is especially a requirement for Instant Sync
- No annoying error bars for recurring errors that have already been acknowledged but can't be solved without a software update
- No annoying error bars during planned server maintenance windows.
- Be more informative about problems when an error occurs ("Unknown Error" is not very helpful and doesn't help when users report bugs).
Proposal
- Network errors should NEVER escalate
  - Ensure that all network failures actually end up setting the right bits here, cf. bug 624436
  - Instead, warn the user if they haven't synced successfully for a while (e.g. a week).
- Def.: "Syncing successfully" = syncing without any *new* errors (errors that we've reported in the past should not resurface)
- Operations must be able to send a response that does not escalate for some time
  - Ops currently uses 503 + Retry-After to indicate "work underway", which escalates immediately, 100% of the time, and is thus a poor solution for planned maintenance events
  - Currently the best (and only) way to indicate "work underway, try again later" is to hard-close the connection without sending a response.
  - Outages typically last no more than 15-20 minutes, so the client SHOULD escalate after 5-10 consecutive errors
- Kill Unknown Error
  - Mention which engine the error occurred in
  - If we know, mention what part of the process the error occurred in (download, upload, etc.)
  - Wherever we throw, throw a meaningful value that can be turned into a localized (l10n) message
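
The proposal above could be sketched roughly as follows. This is only an illustration in Python, not the Sync client's actual code: the names `SyncError`, `EscalationPolicy`, `message_key`, the error kinds, and the exact thresholds (10 consecutive errors, 7 days) are all assumptions picked from the ranges mentioned in the list. Treating a sync that hits only already-reported errors as "successful" is one reading of the definition given there.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical thresholds, taken from the ranges in the proposal.
MAX_CONSECUTIVE_SERVER_ERRORS = 10       # upper end of "5-10 consecutive errors"
STALE_SYNC_WARNING = timedelta(days=7)   # "e.g. a week"


@dataclass(frozen=True)
class SyncError:
    """A meaningful error value instead of "Unknown Error" (hypothetical shape)."""
    kind: str                     # "network", "server", or "engine"
    engine: Optional[str] = None  # e.g. "bookmarks", if known
    phase: Optional[str] = None   # e.g. "download" or "upload", if known
    code: str = "unknown"

    def message_key(self) -> str:
        """Build a key that an l10n layer could map to a translated string."""
        parts = ["error", self.kind]
        if self.engine:
            parts.append(self.engine)
        if self.phase:
            parts.append(self.phase)
        parts.append(self.code)
        return ".".join(parts)


class EscalationPolicy:
    """Decides whether an error is shown ("error"), softened ("warning"), or hidden (None)."""

    def __init__(self, now: datetime) -> None:
        self.last_success = now
        self.consecutive_server_errors = 0
        self.reported_keys = set()

    def record_success(self, now: datetime) -> None:
        self.last_success = now
        self.consecutive_server_errors = 0

    def _stale_warning(self, now: datetime) -> Optional[str]:
        # Warn only if there has been no successful sync for a while.
        if now - self.last_success >= STALE_SYNC_WARNING:
            return "warning"
        return None

    def record_error(self, error: SyncError, now: datetime) -> Optional[str]:
        if error.kind == "network":
            # Network errors never escalate to an error bar.
            return self._stale_warning(now)
        if error.kind == "server":
            # Maintenance-style failures escalate only after repeated errors.
            self.consecutive_server_errors += 1
            if self.consecutive_server_errors >= MAX_CONSECUTIVE_SERVER_ERRORS:
                return "error"
            return self._stale_warning(now)
        key = error.message_key()
        if key in self.reported_keys:
            # Already reported once: do not resurface, and count the sync as successful.
            self.last_success = now
            return None
        self.reported_keys.add(key)
        return "error"
```

With this sketch, a caller would invoke `record_error` once per failed sync and show an error bar only when it returns `"error"`; `message_key()` yields values such as `error.engine.bookmarks.upload.quota` that a localization table could translate.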