CloudServices/Sync/FxSync/Syncorro
Socorro + Sync = Syncorro \o/
Goals
- Gather statistics on errors (to help with prioritization)
- Be able to correlate errors with maintenance windows, user profiles, etc.
- Simplify error reporting for users who file bugs or SUMO articles
- Detect the "long tail" of problems that are never filed
Features
- Each submitted report should be represented by a URL or at least an opaque token (e.g. UUID)
- Ability to query by application-, Sync-, and error-specific metadata
- Full-text search over submitted log data
- Ability to return instructions to client upon report submission (e.g. throttling, recovery, support messages for the user, etc.)
Client UX
Submitting an error report
- When there's a Sync error, the usual error bar is shown (but without the dreadful "Unknown Error" message):
  ------------------------------------------------------------------------
  | We're sorry, Sync encountered an error               [Details /\] (X)|
  ------------------------------------------------------------------------
- Clicking the Details button slides out the Details drawer with a high-level explanation of the error:
  ------------------------------------------------------------------------
  | We're sorry, Sync encountered an error               [Details \/] (X)|
  ------------------------------------------------------------------------
  | There was a problem saving the "bla bla bla" bookmark to your        |
  | computer. Other data is not affected.                                |
  |                                                                      |
  | To help us improve Sync and prevent errors like this in the future,  |
  | please submit this report. Your personal data will not be submitted. |
  |                                                                      |
  | [ ] Automatically submit reports in the future.                      |
  |                                                                      |
  | [Submit report]                                                      |
  ------------------------------------------------------------------------
- Clicking the X button on the right will dismiss the bar (including the drawer if it's visible).
- Clicking the Submit report button replaces it with a throbber while the report is being sent:
  ------------------------------------------------------------------------
  | We're sorry, Sync encountered an error               [Details \/] (X)|
  ------------------------------------------------------------------------
  | There was a problem saving the "bla bla bla" bookmark to your        |
  | computer. Other data is not affected.                                |
  |                                                                      |
  | To help us improve Sync and prevent errors like this in the future,  |
  | please submit this report. Your personal data will not be submitted. |
  |                                                                      |
  | [ ] Automatically submit reports in the future.                      |
  |                                                                      |
  | Submitting report... ( )                                             |  <--- this is a throbber
  ------------------------------------------------------------------------
- If the Syncorro server finds a suitable support page, the drawer will change to this:
  ------------------------------------------------------------------------
  | We're sorry, Sync encountered an error               [Details \/] (X)|
  ------------------------------------------------------------------------
  | Good news! We analyzed the report and found a possible solution to   |
  | the problem you're seeing.                                           |
  |                                                                      |
  | _View_support_page_                                                  |
  ------------------------------------------------------------------------
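The mockups above amount to a small state machine. A minimal sketch of it follows, in Python purely for illustration. All state and event names are hypothetical, and two behaviours are assumptions not stated in this draft: that clicking Details a second time closes the drawer, and that nothing changes when the server offers no support page.

```python
from enum import Enum, auto


class DrawerState(Enum):
    """Hypothetical names for the UI states shown in the mockups."""
    BAR_ONLY = auto()        # error bar visible, drawer closed
    DETAILS = auto()         # drawer open with explanation and Submit button
    SUBMITTING = auto()      # throbber shown while the report is sent
    SOLUTION_FOUND = auto()  # server pointed to a support page
    DISMISSED = auto()       # user clicked (X)


# Allowed transitions, keyed by (current state, event name).
TRANSITIONS = {
    (DrawerState.BAR_ONLY, "details"): DrawerState.DETAILS,
    # Assumption: the Details button toggles the drawer closed again.
    (DrawerState.DETAILS, "details"): DrawerState.BAR_ONLY,
    (DrawerState.DETAILS, "submit"): DrawerState.SUBMITTING,
    (DrawerState.SUBMITTING, "support_page"): DrawerState.SOLUTION_FOUND,
}


def next_state(state, event):
    """Return the state after `event`; unknown events leave it unchanged."""
    if state is DrawerState.DISMISSED:
        return state  # a dismissed bar stays dismissed
    if event == "dismiss":
        return DrawerState.DISMISSED  # (X) works with or without the drawer
    return TRANSITIONS.get((state, event), state)
```

What the drawer shows after a successful submission without a support page is not covered by the mockups, so the sketch leaves that case unmodelled.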
Looking up error reports
- Make about:sync-log look like about:crashes
Client Implementation
Note: This is only a draft that is being fleshed out.
- Uses Metrics' Elasticsearch system (also used for AMO stats and Socorro) at data.mozilla.org
- On error, Sync submits a JSON payload to data.mozilla.org:
  PUT /XXX HTTP/1.1
  Content-Type: application/json

  {
    id: "{UUID}",
    app: {
      product: "{UUID}",
      version: "8.0a1",
      buildID: "...",
      locale: "en_US",
      addons: ["{UUID}", "{UUID}", "{UUID}", ...]
    },
    sync: {
      version: "1.10",
      account: "eisklclxuauemrjghidis",
      cluster: "https://phx-sync091.services.mozilla.com/",
      engines: ["bookmarks", "history", ...],
      numClients: 2,
      mobileClients: true
    },
    error: {
      engine: "bookmarks",
      result: 489294595, // the error constant if applicable
    },
    log: "..."
  }
- Under normal conditions, the server returns HTTP 200 OK with optional hints for the client concerning throttling and help for the user:
  HTTP/1.1 200 OK
  Content-Type: application/json

  {
    throttle: 10, // only submit every 10th error
    infoURL: "http://support.mozilla.com/..."
  }
- The server can also return other status codes to indicate that the data wasn't accepted:
  - 500 Server Error
  - XXX throttled, try again later
  - XXX invalid data
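A rough sketch of the client side of this exchange, in Python for illustration only. The function names build_report, should_submit, and handle_response are hypothetical, as is the exact formatting of the report id. The sketch assumes the server reply is strict JSON with quoted keys, and because the status codes for "throttled" and "invalid data" are still XXX in this draft, it treats every non-200 status simply as "not accepted".

```python
import json
import uuid


def build_report(app, sync, error, log):
    """Assemble a report with the top-level fields of the draft payload."""
    return {
        "id": str(uuid.uuid4()),  # assumption: a random UUID identifies the report
        "app": app,      # product, version, buildID, locale, addons
        "sync": sync,    # version, account, cluster, engines, ...
        "error": error,  # engine and error constant, if applicable
        "log": log,
    }


def should_submit(error_count, throttle):
    """With throttle N, submit only every Nth error (count starts at 1)."""
    if throttle is None or throttle <= 1:
        return True  # no throttling hint received yet
    return error_count % throttle == 0


def handle_response(status, body):
    """Return (accepted, throttle, info_url) for a server reply.

    Only 200 counts as accepted; other codes are not distinguished here
    because the draft has not assigned them yet.
    """
    if status != 200:
        return (False, None, None)
    hints = json.loads(body) if body else {}
    return (True, hints.get("throttle"), hints.get("infoURL"))
```

A client would serialize the result of build_report with json.dumps, send it, and remember the returned throttle value for later calls to should_submit. If info_url is set, the drawer can offer the support page link.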
Dashboard implementation
XXX TODO
Questions
- Should error reporting be
  - always on and not configurable,
  - offered after every error (with the option to remember the user's choice for future incidents),
  - off by default and opt-in, or
  - on by default and opt-out?
- Will this service require ToS changes?
- What do we do with custom server users?
- What do we do when the user has Trace logging enabled?