CloudServices/Sync/FxSync/Syncorro

From MozillaWiki

Latest revision as of 20:12, 19 November 2013

Socorro + Sync = Syncorro \o/

People

  • Client engineering: Marina Samuel, Philipp von Weitershausen
  • Server engineering: XXX
  • Metrics: Daniel Einspanjer, Xavier Stevens
  • Product: Jennifer Arguello

Goals

  • Gather statistics on errors (to help with prioritization)
  • Be able to correlate errors with maintenance windows, user profiles, etc.
  • Simplify error reporting for users who file bugs or SUMO articles
  • Detect the "long tail" of problems that are never filed

Features

  • Each submitted report should be represented by a URL or at least an opaque token (e.g. UUID)
  • Ability to query according to application-, Sync-, and error-specific metadata
  • Fulltext search over submitted log data
  • Ability to return instructions to client upon report submission (e.g. throttling, recovery, support messages for the user, etc.)

Roadmap

  • Discuss goals and features with metrics (DONE)
  • Discuss UI mockups with UX
  • Add ability to upload Syncorro data to ElasticSearch (see bug 673318)
  • Build add-on for the Services Beta Channel

Client UX

Submitting an error report

  • When there's a Sync error, the usual error bar is shown (except we're not showing the dreadful Unknown Error message):
 -------------------------------------------------------------------------
 | We're sorry, Sync encountered a problem [Details ...]              (X)|
 -------------------------------------------------------------------------
  • Clicking on the Details button dismisses the bar and brings up a tab with a high-level explanation of the details of the error:
 ------------------------------------------------------------------------
 | Details for Sync problem on Tuesday, May 1, 2011 5:59 pm             |
 | ========================================================             |
 |                                                                      |
 | There was a problem saving the "BBC News - World" bookmark to your   |
 | computer. Other data is not affected.                                |
 |                                                                      |
 | To help Mozilla improve Sync and prevent errors like this in the     |
 | future, please submit this report. Your personal data will not be    |
 | submitted.                                                           |
 |                                                                      |
 | [X] Automatically submit reports in the future.                      |
 |                                                                      |
 |                                                     [Submit report]  |
 |                                                                      |
 | > Full report                                                        |
 |                                                                      |
 ------------------------------------------------------------------------
  • Pressing the Submit report button will submit the report. Once the report is submitted, a link to the report on the server is displayed:
 ------------------------------------------------------------------------
 | Details for Sync problem on Tuesday, May 1, 2011 5:59 pm             |
 | ========================================================             |
 |                                                                      |
 | There was a problem saving the "BBC News - World" bookmark to your   |
 | computer. Other data is not affected.                                |
 |                                                                      |
 | Firefox submitted a report of the problem to Mozilla.                |
 |                                                                      |
 | [X] Automatically submit reports in the future.                      |
 |                                                                      |
 | > Full report                                                        |
 |                                                                      |
 ------------------------------------------------------------------------
  • If the Syncorro server finds a suitable support page, the details page will display a link to it:
 ------------------------------------------------------------------------
 | Details for Sync problem on Tuesday, May 1, 2011 5:59 pm             |
 | ========================================================             |
 |                                                                      |
 | There was a problem saving the "BBC News - World" bookmark to your   |
 | computer. Other data is not affected.                                |
 |                                                                      |
 | Good news! Firefox submitted a report of the problem to Mozilla and  |
 | a possible solution was found. _View_support_page_                   |
 |                                                                      |
 | [X] Automatically submit reports in the future.                      |
 |                                                                      |
 | > Full report                                                        |
 |                                                                      |
 ------------------------------------------------------------------------
  • Clicking on the arrow next to Full report shows all information that was or may be submitted. Since there's typically a lot of it, it is divided into separate collapsible sections:
 ------------------------------------------------------------------------
 | Details for Sync problem on Tuesday, May 1, 2011 5:59 pm             |
 | ========================================================             |
 |                                                                      |
 | There was a problem saving the "BBC News - World" bookmark to your   |
 | computer. Other data is not affected.                                |
 |                                                                      |
 | Firefox submitted a report of the problem to Mozilla.                |
 |                                                                      |
 | [X] Automatically submit reports in the future.                      |
 |                                                                      |
 | \/ Full report                                                       |
 |                                                                      |
 |    Report ID: {UUID}  [Copy to clipboard]                            |
 |                                                                      |
 |    > Application details                                             |
 |    > Sync account info                                               |
 |    > Error fingerprint                                               |
 |    > Log                                                             |
 |                                                                      |
 ------------------------------------------------------------------------
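
The three states of the details page shown in the mockups above can be summarized as a small selection function. This is only a sketch: statusMessage and the shape of its report argument are hypothetical, with submitted assumed to be set once the upload succeeds and infoURL assumed to be copied from the server response when a support page was found.

```javascript
// Hypothetical helper: picks the status text for the details page.
// `report.submitted` and `report.infoURL` are assumed field names,
// not taken from the Sync code base.
function statusMessage(report) {
  if (!report.submitted) {
    // Report not sent yet: ask the user to submit it.
    return "To help Mozilla improve Sync and prevent errors like this in " +
           "the future, please submit this report. Your personal data " +
           "will not be submitted.";
  }
  if (report.infoURL) {
    // Report sent and the server suggested a support page.
    return "Good news! Firefox submitted a report of the problem to " +
           "Mozilla and a possible solution was found.";
  }
  // Report sent, no support page available.
  return "Firefox submitted a report of the problem to Mozilla.";
}
```

In the "Good news!" case the page would additionally render the View support page link pointing at infoURL.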

Looking up error reports

  • Make about:sync-log look like about:crashes, with each entry linking to the details page described in the previous section.

Client Implementation

Note: This is only a draft that is being fleshed out.

  • Using Metrics' ElasticSearch system (also used for AMO stats and Socorro) at data.mozilla.org
  • On error, Sync POSTs a payload to data.mozilla.org:
 POST /XXX HTTP/1.1
 Content-Type: application/json

 {
   id: "{UUID}",
   app: {
     product: "{UUID}",
     version: "8.0a1",
     buildID: "...",
     locale: "en_US",
     addons: ["{UUID}", "{UUID}", "{UUID}", ...]
   },
   sync: {
     version: "1.10",
     account: "eisklclxuauemrjghidis",
     cluster: "https://phx-sync091.services.mozilla.com/",
     engines: ["bookmarks", "history", ...],
     numClients: 2,
     mobileClients: true
   },
   error: {
     localTimestamp: 13294938593,
     engine: "bookmarks",
     result: 489294595, // the error constant if applicable
   },
   log: "..."
 }
  • Under normal conditions, the server returns HTTP 200 OK with optional hints for the client concerning throttling and help for the user:
 HTTP/1.1 200 OK
 Content-Type: application/json

 {
   reportURL: "http://data.mozilla.org/syncorro/{UUID}",
   throttle: 10,  // only submit every 10th error
   infoURL: "http://support.mozilla.com/..."  // optional support page
 }
  • The server can also return other status codes to indicate that the data wasn't accepted:
    • 500 Server Error
    • XXX throttled, try again later
    • XXX invalid data
  • If the client fails to upload the report (e.g. because of network connectivity problems or similar), it will retry periodically using a backoff strategy. After some number of failures, the upload is marked as permanently failed and no further retries are attempted.
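
The client-side handling described in this list could be sketched roughly as follows. None of these helpers exist in Sync; shouldSubmit, nextRetryDelay, parseSubmitResponse and all constants are hypothetical. The sketch assumes exponential backoff (the draft only says "a backoff strategy"), an arbitrary limit of five attempts, and a response body that is strict JSON with quoted keys. Because the status codes for "throttled" and "invalid data" are still undecided, every non-200 status is simply treated as "not accepted".

```javascript
// Hypothetical helpers; names and constants are assumptions for illustration.

// Decide whether the Nth error should be reported. `throttle` mirrors the
// `throttle` field of the server response: 10 means "only submit every
// 10th error". A missing or small value means no throttling.
function shouldSubmit(errorCount, throttle) {
  if (!throttle || throttle <= 1) {
    return true;
  }
  return errorCount % throttle === 0;
}

// Assumed backoff parameters (not specified in the draft).
const BASE_DELAY_MS = 60 * 1000;      // first retry after one minute
const MAX_DELAY_MS = 60 * 60 * 1000;  // never wait longer than one hour
const MAX_ATTEMPTS = 5;               // then give up permanently

// `failedAttempts` is the number of failed uploads so far (1 or more).
// Returns the delay before the next retry in milliseconds, or null once
// the upload should be treated as permanently failed.
function nextRetryDelay(failedAttempts) {
  if (failedAttempts >= MAX_ATTEMPTS) {
    return null;
  }
  return Math.min(BASE_DELAY_MS * 2 ** (failedAttempts - 1), MAX_DELAY_MS);
}

// Interpret the server's answer to a report submission.
function parseSubmitResponse(status, bodyText) {
  if (status !== 200) {
    // Data was not accepted; specific codes are still XXX in the draft.
    return { accepted: false };
  }
  const body = JSON.parse(bodyText);
  return {
    accepted: true,
    reportURL: body.reportURL,
    throttle: body.throttle || 1,   // no hint: submit every error
    infoURL: body.infoURL || null,  // optional support page
  };
}
```

With these assumed values, retries would wait 1, 2, 4 and 8 minutes before the fifth failure ends the upload for good.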

Dashboard implementation

  • Graph of number of reports over time (potentially being able to split by certain metadata, e.g. product version, Sync node, etc.)
  • Query by metadata
  • Fulltext search over logs
  • Define SUMO pages for percolator matches
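
As a rough illustration of the first item, the data behind a reports-over-time graph split by metadata could be computed like this. This is a sketch only: countByDay is a hypothetical helper operating on an in-memory array rather than on ElasticSearch, and it assumes that error.localTimestamp is in milliseconds since the epoch, which the draft does not specify.

```javascript
// Hypothetical aggregation: counts reports per UTC day, split by a
// metadata value. `getKey` extracts the metadata to split on.
// Assumes report.error.localTimestamp is milliseconds since the epoch.
function countByDay(reports, getKey) {
  const counts = {};
  for (const report of reports) {
    // "YYYY-MM-DD" prefix of the ISO timestamp identifies the UTC day.
    const day = new Date(report.error.localTimestamp)
      .toISOString()
      .slice(0, 10);
    const key = getKey(report);
    counts[day] = counts[day] || {};
    counts[day][key] = (counts[day][key] || 0) + 1;
  }
  return counts;
}
```

For example, countByDay(reports, r => r.app.version) would split the daily counts by product version, and r => r.sync.cluster would split them by Sync node.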

TODO details (talk to ddash, jbalogh)

Questions

  • Reports will probably have to be non-public for now, though it would be nice if users could view their own submitted reports... can we do some sort of token-based auth there?
  • Will this service require ToS changes?
  • What do we do with custom server users?
  • What do we do when the user has Trace logging enabled?

Discussion

Tentatively identified as not in scope for v1

  • Ops paging/integration for events. A large spike in failures could be either a new client error or a server or operational issue, and that's info that we might want to leverage. Best to leave this until we know what we're doing.