CloudServices/Sync/FxSync/Syncorro: Difference between revisions

= Goals =  


* Simplify error reporting, especially for users who file bugs (no more telling them to enable logging, manually uploading the file, etc.)
* Gather statistics on errors (to help with prioritization)
* Get statistics on error frequency (to help with prioritization)
* Be able to correlate errors with maintenance windows, user profiles, etc.
* Simplify error reporting for users who file bugs or SUMO articles
* Detect the "long tail" of problems that are never filed


= Proposal 1: a dedicated service =
= Features =


* If it encounters a non-trivial error, Sync uploads the log to the service via HTTPS POST.
* Each submitted report should be represented by a URL or at least an opaque token (e.g. UUID)
* Lines with TRACE are automatically removed as they often contain sensitive information (need to audit Sync's logging for this!)
* Ability to query according to application, Sync, and error specific metadata
* Fulltext search over submitted log data
* Ability to return instructions to client upon report submission (e.g. throttling, recovery, support messages for the user, etc.)
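
The TRACE filtering mentioned in the list above could be sketched as follows. This is only an illustration, not Sync's actual code: the function name <code>strip_trace_lines</code> is hypothetical, and the tab-separated line layout (timestamp, logger name, level, message) is an assumption about the log format that would need to be checked against Sync's real logging output.

```python
def strip_trace_lines(log_text: str) -> str:
    """Return log_text without the lines whose level field is TRACE."""
    kept = []
    for line in log_text.splitlines():
        # Assumed layout: timestamp<TAB>logger name<TAB>LEVEL<TAB>message
        fields = line.split("\t")
        if len(fields) >= 3 and fields[2].strip() == "TRACE":
            continue  # drop the line, it may contain sensitive data
        kept.append(line)
    return "\n".join(kept)
```

Note that this sketch only drops lines that carry the TRACE level themselves. Continuation lines of a multi-line TRACE message would survive, which is one more reason the logging audit mentioned above is needed.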


= Proposal 2: upload to Sync db =

Same as Proposal 1, but instead of uploading to a dedicated server, Sync would upload the logs to a special collection on the user's Sync node.

== Comparing to proposal 1 ==

* Pro: No new server infrastructure required, scales as the Sync backend is scaled. (The client is pretty much oblivious to *where* the data is uploaded.)
* Con: If the user's Sync node is experiencing problems, the error report may not be uploaded. A failure in the Sync db would thus also prevent the client from uploading the logs about it.
* Con: We won't get any data from people with custom servers.
* Con: Log data is normally less confidential than the Sync data itself, but when it does contain sensitive information, that information won't be encrypted. It would therefore make sense to have a separate service with separate infrasec requirements.
* Con: The URL to identify an error will point to the user's node and will thus require the user's credentials to access, which is impractical.

= Implementation =

Note: This is only a draft that is being fleshed out.

* Using the Metrics team's Elastic Search system (also used for AMO stats and Socorro) at data.mozilla.org
* On error, Sync POSTs a payload to data.mozilla.org:

  {
    id: "{UUID}",
    app: {
      product: "{UUID}",
      version: "8.0a1",
      buildID: "...",
      locale: "en_US",
      addons: ["{UUID}", "{UUID}", "{UUID}", ...]
    },
    sync: {
      version: "1.10",
      account: "eisklclxuauemrjghidis",
      cluster: "https://phx-sync091.services.mozilla.com/",
      engines: ["bookmarks", "history", ...],
      numClients: 2,
      mobileClients: true
    },
    error: {
      engine: "bookmarks",
      result: 489294595, // the error constant if applicable
    },
    log: "..."
  }
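
As a rough sketch of how a report with the top-level fields of the draft payload above might be assembled, the following uses only the Python standard library. The function name <code>build_report</code> and its parameters are hypothetical. The sketch also assumes the payload is sent as strict JSON, whereas the draft uses unquoted keys and a comment, so the actual wire format is still open.

```python
import json
import uuid


def build_report(app: dict, sync: dict, error: dict, log_text: str) -> str:
    """Serialize one error report with the draft's top-level fields."""
    report = {
        "id": str(uuid.uuid4()),  # opaque token identifying this report
        "app": app,               # product, version, buildID, locale, addons
        "sync": sync,             # version, account, cluster, engines, ...
        "error": error,           # engine and result code, if applicable
        "log": log_text,          # log text, ideally with TRACE lines removed
    }
    return json.dumps(report)
```

The generated <code>id</code> doubles as the opaque token by which a submitted report could later be referenced, for example in a bug report.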


* Under normal conditions, the server returns

  HTTP/1.1 200 OK
  XXX

* Server can return
** XXX
** XXX

= Proposal 3 =

Use the planned [https://wiki.mozilla.org/Platform/Features/Telemetry Telemetry] service.

== Discussion ==


* Is Telemetry designed for this? Does it have an API for this (either client or server side)?
* How will users, Sync developers, and Services Ops have access to the data? Uploaded records should be accessible via URL so they can be linked to bug reports.


= Questions =
** on by default and opt out?
* Will this service require ToS changes?
* What do we do with custom server users?
* What do we do when user has Trace logging enabled?