Privacy/Reviews/Firefox Home: Difference between revisions

From MozillaWiki
 
(29 intermediate revisions by 2 users not shown)
Line 4: Line 4:
|'''Feature/Product:''' || Firefox Home
|'''Feature/Product:''' || Firefox Home
|-
|-
|'''Projected Feature Freeze Date:''' || <section begin='eta' />(tbd)<section end='eta' />
|'''Projected Feature Freeze Date:''' || <section begin='eta' /> Cancelled <section end='eta' />
|-
|-
|'''Product Champions:''' || (your name here)
|'''Product Champions:''' || -
|-
|-
|'''Privacy Champions:''' || [[User:Sidstamm|Sid Stamm]]
|'''Privacy Champions:''' || [[User:Sidstamm|Sid Stamm]]
Line 12: Line 12:
|'''Security Contact:''' || Michael Coates
|'''Security Contact:''' || Michael Coates
|-
|-
|'''Document State:''' || <section begin='status'/>{{new|}}<section end='status'/>
|'''Document State:''' || <section begin='status'/>{{Resolved|Obsolete -- project with this design dropped}}<section end='status'/>
|}
|}


Timeline:
Timeline:


{|
{|
|'''Architectural Overview:''' || 27-April-2011
|'''Architectural Overview:''' || 27-April-2011 (crypto proxy) TBD (home server)
|-
|-
|'''Recommendation Meeting:''' || (date TBD)
|'''Recommendation Meeting:''' || cancelled
|-
|-
|'''Wrap-up Meeting:''' || (if necessary)
|'''Wrap-up Meeting:''' || cancelled
|}
|}


Line 30: Line 31:


'''The main objective of this feature/product is:''' (describe the goals of the feature/product here)
'''The main objective of this feature/product is:''' (describe the goals of the feature/product here)
* [[Home/Roadmap]]


'''Design Documents''':  
'''Design Documents''':  
Link to any design or architectural documents here.
Link to any design or architectural documents here.
* [[Firefox_Home_Options|Pros and Cons of different architectures]]
* [[Firefox_Home_Components]]
* [[Firefox_Home_Server_Architecture]]


'''Feature Pages''':
'''Feature Pages''':
[[Home/Features/crypto/proxy]]
* [[Home/Features/crypto/proxy]]


== Components  ==
== Components  ==
Line 41: Line 48:
Describe any major components in the system and how they interact.  Also include any third-party APIs (those Mozilla does not control) and what type of data is sent or received via those APIs.
Describe any major components in the system and how they interact.  Also include any third-party APIs (those Mozilla does not control) and what type of data is sent or received via those APIs.


=== Component X  ===
[[File:HomeDigraph.png]]


This component does A, B and C and interacts with component Y to do D.
=== Crypto Proxy  ===
 
This component connects to your sync account and acts (as a sync client) as a proxy to decrypt your data.
[[Home/Features/crypto/proxy]]


The tables below simply summarize the data encountered by this component.  
The tables below simply summarize the data encountered by this component.  
Line 54: Line 64:
! Where
! Where
|-
|-
| data type
| usernames + sync auth tokens (for accessing users' data)
| where stored
| server's db?
|}
 
 
'''Communication with [[Firefox_Sync|Sync Client (Firefox)]]'''
{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''In:''
| createAccount()
| username
| Called by sync client when users elect to enable web access
|-
| ''Out:''
| createAccount() return
| access token
| token for obtaining user's key for tab/bookmark/history collections sent to sync client (given to home)
|}
 
 
'''Communication with [[Services/Sync|Sync Server]]'''
{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''In:''
| sync() return
| encrypted tabs/bookmarks/history
| Called to get access to user's sync data
|-
| ''Out:''
| sync() call
| access token + username
| Called to obtain access to encrypted data (which will be decrypted and sent to Home Server)
|}
 
 
'''Communication with Home Server'''
{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''In:''
| sync() call
| username + access token
| called by home to obtain user's sync data
|-
| ''Out:''
| sync() return
| decrypted data
| user's unencrypted sync data
|}
|}


'''Communication with Component Y'''  
=== Home Web Servers ===
 
We will have stateless web servers that run the Home web application. These are standard web servers running Apache or NGINX to serve the Home web application.
 
These servers will likely be load balanced by Zeus.
 
These servers are supposed to be stateless, so no data will be stored on them. However, they might hold sensitive configuration settings, for example web service keys or tokens that we need to connect to third-party services. These are not user specific but belong to the Home application.
 
(These external services have not been identified yet, but think about services like bit.ly.)
 
The tables below simply summarize the data encountered by this component.
 
'''Stored Data:'''
 
None, except probably configuration data.
 
'''Communication with MemCache Server'''  


{| class="wikitable"
{| class="wikitable"
Line 68: Line 154:
|-
|-
| ''In:''  
| ''In:''  
| message 1
| Get Web Session
| types of data received from component Y with the message
| The Web Session object.
|  
|  
|-
|-
| ''Out:''  
| ''Out:''  
| message 2
| Put Web Session
| types of data sent to component Y with the message
| The Web Session object.
|
|}
 
'''Communication with Home Database Servers'''
 
{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''Select''
| Get the user's (summarized) sync data
| -
|
|-
| ''Out:''
| Insert/Update User's Web App Settings/Prefs
| -
|
|}
 
=== Home Database Servers ===
 
User data will be sharded over a number of database servers. We will use a simple hashing mechanism so that we can determine where a user's data lives based on, for example, their username.
 
Each database will contain a plaintext version of the user's sync data. Initially that means bookmarks, history and tabs. The data will be normalized and properly indexed a bit more than it currently is in the Sync Servers so that it is easier to query for things.
 
On each server, all data for all users will be stored in a single database. This means that every record carries a username or userid field that ties it to a specific user. Queries will have to be constructed carefully so that they always filter on this field.
 
(We can probably also switch to one database per user, which will mean that there is a more logical separation between users' data. However, that does not rule out bugs in the front-end code that expose other users' data, of course.)
 
One thing we will probably do is run some queries offline. For example we can periodically 'calculate' a list of your top sites and store that in a database table too.
 
The tables below simply summarize the data encountered by this component.
 
'''Stored Data:'''
 
{| class="wikitable"
|-
! What
! Where
|-
| User's Bookmarks (Sync Data)
| MySQL Database
|-
| User's History (Sync Data)
| MySQL Database
|-
| User's Tabs (Sync Data)
| MySQL Database
|-
| User specific settings/prefs for Firefox Home
| MySQL Database
|-
| User access token for the Crypto Proxy
| MySQL Database
|}
 
'''Communication with other components or services'''
 
The database servers will periodically run a job to schedule a Sync operation for those users that are active users of Firefox Home. These jobs are submitted to a RabbitMQ server and picked up by the 'Syncer' component. These tasks only contain the username to be synced.
 
(Idea: Many users try out a new service and then forget about it. We could proactively delete user's data when they do not use Firefox Home for a certain period of time. Note that in the first couple of releases of Home there will not be any user generated data, just a copy of your existing Sync Data. So this is less scary than it sounds.)
 
=== Home Memcache Servers ===
 
The memcache servers are used to cache frequently used data to make the web app as responsive as possible. Initially just Web Application session objects are stored in memcache. These sessions are  Python objects that contain user specific cached data.
 
(Not sure what will actually be in there. Possibly fragments of JSON or HTML, or lists of things that we generate from your bookmarks & history.)
 
'''Stored Data:'''
 
{| class="wikitable"
|-
! What
! Where
|-
| Web Application Session
| MemCache Server
|}
 
=== Home Syncer ===
 
The 'Syncer' is a component that implements a sync client. It listens to a RabbitMQ queue to grab sync tasks and runs sync sessions.
 
The task that it gets from RabbitMQ contains just the username. This means that the Syncer will have to access the Home Database Servers to obtain the access token for the Crypto Proxy.
 
It will then run a sync session for the specific user against the Crypto Proxy and store the synced data (bookmarks, tabs, history) in the Home Database Servers.
 
'''Stored Data:'''
 
None
 
'''Communication with Home Database Servers'''
 
{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''Select''
| Get User's Proxy Access Token
| (Access Token)
|
|-
| ''Insert/Update''
| Update the Sync Data
| (Bookmarks, History, Tabs)
|  
|  
|}
'''Communication with Crypto Proxy'''
{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''In:''
| sync() return
| unencrypted tabs/bookmarks/history
| Called to get access to user's sync data
|-
| ''Out:''
| sync() call
| access token + username
| Called to obtain access to sync data
|}
|}


Line 116: Line 332:
! Details
! Details
|-
|-
| {{new|Initial Overview Discussion}}
| {{done|Initial Overview Discussion}}
| ?
| Stuart, rnewman, Stefan, Sid, Alex, secteam, infrasec
|
| Meeting: 26-April-2011
|-
| {{ok|Finish documenting system, produce recommendations}}
| Sid, Home Team, Privacy
|
| In progress
|-
| {{new|Discuss privacy recommendations}}
| Home team + Privacy
|  
|  
| Meeting time TBD
| Meeting time TBD
Line 123: Line 349:




[[Category:Privacy/Reviews|Template]]
[[Category:Privacy/Reviews|Home]]

Latest revision as of 22:02, 12 July 2011

Document Overview

Feature/Product: Firefox Home
Projected Feature Freeze Date: Cancelled
Product Champions: -
Privacy Champions: Sid Stamm
Security Contact: Michael Coates
Document State: [RESOLVED] Obsolete -- project with this design dropped


Timeline:

Architectural Overview: 27-April-2011 (crypto proxy) TBD (home server)
Recommendation Meeting: cancelled
Wrap-up Meeting: cancelled

Architecture

In this section, the product's architecture is described. Any individual components or actors are identified, their "knowledge" or what data they store is identified, and data flow between components and external entities is described.

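The main objective of this feature/product is: (describe the goals of the feature/product here)

* Home/Roadmap

Design Documents: Link to any design or architectural documents here.

* Pros and Cons of different architectures (Firefox_Home_Options)
* Firefox_Home_Components
* Firefox_Home_Server_Architecture

Feature Pages:

* Home/Features/crypto/proxy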

Components

Describe any major components in the system and how they interact. Also include any third-party APIs (those Mozilla does not control) and what type of data is sent or received via those APIs.

HomeDigraph.png

Crypto Proxy

This component connects to your sync account and acts (as a sync client) as a proxy to decrypt your data. See Home/Features/crypto/proxy.

The tables below simply summarize the data encountered by this component.

Stored Data:

{| class="wikitable"
|-
! What
! Where
|-
| usernames + sync auth tokens (for accessing users' data)
| server's db?
|}


Communication with Sync Client (Firefox)

{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''In:''
| createAccount()
| username
| Called by sync client when users elect to enable web access
|-
| ''Out:''
| createAccount() return
| access token
| token for obtaining user's key for tab/bookmark/history collections sent to sync client (given to home)
|}


Communication with Sync Server

{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''In:''
| sync() return
| encrypted tabs/bookmarks/history
| Called to get access to user's sync data
|-
| ''Out:''
| sync() call
| access token + username
| Called to obtain access to encrypted data (which will be decrypted and sent to Home Server)
|}


Communication with Home Server

{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''In:''
| sync() call
| username + access token
| called by home to obtain user's sync data
|-
| ''Out:''
| sync() return
| decrypted data
| user's unencrypted sync data
|}
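
The message flow in the tables above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual proxy: the class name `CryptoProxy`, the injected `fetch_encrypted` and `decrypt` callables (stand-ins for the Sync Server call and the real decryption step), and the in-memory token store are all hypothetical.

```python
import secrets
from typing import Callable, Dict, List


class CryptoProxy:
    """Toy model of the Crypto Proxy message flow (all names hypothetical)."""

    def __init__(self,
                 fetch_encrypted: Callable[[str], List[bytes]],
                 decrypt: Callable[[str, bytes], str]) -> None:
        # Stand-ins for the Sync Server request and the decryption step.
        self._fetch_encrypted = fetch_encrypted
        self._decrypt = decrypt
        # Assumption: the proxy remembers the token it issued per username.
        self._tokens: Dict[str, str] = {}

    def create_account(self, username: str) -> str:
        """In: username from the sync client. Out: a fresh access token."""
        token = secrets.token_urlsafe(32)
        self._tokens[username] = token
        return token

    def sync(self, username: str, access_token: str) -> List[str]:
        """In: username + access token from Home. Out: decrypted records."""
        expected = self._tokens.get(username)
        # Compare in constant time so the check does not leak token prefixes.
        if expected is None or not secrets.compare_digest(
                expected.encode("utf-8"), access_token.encode("utf-8")):
            raise PermissionError("unknown user or bad access token")
        return [self._decrypt(username, blob)
                for blob in self._fetch_encrypted(username)]
```

The point of the sketch is that Home never sees key material: it only presents the username and access token, and gets plaintext records back.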

Home Web Servers

We will have stateless web servers that run the Home web application. These are standard web servers running Apache or NGINX to serve the Home web application.

These servers will likely be load balanced by Zeus.

These servers are supposed to be stateless, so no data will be stored on them. However, they might hold sensitive configuration settings, for example web service keys or tokens that we need to connect to third-party services. These are not user specific but belong to the Home application.

(These external services have not been identified yet, but think about services like bit.ly.)

The tables below simply summarize the data encountered by this component.

Stored Data:

None, except probably configuration data.

Communication with MemCache Server

{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''In:''
| Get Web Session
| The Web Session object.
|
|-
| ''Out:''
| Put Web Session
| The Web Session object.
|
|}

Communication with Home Database Servers

{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''Select''
| Get the user's (summarized) sync data
| -
|
|-
| ''Out:''
| Insert/Update User's Web App Settings/Prefs
| -
|
|}

Home Database Servers

User data will be sharded over a number of database servers. We will use a simple hashing mechanism so that we can determine where a user's data lives based on, for example, their username.
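
A minimal sketch of such a hashing mechanism, assuming the shard is chosen from the username alone. The function name `shard_for` and the host names are made up for illustration; the document does not specify which hash would be used.

```python
import hashlib
from typing import Sequence


def shard_for(username: str, shards: Sequence[str]) -> str:
    """Pick the database server that holds this user's data."""
    # Use a stable digest; Python's built-in hash() is salted per process,
    # so it would map the same username to different shards across restarts.
    digest = hashlib.sha256(username.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


SHARDS = ["homedb1", "homedb2", "homedb3"]  # hypothetical host names
print(shard_for("alice", SHARDS))
```

Note that plain modulo hashing moves most users to a different shard whenever the number of servers changes, so a scheme like this would need a migration plan (or consistent hashing) before shards are added.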

Each database will contain a plaintext version of the user's sync data. Initially that means bookmarks, history and tabs. The data will be normalized and properly indexed a bit more than it currently is in the Sync Servers so that it is easier to query for things.

On each server, all data for all users will be stored in a single database. This means that every record carries a username or userid field that ties it to a specific user. Queries will have to be constructed carefully so that they always filter on this field.
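
As a rough illustration of what a properly scoped query looks like, the sketch below uses Python's built-in `sqlite3` as a stand-in for MySQL. The `bookmarks` table, its columns and the helper `bookmarks_for` are assumptions made for the example, not the planned schema.

```python
import sqlite3

# In-memory database standing in for one Home database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookmarks (username TEXT NOT NULL, url TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO bookmarks (username, url) VALUES (?, ?)",
    [("alice", "https://example.org/a"),
     ("bob", "https://example.org/b"),
     ("alice", "https://example.org/c")],
)


def bookmarks_for(conn: sqlite3.Connection, username: str) -> list:
    """Return only the given user's bookmark URLs."""
    # The username is always bound as a parameter and always part of the
    # WHERE clause, so a query can never return another user's rows.
    rows = conn.execute(
        "SELECT url FROM bookmarks WHERE username = ? ORDER BY url",
        (username,),
    )
    return [url for (url,) in rows]
```

Routing every read through a small set of helpers like this makes it easier to review that no query forgets the user filter.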

(We can probably also switch to one database per user, which will mean that there is a more logical separation between users' data. However, that does not rule out bugs in the front-end code that expose other users' data, of course.)

One thing we will probably do is run some queries offline. For example we can periodically 'calculate' a list of your top sites and store that in a database table too.
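
One way such an offline 'top sites' calculation could look, assuming it simply counts history visits per host. The function name `top_sites` and the counting rule are illustrative guesses; the document does not say how top sites would be ranked.

```python
from collections import Counter
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit


def top_sites(history_urls: Iterable[str], limit: int = 5) -> List[Tuple[str, int]]:
    """Count visits per host and return the most visited hosts."""
    counts = Counter()
    for url in history_urls:
        host = urlsplit(url).hostname
        if host:  # skip entries without a host, e.g. about: pages
            counts[host] += 1
    return counts.most_common(limit)
```

The result is small enough to store in its own table per user, so the web app can read it without scanning the full history on every page load.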

The tables below simply summarize the data encountered by this component.

Stored Data:

{| class="wikitable"
|-
! What
! Where
|-
| User's Bookmarks (Sync Data)
| MySQL Database
|-
| User's History (Sync Data)
| MySQL Database
|-
| User's Tabs (Sync Data)
| MySQL Database
|-
| User specific settings/prefs for Firefox Home
| MySQL Database
|-
| User access token for the Crypto Proxy
| MySQL Database
|}

Communication with other components or services

The database servers will periodically run a job to schedule a Sync operation for those users that are active users of Firefox Home. These jobs are submitted to a RabbitMQ server and picked up by the 'Syncer' component. These tasks only contain the username to be synced.
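
The scheduling job might look something like the sketch below. It assumes "active" means "seen within the last 30 days" and that a task is a small JSON message; both are assumptions, as is the function name `build_sync_tasks`. Publishing the messages to RabbitMQ is left out.

```python
import json
from datetime import datetime, timedelta
from typing import Dict, List


def build_sync_tasks(last_seen: Dict[str, datetime],
                     now: datetime,
                     active_window: timedelta = timedelta(days=30)) -> List[bytes]:
    """Return one message body per active user, carrying only the username."""
    tasks = []
    for username, seen in sorted(last_seen.items()):
        if now - seen <= active_window:
            # No token and no sync data in the message: just the username.
            tasks.append(json.dumps({"username": username}).encode("utf-8"))
    return tasks
```

Keeping the access token out of the message means the queue itself never holds credentials; the Syncer has to look the token up in the database when it handles the task.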

(Idea: Many users try out a new service and then forget about it. We could proactively delete user's data when they do not use Firefox Home for a certain period of time. Note that in the first couple of releases of Home there will not be any user generated data, just a copy of your existing Sync Data. So this is less scary than it sounds.)

Home Memcache Servers

The memcache servers are used to cache frequently used data to make the web app as responsive as possible. Initially just Web Application session objects are stored in memcache. These sessions are Python objects that contain user specific cached data.

(Not sure what will actually be in there. Possibly fragments of JSON or HTML, or lists of things that we generate from your bookmarks & history.)
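
To make the Get/Put Web Session messages concrete, here is a small in-memory stand-in for the cache. The class name `SessionCache`, the JSON serialization and the 30 minute lifetime are all assumptions for the sketch; a real deployment would talk to the memcache servers instead of a dict.

```python
import json
import time
from typing import Dict, Optional, Tuple


class SessionCache:
    """In-memory stand-in for the memcache server (hypothetical interface)."""

    def __init__(self, ttl_seconds: float = 1800.0) -> None:
        self._ttl = ttl_seconds
        # session id -> (expiry time, serialized session)
        self._store: Dict[str, Tuple[float, bytes]] = {}

    def put_session(self, session_id: str, session: dict) -> None:
        """'Put Web Session': serialize and store with an expiry time."""
        expires = time.monotonic() + self._ttl
        self._store[session_id] = (expires, json.dumps(session).encode("utf-8"))

    def get_session(self, session_id: str) -> Optional[dict]:
        """'Get Web Session': return the session, or None on a miss."""
        entry = self._store.get(session_id)
        if entry is None:
            return None
        expires, payload = entry
        if time.monotonic() >= expires:
            del self._store[session_id]  # expired entries count as misses
            return None
        return json.loads(payload)
```

Because a cache miss is always possible, the web servers would need to be able to rebuild a session from the database, which fits the goal of keeping them stateless.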

Stored Data:

{| class="wikitable"
|-
! What
! Where
|-
| Web Application Session
| MemCache Server
|}

Home Syncer

The 'Syncer' is a component that implements a sync client. It listens to a RabbitMQ queue to grab sync tasks and runs sync sessions.

The task that it gets from RabbitMQ contains just the username. This means that the Syncer will have to access the Home Database Servers to obtain the access token for the Crypto Proxy.

It will then run a sync session for the specific user against the Crypto Proxy and store the synced data (bookmarks, tabs, history) in the Home Database Servers.
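
The three steps above can be sketched as a simple loop. To keep the example self-contained it uses Python's standard `queue.Queue` in place of RabbitMQ, and the `get_token`, `proxy_sync` and `store` callables are hypothetical stand-ins for the database lookup, the proxy call and the database write.

```python
import queue
from typing import Callable, Dict, List


def run_syncer(tasks: "queue.Queue[str]",
               get_token: Callable[[str], str],
               proxy_sync: Callable[[str, str], List[str]],
               store: Callable[[str, List[str]], None]) -> int:
    """Drain the task queue, syncing one user per task; return the count."""
    handled = 0
    while True:
        try:
            username = tasks.get_nowait()  # each task carries only the username
        except queue.Empty:
            return handled
        token = get_token(username)            # Select: proxy access token
        records = proxy_sync(username, token)  # sync() call / sync() return
        store(username, records)               # Insert/Update: sync data
        handled += 1


# Usage with dict-backed stand-ins for the database and the proxy.
tokens = {"alice": "t0k3n"}
db: Dict[str, List[str]] = {}
q: "queue.Queue[str]" = queue.Queue()
q.put("alice")
handled = run_syncer(q, tokens.__getitem__, lambda u, t: ["bookmark:" + u], db.__setitem__)
```

A real Syncer would block waiting for new tasks instead of returning when the queue is empty, and would need to decide what to do with a task whose sync fails.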

Stored Data:

None

Communication with Home Database Servers

{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''Select''
| Get User's Proxy Access Token
| (Access Token)
|
|-
| ''Insert/Update''
| Update the Sync Data
| (Bookmarks, History, Tabs)
|
|}

Communication with Crypto Proxy

{| class="wikitable"
|-
! Direction
! Message
! Data
! Notes
|-
| ''In:''
| sync() return
| unencrypted tabs/bookmarks/history
| Called to get access to user's sync data
|-
| ''Out:''
| sync() call
| access token + username
| Called to obtain access to sync data
|}

User Data Risk Minimization

In this section, the privacy champion will identify areas of user data risk and recommendations for minimizing the risk.

Alignment with Privacy Operating Principles

In this section, the privacy champion will identify how the feature lines up with Mozilla's privacy operating principles.

See Also: Privacy/Roadmap_2011#Operating_Principles:

Principle: Transparency / No Surprises: (How the feature addresses this)

Recommendations: (what can be improved)


Principle: Real Choice:

Recommendations:


Principle: Sensible Defaults:

Recommendations:


Principle: Limited Data:

Recommendations:


Follow-up Tasks and tracking

{| class="wikitable"
|-
! What
! Who
! Bug
! Details
|-
| [DONE] Initial Overview Discussion
| Stuart, rnewman, Stefan, Sid, Alex, secteam, infrasec
|
| Meeting: 26-April-2011
|-
| [ON TRACK] Finish documenting system, produce recommendations
| Sid, Home Team, Privacy
|
| In progress
|-
| [NEW] Discuss privacy recommendations
| Home team + Privacy
|
| Meeting time TBD
|}