CloudServices/Sync/FxSync/StoreRedesign: Difference between revisions

Line 3: Line 3:
== Proposal ==


Reduce to two classes: ''Repository'' and ''Synchronizer'':
Reduce to two main classes: ''Repository'' and ''Synchronizer'':
* ''Repositories'' are both sinks and sources of records
* ''Synchronizers'' exist (from one perspective, at least) to reify the relationship between two Repositories (including tracking <tt>lastSync</tt>), which in practice means connecting a pair of repositories as source then sink in turn.
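
The source-then-sink relationship described above can be sketched roughly as follows. This is a minimal illustration only, not the code in the linked branch: <tt>MemoryRepository</tt>, <tt>flow</tt>, and <tt>synchronize</tt> are hypothetical names, the repositories are plain in-memory maps, and callbacks fire synchronously to keep the example short.

```javascript
// Hypothetical in-memory repository; all names here are illustrative only.
class MemoryRepository {
  constructor() {
    this.records = new Map(); // guid -> { guid, modified, payload }
  }
  // Invoke callback(err, recs) with the records modified after `timestamp`.
  fetchSince(timestamp, callback) {
    const recs = [...this.records.values()].filter(r => r.modified > timestamp);
    callback(null, recs);
  }
  // Apply records, keeping whichever copy of a GUID is newer.
  store(recs, callback) {
    for (const rec of recs) {
      const existing = this.records.get(rec.guid);
      if (!existing || existing.modified < rec.modified) {
        this.records.set(rec.guid, rec);
      }
    }
    callback(null);
  }
}

// Reifies the relationship between two repositories, including lastSync.
class Synchronizer {
  constructor(repoA, repoB) {
    this.repoA = repoA;
    this.repoB = repoB;
    this.lastSync = 0;
  }
  // Pipe everything modified after `since` from source into sink.
  flow(source, sink, since, done) {
    source.fetchSince(since, (err, recs) => {
      if (err) return done(err);
      sink.store(recs, done);
    });
  }
  // Use A as source and B as sink, then swap the roles.
  // Records echoed back in the second pass are skipped by store(),
  // because their modified time is not newer than the existing copy.
  synchronize(now, done) {
    const since = this.lastSync;
    this.flow(this.repoA, this.repoB, since, err => {
      if (err) return done(err);
      this.flow(this.repoB, this.repoA, since, err2 => {
        if (err2) return done(err2);
        this.lastSync = now;
        done(null);
      });
    });
  }
}

// Usage: one record on each side; after syncing, both sides hold both.
const local = new MemoryRepository();
const remote = new MemoryRepository();
local.records.set("aaa", { guid: "aaa", modified: 10, payload: "bookmark" });
remote.records.set("bbb", { guid: "bbb", modified: 20, payload: "history" });
const synchronizer = new Synchronizer(local, remote);
synchronizer.synchronize(30, err => { if (err) throw err; });
```

Because both repositories expose the same two methods, the Synchronizer needs no knowledge of whether either side is local or remote.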


Repositories are entirely responsible for providing a timestamp- and record-centric API over a source of data. This interface abstracts the tracking of changed items and application and retrieval of records, and is uniform across both remote and local storage. For example, we would build a ''FxBookmarkRepository'' layer above Places, a ''ServerRepository'' layer in front of the v5/v1.1 Sync API, and connect the two with a simple Synchronizer. Both Repository implementations would implement ''exactly the same interface'', which would allow us to trivially implement:
These actions actually take place via ''RepositorySession''s and ''SynchronizerSession''s, which represent single sync events.
 
Repository(Session)s are entirely responsible for providing a timestamp- and record-centric API over a source of data. This interface abstracts the tracking of changed items and application and retrieval of records, and is uniform across both remote and local storage. For example, we would build a ''FxBookmarkRepository'' layer above Places, a ''ServerRepository'' layer in front of the v5/v1.1 Sync API, and connect the two with a simple Synchronizer. Both Repository implementations would implement ''exactly the same interface'', which would allow us to trivially implement:


* Direct sync between two devices, without a server intermediary
Line 23: Line 25:
== API ==


The API for Repository is defined in terms of callbacks, each of which can be called multiple times (e.g., as batches of records arrive), and iterable sequences. A callback invocation takes one or more errors, and optionally one or more records, as input. Each invocation can be provided with a <tt>DONE</tt> constant to indicate completion, and can return a <tt>STOP</tt> constant to (optionally) prevent further cycles.
The API for Repository/RepositorySession is defined in terms of callbacks, each of which can be called multiple times (e.g., as batches of records arrive). A callback invocation takes an error argument, and optionally one or more records, as input. Each invocation can be provided with a <tt>DONE</tt> constant to indicate completion. Callbacks can invoke an <tt>abort</tt> method on the session to (optionally) prevent further cycles.
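
As a rough sketch of the callback pattern in the newer wording, the example below shows a callback that is invoked once per batch, receives <tt>DONE</tt> on completion, and can call <tt>abort</tt> on the session. Everything named here is an assumption for illustration: <tt>BatchingSession</tt> is hypothetical, <tt>DONE</tt> is assumed to arrive in the error position, and whether <tt>DONE</tt> is still delivered after an abort is not specified on this page (the sketch skips it).

```javascript
// Assumption: DONE is a sentinel passed in the error position on completion.
const DONE = { toString: () => "DONE" };

// Hypothetical session that delivers records in fixed-size batches.
class BatchingSession {
  constructor(records, batchSize) {
    this.records = records;
    this.batchSize = batchSize;
    this.aborted = false;
  }
  // Callbacks may call this to prevent further invocations.
  abort() {
    this.aborted = true;
  }
  // Invoke callback(err, recs) once per batch, then once with DONE.
  fetchSince(timestamp, callback) {
    const changed = this.records.filter(r => r.modified > timestamp);
    for (let i = 0; i < changed.length; i += this.batchSize) {
      if (this.aborted) return; // no further batches after an abort
      callback(null, changed.slice(i, i + this.batchSize));
    }
    if (!this.aborted) callback(DONE);
  }
}

const sampleRecords = [1, 2, 3, 4, 5].map(n => ({ guid: "g" + n, modified: n }));

// A callback that consumes every batch until DONE arrives.
const session = new BatchingSession(sampleRecords, 2);
const seen = [];
let finished = false;
session.fetchSince(0, (err, recs) => {
  if (err === DONE) { finished = true; return; }
  if (err) { session.abort(); return; }
  for (const rec of recs) seen.push(rec.guid);
});

// A callback that aborts after the first batch it receives.
const abortingSession = new BatchingSession(sampleRecords, 2);
const limited = [];
abortingSession.fetchSince(0, (err, recs) => {
  if (err === DONE) return;
  for (const rec of recs) limited.push(rec.guid);
  abortingSession.abort();
});
```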


The Repository API consists of three methods and two callbacks:
Classes are (links will rot):


=== Callbacks ===
* [https://github.com/rnewman/alder/blob/repositories/services/sync/modules/repository.js#L54 Repository]
** [https://github.com/rnewman/alder/blob/repositories/services/sync/modules/repository.js#L223 Server11Repository]
** [https://github.com/rnewman/alder/blob/repositories/services/sync/modules/repository.js#L552 Crypto5Middleware]


* fetchCallback(errs, recs):
''createSession'' returns (via a callback) a ''RepositorySession'':
** Invoke with <tt>errs</tt> = <tt>null</tt>, an error, or an iterable of errors. Each error has a <tt>.guid</tt> attribute. <tt>recs</tt> = <tt>null</tt>, a record, or an iterable of records. <tt>errs</tt> and <tt>recs</tt> cannot both be provided.
** Invoked one or more times as required.
** Return <tt>STOP</tt> if an error has occurred which means that no further records should be provided.
** Pass <tt>DONE</tt> as the <tt>errs</tt> input on completion.


* guidsCallback(errs, guids):
* [https://github.com/rnewman/alder/blob/repositories/services/sync/modules/repository.js#L93 RepositorySession]
** Invoke exactly once with <tt>errs</tt> = <tt>null</tt>, an error, or an iterable of errors; <tt>guids</tt> = <tt>null</tt> or an array of GUIDs.
** [https://github.com/rnewman/alder/blob/repositories/services/sync/modules/repository.js#L264 Server11Session]
** [https://github.com/rnewman/alder/blob/repositories/services/sync/modules/repository.js#L638 Crypto5StoreSession]
** [https://github.com/rnewman/alder/blob/repositories/services/sync/modules/repository.js#L733 TrackingSession]


* storeCallback(errs):
Synchronizer holds two Repositories, creating sessions appropriately.
** Invoke with <tt>errs</tt> = <tt>null</tt>, an error, or an iterable of errors. Each error has a <tt>.guid</tt> attribute.


=== Methods ===
Synchronizer classes are:


* fetch(guids, fetchCallback):
* [https://github.com/rnewman/alder/blob/repositories/services/sync/modules/synchronizer.js#L46 SynchronizerSession]
** Retrieve a sequence of records by GUID. <tt>guids</tt> should be an iterable. Invoke the callback, as with <tt>fetchSince</tt>.
* [https://github.com/rnewman/alder/blob/repositories/services/sync/modules/synchronizer.js#L271 Synchronizer]


* fetchSince(timestamp, fetchCallback):
** Retrieve a sequence of records that have been modified since <tt>timestamp</tt>. Invoke the callback with one or more records each time, or <tt>DONE</tt> when complete.


* guidsSince(timestamp, guidsCallback):
** Retrieve a sequence of GUIDs corresponding to records that have been modified since <tt>timestamp</tt>.


* store(recs, storeCallback):
** <tt>recs</tt>: as for <tt>fetchCallback</tt>. The output of <tt>fetchSince</tt> is likely piped into this method.
** Optionally returns <tt>STOP</tt>. (Return rather than callback to allow synchronous handoff to <tt>fetchSince</tt>.)
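
To illustrate the earlier callback-and-return design, the sketch below pipes <tt>fetchSince</tt> into <tt>store</tt> and uses a synchronous <tt>STOP</tt> return to halt the fetch. It is a sketch under stated assumptions, not an implementation of any real repository: <tt>Signals</tt>, <tt>ListSource</tt>, and <tt>CappedSink</tt> are hypothetical names, and skipping <tt>DONE</tt> after a <tt>STOP</tt> is a choice made here, not something this page specifies.

```javascript
// Hypothetical sentinels standing in for the DONE and STOP constants.
const Signals = { DONE: Symbol("DONE"), STOP: Symbol("STOP") };

// Hypothetical source: delivers one record per fetchCallback invocation.
class ListSource {
  constructor(records) {
    this.records = records;
  }
  fetchSince(timestamp, fetchCallback) {
    for (const rec of this.records) {
      if (rec.modified <= timestamp) continue;
      // A STOP return value means no further records should be provided.
      if (fetchCallback(null, rec) === Signals.STOP) return;
    }
    fetchCallback(Signals.DONE, null);
  }
  // Invokes guidsCallback exactly once with an array of GUIDs.
  guidsSince(timestamp, guidsCallback) {
    const guids = this.records
      .filter(r => r.modified > timestamp)
      .map(r => r.guid);
    guidsCallback(null, guids);
  }
}

// Hypothetical sink that accepts at most `capacity` records.
class CappedSink {
  constructor(capacity) {
    this.capacity = capacity;
    this.stored = [];
  }
  store(rec, storeCallback) {
    if (this.stored.length >= this.capacity) {
      const err = new Error("sink is full");
      err.guid = rec.guid; // each error carries the GUID it relates to
      storeCallback(err);
      return Signals.STOP; // returned synchronously so the fetch can halt
    }
    this.stored.push(rec);
    storeCallback(null);
    return undefined;
  }
}

// Usage: pipe fetchSince straight into store, propagating STOP.
const source = new ListSource([
  { guid: "a", modified: 5 },
  { guid: "b", modified: 15 },
  { guid: "c", modified: 25 },
  { guid: "d", modified: 35 },
]);
const sink = new CappedSink(2);
const failures = [];
source.fetchSince(10, (errs, recs) => {
  if (errs === Signals.DONE) return undefined;
  return sink.store(recs, err => { if (err) failures.push(err.guid); });
});
```

Returning <tt>STOP</tt> from <tt>store</tt> lets the fetch callback hand it straight back to <tt>fetchSince</tt>, which is the synchronous handoff the parenthetical above refers to.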


=== Notifications ===
* Q: Where does batching happen? A: Within each RepositorySession implementation.
 
???
 
== Questions and considerations ==


* Where does batching happen?
* What are the patterns around multiple invocation of callbacks?
* How much can we unify/parameterize stores? I can easily imagine a single V5HTTPStore class that is simply constructed with a collection URL…
* Does the callback/return combination work in practice, given that we'll be chaining callbacks and invoking async methods at each end?
* As much storage as possible should be pushed down into the data store. That should allow these classes to be effectively stateless; the two query inputs are modified time and GUID. Tracking should be reduced to deleted items, and even that can be elided.
* I'm very intrigued by deltas. It should be possible to stage rollout through an appropriate wrapper store.