BMO/ChangeNotificationSystem: Difference between revisions


Revision as of 22:30, 11 November 2013

Team

BMO team (dkl, glob, mcote), ebryn (contract developer on front end), peterbe (advisor from web engineering)

Problem

With a focus on Bugzilla as a platform, facilitating responsive, JavaScript-based front ends, detecting changes to bugs in a timely fashion is increasingly important. At the moment, the only way to determine if a bug has been recently updated is to poll; there is no push mechanism of any kind. This model has inherent problems, including but not limited to scalability (opening and closing connections is costly) and performance (polling must go through Bugzilla's permission system and other logic layers).

Goals & Considerations

Provide a push-based notification system to inform clients of changes to bugs. Plan for scalability by minimizing server load and time from change to notification.

The notification itself need not include details of what has changed, although this information should preferably be available somehow, perhaps via a separate call.

Non-Goals

Re-implementing Bugzilla's permissions system is not an option. It is complex enough that changing the current model would be major surgery (and would further diverge BMO from upstream), and maintaining two parallel implementations would incur maintenance costs and be error-prone.

Notifications should not include changes to dependent/related bugs. These are harder to track given the current Bugzilla database schema, and a properly designed system should be able to track them independently at the client's discretion.

Design and Approach

We will implement a separate server that polls the database directly for changes to bugs and passes on *only the ID and change time* of the changed bug to its clients. The clients can then use the main Bugzilla REST API to determine the exact changes, which will enforce permissions as usual.
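To make this concrete, a notification might look like the following minimal sketch. The JSON shape and field names here are assumptions for illustration only; no wire format has been specified yet.

```python
import json

# Hypothetical notification payload: only the bug ID and change time.
# (Field names are assumptions; no wire format has been specified yet.)
notification = json.dumps({"bug_id": 923849, "when": "2013-11-11T22:30:00Z"})

# On receipt, a client parses the message and asks the normal Bugzilla
# REST API what actually changed; that request goes through Bugzilla's
# usual permission checks.
msg = json.loads(notification)
rest_url = "https://bugzilla.mozilla.org/rest/bug/%d" % msg["bug_id"]
```

Because the payload carries no bug content, all access control stays in the existing REST API.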

Clients will be able to subscribe to one or more bugs, and will be able to get a list of their subscriptions and unsubscribe from them. Subscriptions will only be valid for the current session; if a client disconnects, they will have to resubscribe to their bugs.
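A session-scoped subscription registry on the server side could be as simple as the sketch below. The class and method names are invented for illustration; the actual subscribe/unsubscribe/list commands on the wire are still to be decided.

```python
class SubscriptionRegistry:
    """Session-scoped bug subscriptions (illustrative sketch only)."""

    def __init__(self):
        self._subs = {}  # client/session ID -> set of bug IDs

    def subscribe(self, client_id, bug_id):
        self._subs.setdefault(client_id, set()).add(bug_id)

    def unsubscribe(self, client_id, bug_id):
        self._subs.get(client_id, set()).discard(bug_id)

    def subscriptions(self, client_id):
        # "List subscriptions" command: return this session's bug IDs.
        return sorted(self._subs.get(client_id, set()))

    def drop_session(self, client_id):
        # Subscriptions do not survive a disconnect; the client
        # must resubscribe on its next session.
        self._subs.pop(client_id, None)

    def clients_for(self, bug_id):
        # Fan-out: which connected clients want this bug's notifications?
        return [c for c, bugs in self._subs.items() if bug_id in bugs]


reg = SubscriptionRegistry()
reg.subscribe("client-a", 100)
reg.subscribe("client-a", 200)
reg.subscribe("client-b", 200)
```

Keeping the registry purely in memory matches the session-only lifetime of subscriptions: a server restart or client disconnect simply clears the relevant entries.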

Important note: this whole design relies upon the idea that knowing the ID of a changed bug is not a security risk. This should be reasonable, given that the only information that is conveyed is that some bug has changed. One could use this information to determine the frequency of changes to a particular bug over some time frame, and hence perhaps an increased interest in a particular bug, but the changes could be to anything--main bug fields, comments, tracking flags, dependencies, etc.

Conceptually, there are three parts:

  • Database poller. There would be exactly one process that frequently polls the database for changes (period TBD but on the order of seconds, not minutes). This would keep the time of the last poll in memory and would ask for only the ID of all bugs changed since the time of the last poll.
    • We will use a separate table to hold the notifications, which will be written by a Bugzilla extension called PushNotify. The poller will read entries from this table and publish messages on Pulse.
  • TCP servers. There would be one or more processes acting as servers that accept client connections and maintain them indefinitely. WebSockets is the preferred protocol for easy integration with browsers. These servers would listen for notifications from the database poller and fan out notification messages to clients according to their subscriptions. For scalability, multiple server processes could be launched with a load-balancer (such as Zeus) spreading connections amongst them. Note that each server will receive notifications for *all* bugs, since we don't expect there to be many servers; it is up to the server to implement the subscription protocol and keep track of subscriptions.
  • Messaging middleware. We are using Pulse for a couple of reasons: it's already set up and easily extended, and we can eventually support a variety of transports/servers.
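The poller's query against the notifications table can be sketched as follows, using an in-memory SQLite stand-in. The table and column names are assumptions (the real table would be written by the PushNotify extension), and publishing each result to Pulse is elided.

```python
import sqlite3

# Stand-in for the notifications table written by the PushNotify extension.
# (Table and column names are assumptions for illustration.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notifications (bug_id INTEGER, delta_ts TEXT)")
conn.executemany(
    "INSERT INTO notifications VALUES (?, ?)",
    [(1, "2013-11-11 22:29:00"), (2, "2013-11-11 22:30:05")],
)

def poll_once(conn, last_poll):
    """Return only the ID and change time of bugs changed since last_poll.

    The single poller process keeps last_poll in memory and advances it
    after each pass; each returned row would become one Pulse message.
    """
    return conn.execute(
        "SELECT bug_id, delta_ts FROM notifications WHERE delta_ts > ?",
        (last_poll,),
    ).fetchall()

changed = poll_once(conn, "2013-11-11 22:30:00")
```

Because only IDs and timestamps flow through this path, the poller never needs to consult Bugzilla's permission system.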

Implementation

Work on the prototype is being tracked in bug 923849.