Services/MessageQueuing: Difference between revisions

Latest revision as of 19:45, 24 January 2012

Redirect to:

Services/Sagrada/Queuey

@@ Line 1: / Line 1: @@
-= Overview =
+#REDIRECT[[Services/Sagrada/Queuey]]
-The '''Message Queue''' project is part of [[Services/Sagrada|Project Sagrada]]
-, providing a service for applications to queue messages for clients.
-= Project =
-== Engineers ==
-* Ben Bangert
-== User Requirements ==
-=== Phase 1 ===
-'''NOTE: This phase is being somewhat skipped at the moment to Phase 2'''
-The first version of the MQ will focus on providing a useful service for the
-[[Services/Notifications|Notifications project]] to enqueue messages for many
-queues and for clients to poll for messages on a given queue.
-No message aggregation is done in this version, if a client needs to get
-messages for more than one subscription it must poll each queue and aggregate
-them.
-The first version could be considered a '''Message Store''' rather than a
-''queue'' as it supports a much richer set of query semantics and does not let
-public consumers remove messages. Messages can be removed via the entire queue
-being deleted by the App or by expiring.
-Requirements:
-* Service App can create queues
-* Service App can add messages to queue
-* Messages on queues expire
-* Clients may read any queue they are aware of
-=== Phase 2 ===
-The second version allows authenticated applications to queue messages and for
-clients to consume them. This model allows for a worker model where jobs are
-added to a queue that multiple clients may be watching, and each message
-will be given to an individual client.
-Optionally, the client can 'reserve' a message to indicate it would like it,
-and if the client does not verify that it has successfully handled the message
-it will be put back in the queue or to a fail/retry queue if desired.
-Requirements:
-* Service App can create queues
-* Service App can add messages to queue
-* Authenticated clients may consume messages from a queue.
-* Authenticated clients may mark a message for consumption with a reservation TTL.
-== Architecture ==
-For scalability purposes, since some messages may be retained for periods of time
-per Phase 1 requirements, the initial choice of a backend is Cassandra. However,
-for Phase 2 requirements, Cassandra is missing the ability to manage and coordinate
-queue consumers so Zookeeper is being used for distributed synchronization.
-When used for Notifications, each user will have a single queue per Notification
-Application, this helps ensure that even for an individual user receiving many
-messages, they are partitioned by the Notification Application to ensure
-a more level distribution rather than a single deep queue.
-Under the Phase 2 use-case, it is most likely desired that massive amounts of
-messages may be intended for the same queue, which would normally result in a
-single extremely deep queue. Deep queue's do not scale well horizontally,
-especially when using Cassandra which maximizes through-put across many
-row-keys.
-The other issue is that to consume messages, there are only two effective ways
-of marking a message as consumed:
-. The message may be deleted after it has been sent. The only problem in
-this case is that there is still no guarantee it has been processed, and
-acquiring a lock to consume a message, and the resulting write-back to
-delete it is expensive.
-. One consumer per queue/partition. This requires some guess-work up-front
-about how many consumers will be around. Since consumers can consume from
-multiple queue/partition's at once, there needs to be at least as many
-partitions for a queue, as desired consumers. The advantage with this
-approach is that locking is only necessary when adding/removing consumers
-to ensure one consumer per partition.
-This MessageQueue project goes with the second option, and only incurs the
-lock during consumer addition/removal. The MessageQueue also does not track
-the state or last message read in the queue/partition's, it is the consumers
-responsibility to track how far it has read and processed successfully. There
-is an API call available to record with the MessageQueue how far a consumer
-has successfully processed in a queue/partition.
-When reading messages for a processing workload, they should be read in batches
-for performance to avoid network latency overhead.
-Since messages are stored and read by timestamp from Cassandra, and Cassandra
-only has eventual consistency, there is an enforced delay in how soon a message
-is availalable for reading to ensure a consistent picture of the queue. This
-is no less than 5 seconds after insertion, and at most about 15 seconds.
-When using queue's that are to be consumed, they must be declared up-front as
-a ''partioned'' queue. The amount of partitions should also be specified, and
-new messages will be randomly partioned. If messages should be processed in
-order, they can be inserted into a single partition to enforce ordering. All
-messages that are randomly partitioned should be considered loosely ordered.
-== Proposed API (Phase 1) ==
-For the first version, applications allowed to use the '''Message Queue'''
-will be given an application key that must be sent with every request, and
-their IP must be on an accepted IP list for the given application key.
-The application key must be sent as a HTTP header named 'ApplicationKey'.
-=== Internal Apps ===
-'''POST /queue'''
-    Creates a new queue, optionally takes the queue name as POST param ''queue_name''.
-'''DELETE /queue/{queue_name}'''
-    Deletes a given queue created by the App. When the param ''delete'' is set to ''false'', then
-    the queue contents will be deleted, but the queue will remain registered.
-'''POST /queue/{queue_name}'''
-    Create a message on the given queue. Contents is expected to be
-    a JSON object.
-    Raises an error if the queue does not exist.
-=== Clients ===
-'''GET /queue/{queue_name}'''
-    Returns all messages for the queue. Can optionally be passed
-    one of several query parameters:
-        ''since_timestamp'' - All messages newer than this timestamp, should be formatted as seconds since epoch in GMT
-        ''limit''           - Only return N amount of messages
-    Messages are returned in order of newest to oldest.
-== Use Cases ==
-=== Notifications ===
-The new version of [[Services/Notifications/Push|Notifications]] will use the
-MQ to store messages for users from various websites and allow users to poll
-the MQ directly for new notifications they have allowed.
-* Internal app add's messages to a queue by queue name
-* Internal app can delete queues by name
-* Devices can read a queue by name
-* Devices can read the X most recent messages
-* Devices can read messages since XXX
-* Messages are encoded as JSON