675f6933ba
Change-Id: Ib33d0c65d025915142f4cee23e1f47a4b1f924bd
214 lines
8.9 KiB
ReStructuredText
214 lines
8.9 KiB
ReStructuredText
::
|
|
|
|
Copyright (c) 2014 Hewlett-Packard Development Company, L.P.
|
|
|
|
This work is licensed under a Creative Commons Attribution 3.0
|
|
Unported License.
|
|
http://creativecommons.org/licenses/by/3.0/legalcode
|
|
|
|
..
|
|
This template should be in ReSTructured text. Please do not delete
|
|
any of the sections in this template. If you have nothing to say
|
|
for a whole section, just write: "None". For help with syntax, see
|
|
http://sphinx-doc.org/rest.html To test out your formatting, see
|
|
http://www.tele3.cz/jbar/rest/rest.html
|
|
|
|
====================================
|
|
StoryBoard: Subscriptions and Events
|
|
====================================
|
|
|
|
https://storyboard.openstack.org/#!/story/96
|
|
|
|
StoryBoard needs to notify a user when changes occur to a resource which
|
|
they have decided to be notified about.
|
|
|
|
A key feature needed by all ticket tracking systems is the ability to
|
|
notify a user when topics which they care about have changed. A common way
|
|
to describe this is "Subscription", where a single user will ask to be
|
|
notified about certain types of changes for certain resources. More
|
|
complicated implementations include filtering, notification alert levels,
|
|
summaries of events that all impact the same resource,
|
|
or automatic notification based on inferred or calculated relevance.
|
|
|
|
Problem description
|
|
===================
|
|
|
|
In its simplest form, we would like each user to be able to indicate which
|
|
resources they are interested in, and be able to retrieve a date-sorted
|
|
list of which of those resources have changed recently. This will then be
|
|
shown in the UI as a list of events that occurred that are relevant to the
|
|
user. More advanced features would be the ability to filter these events,
|
|
or to receive notification of new events in "real time".
|
|
|
|
Requirements are as follows:
|
|
* A user should be able to manage their subscriptions
|
|
* Subscriptions should be as up-to-date as possible with as little data
|
|
loss as possible.
|
|
* A user should be able to subscribe to tasks, stories, projects,
|
|
and project groups.
|
|
* Subscriptions should have a minimal impact on API performance.
|
|
* When a subscription is added, all changes from that point forward should
|
|
be reported. Historical changes do not need to be generated.
|
|
* When a subscription is removed, a user's subscription list does not have
|
|
to be recalculated to extract no-longer-relevant events.
|
|
* StoryBoard should be able to run on a single server with no crazy
|
|
additional infrastructure required.
|
|
* Oslo libraries should be taken into account.
|
|
|
|
Subscriptions are the classic, personalized, pub-sub problem, where a user's
|
|
list of subscriptions can be large, and the matrix of events that could
|
|
cause a subscription to be notified can be complex and processor intensive.
|
|
Classically, there are four ways to handle this problem: Push, Pull,
|
|
Async, or JIT.
|
|
|
|
Push
|
|
----
|
|
The push approach assumes that you will notify all subscribers when the
|
|
event occurs. In the case of our API, this means that during a
|
|
PUT/POST/DELETE, all subscribers to that resource are checked to see
|
|
whether they need to be notified. In most small-scale system this works
|
|
fine, however as subscriptions increase this approach is generally not
|
|
stable, as the time required to process the request (and the number of
|
|
errors that could possibly be raised) will start to impact the client
|
|
making the API request. Things that would raise concerns are timeout,
|
|
as well as the number of points of failure, and as a result of that this is
|
|
not an appropriate approach.
|
|
|
|
Pull
|
|
----
|
|
The pull approach believes in restricting the processing load described
|
|
above to only those users who care enough to ask for subscription events.
|
|
In this case, a list is generated fresh every time GET /subscriptions (or
|
|
similar) is called. This approach is appropriate when usage is expected to
|
|
be low, similar to the generation of a report. Given that a subscription
|
|
list is a fairly frequently polled resource, this is not an appropriate
|
|
approach.
|
|
|
|
Async
|
|
-----
|
|
An asynchronous approach makes use of deferred, distributed processing to
|
|
"eventually" update a user's list of subscription events. Worker management
|
|
systems such as Gearman or Rabbit are notified whenever a resource changes
|
|
(likely via Pecan API hooks), and they 'eventually' ask a worker process to
|
|
go figure out which subscriptions need to be updated. Advantages of this
|
|
approach is that we can avoid the Python Global Interpreter Lock by having
|
|
separate worker processes, and any errors encountered during subscription
|
|
processing will be isolated and thus not impact the actual API request. The
|
|
challenge with this is that most queueing/worker management systems are
|
|
resource intensive (kafka), do not guarantee delivery (gearman),
|
|
or have known issues with split-brain clustering (rabbit). Any approach
|
|
will have to accommodate the chosen system.
|
|
|
|
Streaming
|
|
---------
|
|
A streaming approach begins by emitting an event whenever a resource
|
|
changes, and to notify all subscribers that are currently connected via a
|
|
socket. Persistence of events may be handled by creating
|
|
individual processes that listen to the stream and persist the received
|
|
data much like a subscribed client might update its UI. This approach
|
|
solves the real-time problem with a hot sexy technology, however coordinating
|
|
listeners and ensuring that persistence is handled properly raises the same
|
|
problems as the Async or Push/Pull problem. As a result this approach is
|
|
unnecessarily more complex than it needs to be.
|
|
|
|
|
|
Proposed change
|
|
===============
|
|
|
|
Summary
|
|
-------
|
|
StoryBoard will emit events whenever a resource changes. Since most
|
|
resources map directly to database changes, the majority of these changes can
|
|
be handled via Pecan post-request hooks.
|
|
|
|
Events, when emitted, will be written to a deferred processing queue. If
|
|
the queue is unavailable or misconfigured, a warning should be written to
|
|
the log, however the system should complete the original request normally
|
|
and discard the change event altogether.
|
|
|
|
The queuing system to be used is RabbitMQ, because it guarantees delivery
|
|
and recovers after a crash. This decision is based on the assumption that
|
|
the number of events dispatched by the database will not be sufficient to
|
|
require a full RabbitMQ cluster, which means we don't have to worry about
|
|
split-brain problems.
|
|
|
|
The StoryBoard server will spin up a series of processes that listen to the
|
|
event queue and perform actions based on the type of event received. In the
|
|
case of subscriptions, a process would read the event,
|
|
load the impacted resource and its change, search for any subscriptions to
|
|
the impacted resource, update each subscription's owners' subscription feed,
|
|
and then notify rabbit that the message has been received and processed. If
|
|
updating one subscription fails, the process should still attempt to
|
|
complete the other ones.
|
|
|
|
The number of events that are retained per user should be configurable by
|
|
age, with a default of 1 month.
|
|
|
|
A user's event feed should be retained in the database in its own table.
|
|
|
|
API: Subscriptions
|
|
------------------
|
|
StoryBoard will expose a new endpoint at /v1/users/ID/subscriptions. This
|
|
endpoint will support basic CRUD operations by which a user can manage
|
|
their subscriptions.
|
|
|
|
API: Feed
|
|
---------
|
|
StoryBoard will expose a new endpoint at /v1/users/ID/feed. This endpoint
|
|
will provide a list of events that have been emitted by a user's
|
|
subscriptions, sorted by date in descending order.
|
|
|
|
Alternatives
|
|
------------
|
|
Alternative approaches have been listed in the problem description.
|
|
Alternative queueing systems include ZeroMQ & gearman, which was disqualified
|
|
because it does not guarantee delivery, Kafka which was disqualified
|
|
because (anecdotally) it requires a cluster to perform properly.
|
|
|
|
Implementation
|
|
==============
|
|
|
|
Assignee(s)
|
|
-----------
|
|
|
|
Primary Assignee:
|
|
TBD
|
|
|
|
Work Items
|
|
----------
|
|
* Create an API to add subscriptions for projects, project groups, stories,
|
|
and tasks.
|
|
* Teach the storyboard-webclient to allow subscription on projects, project
|
|
groups, stories, and tasks.
|
|
* Install RabbitMQ on StoryBoard Server.
|
|
* Use Oslo.messaging to create an SQLAlchemy hook that broadcasts change
|
|
events for project groups, projects, stories, and tasks.
|
|
* Add configuration to StoryBoard for the AMQP connection string and
|
|
optionally an enabling flag for the whole feature.
|
|
* Create a storyboard-worker process that connects to AMQP and receives
|
|
messages for processing.
|
|
* Create a way for the storyboard-worker process to process lots of
|
|
different kinds of events (event hooks of some sort? processor factory?)
|
|
* Build a subscription event handler which is run by storyboard-worker and
|
|
updates a subscriber's feed.
|
|
* Create an API endpoint that exposes the feed.
|
|
* Teach the storyboard-webclient to display the feed.
|
|
|
|
Repositories
|
|
------------
|
|
No new repositories.
|
|
|
|
Servers
|
|
-------
|
|
No new servers. storyboard.openstack.org will need to have a running
|
|
RabbitMQ instance.
|
|
|
|
DNS Entries
|
|
-----------
|
|
No new DNS entries.
|
|
|
|
Dependencies
|
|
============
|
|
See above. Puppet module for storyboard will need to be updated. Additional
|
|
dependencies are on oslo.messaging, rabbitmq-server, upstart, etc.
|