
Hosted Raindrop Setup

This document describes the current state of the hosted Raindrop setup. It runs on Amazon EC2 for convenience, but doesn't actually rely on anything Amazon-specific, like S3.

System Classes

The entire system is composed of different system classes, each handling a specific set of tasks and running on its own instances. Each of these classes currently runs on a single EC2 instance, but each can scale independently. Keeping them at 1 is sufficient for our current load, simplifies things, and keeps running costs low.

Proxy

The proxies are what end users' browsers talk to directly. They are responsible for caching and for routing requests to the right place.

On these, there are a few important services running:

Pound

Pound is an SSL reverse proxy / load balancer, used in our case for SSL termination. SSL requests are received by Pound, which terminates them and blindly proxies the decrypted requests down to Varnish.

pound.cfg

The only extra thing it does is set an X-SSL: On header on HTTP requests, so layers further down can determine whether a request arrived via SSL.
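The actual config is the pound.cfg attached above; the relevant part looks roughly like the following sketch (addresses, ports, and the certificate path are illustrative, not the real values):

```
ListenHTTPS
    Address 0.0.0.0
    Port    443
    Cert    "/etc/pound/raindrop.pem"
    # mark the request as having arrived over SSL before handing it to Varnish
    AddHeader "X-SSL: On"
    Service
        BackEnd
            # Varnish, listening on the same box
            Address 127.0.0.1
            Port    6081
        End
    End
End
```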

Varnish

Varnish is a blazing fast caching proxy, in place so we can serve cached content to end users instead of relying on the browser's cache all the time. For instance, API requests will be cached by Varnish for a little while, so even if a user shift-reloads, we won't take the performance hit more than once every n seconds/minutes.

Varnish does a couple of simple things:

  • it sanitizes requests, rejecting bogus/invalid ones and the like.
  • we use the user's login cookie to store separate copies of the same URLs per user; otherwise, we'd be leaking data between users.
  • CouchDB adds a Cache-Control: must-revalidate header to all responses, and we strip that from what we return to the client. Varnish will always revalidate, but browsers will not, which improves our warm-cache behaviour.
  • Varnish catches 500/503 errors and retries the request a fixed number of times. If an API call fails, we transparently retry it before returning an error to the client, so transient errors are handled invisibly to the end user.
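The deployed varnish.vcl is attached below; a hypothetical sketch of the same logic, written here in VCL 4.x syntax (the retry count and the exact cookie handling are assumptions):

```
vcl 4.0;

sub vcl_hash {
    # key the cache on the login cookie too, so each user gets
    # their own copy of the same URL and no data leaks between users
    hash_data(req.http.Cookie);
}

sub vcl_backend_response {
    # transparently retry transient backend errors a few times
    if (beresp.status == 500 || beresp.status == 503) {
        if (bereq.retries < 3) {
            return (retry);
        }
    }
}

sub vcl_deliver {
    # CouchDB adds Cache-Control: must-revalidate; strip it so browsers
    # don't revalidate every time (Varnish itself still will)
    unset resp.http.Cache-Control;
}
```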

varnish.vcl

/etc/sysconfig/varnish

Apache/mod_proxy/mod_perl

This is the routing layer itself. It's composed of Apache/mod_proxy, which handles the actual proxying of requests (usually to CouchDB), and a small layer of mod_perl logic that handles authentication and the routing/dispatching itself.

We use Apache2::AuthenOpenID to handle the actual authentication. Once a user is OpenID-authenticated, their identity is tracked with a cookie.

Once a user has been authenticated, we use the OpenID itself to retrieve information about that particular user (really, about their Raindrop instance).

That information is currently stored in a CouchDB database on the proxy machine itself, with a very simple user schema:

  {
      "backends": [
          "http://domU-AB-CD-EF-00-11-22.compute-1.internal:5984/test-80db93"
      ],
      "current": 0,
      "active": true,
      "openids": [
          "www.google.com/accounts/o8/id?id=XXX",
          "www.google.com/accounts/o8/id?id=YYY",
          "www.google.com/accounts/o8/id?id=ZZZ"
      ],
      "e-mails": ["animalyouth@gmail.com"],
      "gearmans": [
          "domU-ab-cd-ef-01-23-45.compute-1.internal:4730",
          "domU-ab-cd-ef-01-23-78.compute-1.internal:4730"
      ]
  }

  • backends is the list of CouchDB servers holding that instance's Raindrop data. There is currently no replication, but this is where the list of replicas would be held
  • current is the index into the backends array of the current master CouchDB instance
  • active is a global flag that lets me completely cut one instance off from the outside world
  • openids is the array of OpenIDs allowed to access that instance
  • e-mails isn't used for anything, but is a handy way for me to know which instance belongs to whom
  • gearmans is the array of Gearman servers to use for API calls (explained later)
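Given such a user document, the proxy's backend selection reduces to a few lines. A sketch in Python (pick_backend is a hypothetical name; the field names match the schema above):

```python
def pick_backend(userinfo):
    """Return the CouchDB base URL to proxy this user's requests to,
    or None if the instance has been switched off via the active flag."""
    if not userinfo.get("active"):
        return None
    # "current" indexes into "backends" to select the master instance
    return userinfo["backends"][userinfo["current"]]

# illustrative user document with two (hypothetical) backends
userinfo = {
    "backends": ["http://couch-a.internal:5984/raindrop-1",
                 "http://couch-b.internal:5984/raindrop-1"],
    "current": 1,
    "active": True,
}
print(pick_backend(userinfo))  # -> http://couch-b.internal:5984/raindrop-1
```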

The routing code basically looks up the userinfo document for the logged-in user's OpenID, examines the request, and picks what to do from:

  • Ordinary requests just end up being proxied to their couchdb database
  • /_utils/ requests end up being proxied to Futon
  • API requests are identified and dispatched via Gearman
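The actual dispatch lives in the mod_perl layer; the decision itself can be sketched in Python like this (function and parameter names are hypothetical, and the logic that identifies an API request is not shown):

```python
def route(path, is_api_request):
    """Decide which subsystem handles a request, per the rules above."""
    if path.startswith("/_utils/"):
        return "futon"     # CouchDB's web admin UI
    if is_api_request:
        return "gearman"   # dispatched as a Gearman job
    return "couchdb"       # ordinary request: proxy to the user's database
```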

The code is:

Processor

This is the part of the system that isn't directly involved in the normal web request flow. It's responsible for running background tasks:

  • run-raindrop.py sync-messages (on a schedule)
  • run-raindrop.py pipeline (continuous)
  • compaction (once a day)
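A hypothetical schedule for the periodic tasks, in /etc/crontab form (times, the user name, and the database name are illustrative; the continuous pipeline task would run under a process supervisor rather than cron):

```
# sync new messages every 5 minutes
*/5 * * * *  raindrop  run-raindrop.py sync-messages
# compact the CouchDB database once a day, off-peak
30 4 * * *   raindrop  curl -s -X POST -H "Content-Type: application/json" http://localhost:5984/DBNAME/_compact
```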

Gearman

Gearman is a work queue and dispatch system. It is used to handle API requests.

The front-ends transform API requests into Gearman jobs, sending them for processing to the Gearman servers configured for the user's instance.

On the Gearman instances, multiple copies of raindrop-apirunner.py are running, waiting for incoming API requests. They process the queries and send back a JSON payload, which the front-end turns back into a proper HTTP response for the client.

When looking at the HTTP headers sent back from API requests, there is a convenient extra header added by the front-end, X-Gearman-Elapsed, that indicates how long it actually took to get the Gearman response for that particular API call.
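The front-end side of this round trip can be sketched as follows. This is a hypothetical illustration, not the real mod_perl code: submit_job stands in for the Gearman client call, and the function and header-formatting details are assumptions (only the X-Gearman-Elapsed header name comes from the setup above):

```python
import json
import time

def call_api_via_gearman(submit_job, task, params):
    """Send an API request through Gearman and time the round trip.

    submit_job is a callable standing in for a Gearman client: it takes
    a task name and a serialized payload and returns a JSON string
    produced by raindrop-apirunner.py on the worker side."""
    start = time.time()
    raw = submit_job(task, json.dumps(params))
    elapsed = time.time() - start
    headers = {
        "Content-Type": "application/json",
        # expose the Gearman round-trip time for debugging
        "X-Gearman-Elapsed": "%.3f" % elapsed,
    }
    return headers, json.loads(raw)

# usage with a fake in-process "worker" instead of a real Gearman server
fake = lambda task, payload: json.dumps({"task": task, "ok": True})
headers, body = call_api_via_gearman(fake, "some/api/endpoint", {"limit": 10})
```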

raindrop-apirunner.py has been modified to become a Gearman worker; these changes haven't made it back to Mercurial yet.

CouchDB

Our CouchDB instances currently run on OpenSolaris, for its excellent performance and traceability. This is the simplest class: all it runs is CouchDB, with one database per Raindrop instance.

In the future, there will be multiple such instances (for scaling and redundancy) and replication between them for high availability.