Services/F1/Server/ServicesStatusDB

= Services Status DB =
= Goal =


When a third-party service such as Twitter goes down or becomes very slow, clients retry and send more and more requests to our servers, which can overload our infrastructure and make it unresponsive.
The goal of the Services Status DB is to give every web server in our infrastructure the status of every third-party service. A web server can then decide to back off a request when the service is down and ask the client to retry later.
= Principle =
1. On every request, the client adds an X-Target-Service header containing the domain of the service it wants to reach.
  For example, if the client wants to share on Twitter, it adds "X-Target-Service: twitter.com".
2. The web server (Nginx) that receives the request asks the Services DB for the status of the service (as described later) and decides whether the request should go through.
3. If the request is rejected, the client receives a 503 response with a Retry-After header and must wait for the indicated time before retrying.
4. If the request is accepted, it is passed to the upstream server (Python), which does the job.
5. If the upstream server succeeds, it asynchronously notifies the Services DB.
6. If the upstream server fails, e.g. because the third-party service is considered down, it asynchronously notifies the Services DB.
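The flow above can be sketched in Python as follows. This is only an illustration, not the actual Nginx or server code: the ServicesDB class, the min_ratio threshold, the fixed retry_after value, the 400 response for a missing header, and the 503 returned when the upstream call fails are all assumptions made for the example. Notifications are made synchronously here for simplicity, whereas the design calls for asynchronous ones.

```python
from typing import Callable, Dict, Tuple


class ServicesDB:
    """In-memory stand-in for the Services Status DB (hypothetical API)."""

    def __init__(self, min_ratio: float = 0.5, retry_after: int = 60) -> None:
        self.min_ratio = min_ratio      # assumed success-ratio threshold
        self.retry_after = retry_after  # assumed back-off, in seconds
        self._counts: Dict[str, Tuple[int, int]] = {}  # service -> (ok, failed)

    def notify(self, service: str, success: bool) -> None:
        """Record the outcome of one upstream call (steps 5 and 6)."""
        ok, failed = self._counts.get(service, (0, 0))
        if success:
            self._counts[service] = (ok + 1, failed)
        else:
            self._counts[service] = (ok, failed + 1)

    def is_available(self, service: str) -> bool:
        """A service with no history is assumed to be up."""
        ok, failed = self._counts.get(service, (0, 0))
        total = ok + failed
        return total == 0 or ok / total >= self.min_ratio


def handle_request(
    db: ServicesDB,
    headers: Dict[str, str],
    upstream: Callable[[str], None],
) -> Tuple[int, Dict[str, str]]:
    """Return (status code, response headers) for one client request."""
    service = headers.get("X-Target-Service")
    if service is None:
        return 400, {}  # assumption: the header is mandatory
    if not db.is_available(service):
        # Step 3: reject and tell the client when to retry.
        return 503, {"Retry-After": str(db.retry_after)}
    try:
        upstream(service)  # step 4: the upstream server does the job
    except Exception:
        db.notify(service, success=False)  # step 6
        return 503, {"Retry-After": str(db.retry_after)}
    db.notify(service, success=True)  # step 5
    return 200, {}
```

Once enough failures have been recorded for a service, handle_request rejects further requests for it without calling the upstream server at all, which is the load-shedding behavior described in the Goal section.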
= Client UX on outage =
= Database =
The DB is a key/value store that holds, for each service:
* a status ratio
* a disabled flag
* a retry-after value
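As an illustration, one possible shape for a per-service record is sketched below, assuming each record holds a status ratio, a disabled flag, and a retry-after value, keyed by the service domain. The field names, the JSON encoding, and the plain dict standing in for the key/value store are assumptions for the example, not the actual storage format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class ServiceRecord:
    status_ratio: float  # assumed: share of recent successful calls, 0.0 to 1.0
    disabled: bool       # assumed: manual switch to turn the service off
    retry_after: int     # assumed: seconds the client should wait


def encode(record: ServiceRecord) -> str:
    """Serialize a record to a string value for the key/value store."""
    return json.dumps(asdict(record), sort_keys=True)


def decode(raw: str) -> ServiceRecord:
    """Rebuild a record from its stored string value."""
    return ServiceRecord(**json.loads(raw))


# A dict stands in for the real key/value store; the key is the service domain.
store: Dict[str, str] = {}
store["twitter.com"] = encode(ServiceRecord(0.25, False, 120))
```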

