L10n:Pontoon/API

From MozillaWiki
Revision as of 18:27, 30 August 2017 by StasM

For now, this page serves as a scratchpad for documenting the research into different API solutions for Pontoon. Once one solution is chosen and implemented, this page will hold the documentation for that solution.

High-level Q3 2017 goal: Create an API endpoint supporting queries related to aggregate statistics per locale and per project

Discussion

See https://groups.google.com/forum/#!topic/mozilla.tools.l10n/R1S7Pk-c6uU for more discussion on this topic.

Roadmap

In Q3 2017, we'd like to make some of the data stored in Pontoon openly available to third parties. The main driver is the use case from bug 1302053:

  • Stats for a locale: supported projects, status of each project.
  • Stats for a project: supported locales, incomplete locales, complete locales.

In future iterations, more use cases can be supported:

  • Exposing data which can be fetched by a SPA front-end
    • This will likely require pagination
  • Getting the stream of notifications per authorized user

Technology

We'll be considering three solutions: REST, GraphQL and GraphQL with Relay.

REST

REST has been the de facto standard of API design for the last 10-15 years.

Pros

Cons

  • By default, every field the developer chose to expose is transferred with each response, which increases bandwidth
    • Work-arounds exist, e.g. &fields=foo,bar
  • Only the relations expected by the developer can be queried in a single query, e.g. project/1/locales
    • Other relations require multiple requests, which can't be optimized
  • Requires versioning and documentation
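
The &fields=foo,bar work-around mentioned above can be sketched in plain Python. This is only an illustration under stated assumptions: the record, its field names, the URL and the sparse_fields helper are all hypothetical and do not reflect Pontoon's actual data model or any existing endpoint.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical full representation of a project, as a REST endpoint might return it.
PROJECT = {
    "id": 1,
    "name": "Example Project",
    "slug": "example",
    "total_strings": 1000,
    "approved_strings": 900,
}


def sparse_fields(record, url):
    """Return only the fields listed in the URL's ?fields= parameter.

    Without the parameter, every field is returned (the REST default).
    Unknown field names are silently ignored in this sketch.
    """
    query = parse_qs(urlparse(url).query)
    if "fields" not in query:
        return dict(record)
    wanted = query["fields"][0].split(",")
    return {name: record[name] for name in wanted if name in record}


full = sparse_fields(PROJECT, "/api/v1/project/1")
small = sparse_fields(PROJECT, "/api/v1/project/1?fields=name,approved_strings")
```

Here full carries all five fields, while small carries only name and approved_strings. The filtering still has to be implemented and documented by the API developer for each endpoint.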


GraphQL

GraphQL is a query language in which the consumer describes the shape of the data they want back.

Pros

  • Easy to learn syntax
  • Documentation generated out-of-the-box
  • GUI tool for browsing the API with a docs explorer (GraphiQL)
  • The consumer specifies exactly which fields they're interested in
  • A single query can span multiple types as long as they are connected in the graph
  • graphene_django automates a lot of integration, including support for Enum types
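
To make the field-selection and graph-traversal points concrete, here is a toy projection in plain Python. It does not use graphene_django or any GraphQL library; the data, the SELECTION structure and the select helper are hypothetical stand-ins for what a GraphQL server does when it resolves a query.

```python
# Hypothetical data: one project connected to two locales.
DATA = {
    "project": {
        "name": "Example Project",
        "slug": "example",
        "locales": [
            {"code": "sl", "name": "Slovenian", "approved_strings": 90},
            {"code": "pl", "name": "Polish", "approved_strings": 75},
        ],
    }
}

# Stands in for the GraphQL query:
#   { project { name locales { code } } }
# A leaf field maps to None; a nested selection maps to another dict.
SELECTION = {"project": {"name": None, "locales": {"code": None}}}


def select(value, selection):
    """Keep only the requested fields, following nested selections."""
    if isinstance(value, list):
        return [select(item, selection) for item in value]
    result = {}
    for field, sub in selection.items():
        result[field] = value[field] if sub is None else select(value[field], sub)
    return result


response = select(DATA, SELECTION)
```

The response contains the project's name and each locale's code and nothing else, even though the query crosses from the project type to the locale type in a single request.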

Cons

  • Circular queries are possible ({ projects { locales { projects } } })
  • Optimizations relying on prefetch_related can be brittle.
    • I'm still trying to understand exactly what happens.
    • The best place to optimize seems to be the top-level Query Type.
    • For instance, when querying a list of projects, I can call ProjectModel.objects.prefetch_related('project_locale__locale') in the top-level query, anticipating that the consumer will want information about the related locales. In Django terms, the prefetched data is only reused when it is accessed through project.project_locale.all(), so resolve_locales in the Project GraphQL type now has to use all(). That in turn means that when a single Project is requested, resolve_locales can't call prefetch_related itself. The work-around is to call prefetch_related in the top-level query for the single Project too.
    • The optimizations can be added dynamically, depending on the exact query, thanks to introspection. This is similar to the approach for preventing circular queries.
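
One simple guard against circular queries such as the one in the first bullet is to reject any query nested beyond a fixed depth. The sketch below counts brace nesting in the raw query text, which is an assumption made to keep it dependency-free; a real implementation would more likely walk the parsed query, and braces inside string arguments would be miscounted here. MAX_DEPTH, query_depth and is_allowed are hypothetical names.

```python
MAX_DEPTH = 3  # hypothetical limit; tune to the deepest legitimate query


def query_depth(query):
    """Return the deepest level of selection-set nesting in a query string."""
    depth = deepest = 0
    for char in query:
        if char == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif char == "}":
            depth -= 1
    return deepest


def is_allowed(query, max_depth=MAX_DEPTH):
    """Accept the query only if it does not nest deeper than max_depth."""
    return query_depth(query) <= max_depth


ok = "{ projects { locales { code } } }"
circular = "{ projects { locales { projects { name } } } }"
```

With the limit above, is_allowed(ok) is true and is_allowed(circular) is false, because the second query goes from projects to locales and back to projects, adding a fourth level.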

GraphQL with Relay

Relay is a specification for cursor-based pagination. It solves the problem of items being omitted when switching between pages while new items are being added to the DB in real time. It works well for Facebook's use case of showing a feed of news and updates.

Pros

  • Pagination is guaranteed to not omit items which have been added to the DB while the user was looking at one page and then switched to another one
  • Relay has good integration with React
  • It's becoming a standard for pagination in GraphQL
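
The difference between cursor-based and offset-based pagination can be sketched in plain Python. The first and after argument names follow the Relay connection convention; everything else (the integer ids, the cursor format, the helper names) is a hypothetical simplification and not how any particular library encodes cursors.

```python
import base64


def encode_cursor(item_id):
    """Wrap an id in an opaque cursor string (hypothetical format)."""
    return base64.b64encode(f"id:{item_id}".encode()).decode()


def decode_cursor(cursor):
    """Recover the id from a cursor produced by encode_cursor."""
    return int(base64.b64decode(cursor).decode().split(":", 1)[1])


def page(ids, first, after=None):
    """Return up to `first` ids that come after the cursor, newest first."""
    ordered = sorted(ids, reverse=True)
    if after is not None:
        last_seen = decode_cursor(after)
        ordered = [i for i in ordered if i < last_seen]
    items = ordered[:first]
    end_cursor = encode_cursor(items[-1]) if items else None
    return items, end_cursor


feed = [1, 2, 3, 4, 5]
page_one, cursor = page(feed, first=2)             # newest two items
feed.append(6)                                     # a new item arrives
page_two, _ = page(feed, first=2, after=cursor)    # continues after the cursor
offset_page_two = sorted(feed, reverse=True)[2:4]  # offset-based equivalent
```

After the new item arrives, the cursor-based second page continues exactly where the first page ended, while the offset-based second page has shifted by one position and no longer lines up with what the user already saw.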

Cons