L10n:Pontoon/API: Difference between revisions
Revision as of 18:27, 30 August 2017
For now, this page serves as a scratchpad for documenting the research into different API solutions for Pontoon. Once one solution is chosen and implemented, this page will feature the documentation about this solution.
High-level Q3 2017 goal: Create an API endpoint supporting queries related to aggregate statistics per locale and per project
Discussion
See https://groups.google.com/forum/#!topic/mozilla.tools.l10n/R1S7Pk-c6uU for more discussion on this topic.
Roadmap
In Q3 2017, we'd like to make some data stored in Pontoon openly available to third parties. The main driver is the use case from bug 1302053:
- Stats for a locale: supported projects, status of each project.
- Stats for a project: supported locales, incomplete locales, complete locales.
In future iterations, more use cases can be supported:
- Exposing data which can be fetched by a SPA front-end
  - This will likely require pagination
- Getting the stream of notifications per authorized user
Technology
We'll be considering three solutions: REST, GraphQL and GraphQL with Relay.
REST
REST has been the de facto standard of API design for the last 10-15 years.
Pros
- Easy to implement thanks to the Django REST Framework project
- Browsable API: http://restframework.herokuapp.com/
- Familiar to the consumers of the API
- The developer has exact control over which fields and relations are exposed
Cons
- By default, all fields chosen by the developer are exposed and transferred, resulting in increased bandwidth usage
  - Work-arounds exist, e.g. &fields=foo,bar
- Only the relations expected by the developer can be queried in a single query, e.g. project/1/locales
  - Other relations require multiple requests, which can't be optimized
- Requires versioning and documentation
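The fields work-around mentioned above can be sketched in plain Python, independent of any framework. This is only an illustration: the function name filter_fields, the sample record and its keys are hypothetical, and a Django REST Framework implementation would more likely do this inside a serializer.

```python
from urllib.parse import parse_qs


def filter_fields(record, query_string):
    """Return only the keys named in the fields= query parameter.

    If the parameter is absent, the full record is returned, which mirrors
    the default REST behaviour of sending every exposed field.
    """
    params = parse_qs(query_string)
    if "fields" not in params:
        return dict(record)
    wanted = [name for name in params["fields"][0].split(",") if name]
    # Unknown field names are silently ignored in this sketch.
    return {name: record[name] for name in wanted if name in record}


# Hypothetical serialized project, for illustration only.
project = {"slug": "firefox", "name": "Firefox", "total_strings": 100}

print(filter_fields(project, "fields=slug,name"))
```

Note that the set of fields the consumer can choose from is still fixed by the developer; the parameter only trims the response.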
GraphQL
GraphQL is a query language in which the consumer describes the shape of the data they want back.
Pros
- Easy to learn syntax
- Documentation generated out-of-the-box
- GUI tool for browsing the API with a docs explorer (GraphiQL)
- The consumer specifies exactly which fields they're interested in
- A single query can span multiple types as long as they are connected in the graph
- graphene_django automates a lot of integration, including support for Enum types
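To illustrate what "the consumer specifies exactly which fields they're interested in" means in practice, here is a toy resolver in plain Python. It is not graphene and not a GraphQL parser: a nested dict stands in for an already parsed query, and the data, field names and the select helper are all hypothetical.

```python
def select(data, selection):
    """Shape `data` according to `selection`.

    `selection` maps field names to None (a scalar field) or to a nested
    selection (a related object or a list of related objects).
    """
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub_selection in selection.items():
        value = data[field]
        result[field] = value if sub_selection is None else select(value, sub_selection)
    return result


# Hypothetical data: projects connected to locales in the graph.
DATA = {
    "projects": [
        {
            "name": "Firefox",
            "slug": "firefox",
            "locales": [
                {"code": "sl", "name": "Slovenian"},
                {"code": "pl", "name": "Polish"},
            ],
        }
    ]
}

# Stands in for the query: { projects { name locales { code } } }
QUERY = {"projects": {"name": None, "locales": {"code": None}}}

print(select(DATA, QUERY))
```

The response contains only name for each project and code for each locale, even though the single query spans two connected types.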
Cons
- Circular queries are possible ({ projects { locales { projects } } })
  - In order to avoid them, we'd need to write code that inspects the query itself and checks that fields don't repeat deeper in the query tree
  - See https://github.com/graphql-python/graphene/issues/348#issuecomment-267717809 and https://github.com/graphql-python/graphene/issues/462#issuecomment-298218524
- Optimizations relying on prefetch_related can be brittle.
  - I'm still trying to understand exactly what happens.
  - The best place to optimize seems to be the top-level Query Type.
  - For instance, when querying a list of projects, I can call ProjectModel.objects.prefetch_related('project_locale__locale') in the top-level query in order to anticipate that the consumer will want to see the information about the related locales. In Django terms, this implies project.project_locale.all(), which means that I now have to use all() in resolve_locales in the Project GraphQL type. This in turn means that when asking for a single Project, I can't call prefetch_related in its resolve_locales. The work-around is to call prefetch_related in the top-level query for the single Project too.
  - The optimizations can be added dynamically depending on the exact query thanks to introspection. This is similar to the approach to preventing circular queries.
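Both ideas from the list above, rejecting circular queries and choosing optimizations based on the query, come down to walking the query tree. The sketch below shows the walk on a nested dict that stands in for a parsed query; it does not use graphene's introspection API. The helper names and the PREFETCH_LOOKUPS mapping are hypothetical; only the lookup string 'project_locale__locale' is taken from the notes above.

```python
def find_cycle(selection, ancestors=()):
    """Return the first path on which a field name repeats, or None.

    `selection` is a nested dict: field name -> sub-selection dict, or None
    for a scalar field. Only field names are compared in this sketch; a real
    check would probably compare the underlying types instead.
    """
    for field, sub_selection in selection.items():
        path = ancestors + (field,)
        if field in ancestors:
            return path
        if sub_selection:
            found = find_cycle(sub_selection, path)
            if found:
                return found
    return None


# Hypothetical mapping from a field on the Project type to the Django lookup
# that should be prefetched when that field is requested.
PREFETCH_LOOKUPS = {"locales": "project_locale__locale"}


def prefetch_lookups(project_selection):
    """List the lookups worth prefetching for the requested Project fields."""
    return [
        lookup
        for field, lookup in PREFETCH_LOOKUPS.items()
        if field in project_selection
    ]


# Stands in for { projects { locales { projects } } }
circular = {"projects": {"locales": {"projects": None}}}
# Stands in for { projects { name locales { name } } }
fine = {"projects": {"name": None, "locales": {"name": None}}}

print(find_cycle(circular))
print(find_cycle(fine))
print(prefetch_lookups(fine["projects"]))
```

The same field name on sibling branches (name on both projects and locales) is not flagged, because only ancestors on the current path are checked.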
GraphQL with Relay
Relay is a specification for cursor-based pagination which solves the problem of omitting items when switching between pages if items are being added quickly in real time to the DB. It works great for Facebook's use-case of showing a feed of news and updates.
Pros
- Pagination is guaranteed to not omit items which have been added to the DB while the user was looking at one page and then switched to another one
- Relay has good integration with React
- It's becoming a standard for pagination in GraphQL
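A minimal sketch of why a cursor is more stable than an offset when the data changes between two page loads. It is plain Python rather than Relay or graphene; the base64 cursor format, the helper names and the feed data are assumptions made for illustration (Relay only requires cursors to be opaque strings).

```python
import base64


def encode_cursor(item_id):
    """Wrap an item id in an opaque string."""
    return base64.b64encode(f"id:{item_id}".encode()).decode()


def decode_cursor(cursor):
    """Recover the item id from a cursor made by encode_cursor."""
    return int(base64.b64decode(cursor).decode().split(":", 1)[1])


def page(items, first, after=None):
    """Return up to `first` items that follow the `after` cursor.

    `items` is a newest-first list of dicts with a unique, increasing "id".
    """
    if after is not None:
        last_seen = decode_cursor(after)
        items = [item for item in items if item["id"] < last_seen]
    chunk = items[:first]
    end_cursor = encode_cursor(chunk[-1]["id"]) if chunk else None
    return chunk, end_cursor


feed = [{"id": i} for i in range(6, 0, -1)]  # ids 6..1, newest first
first_page, cursor = page(feed, first=3)

# Two new items arrive while the user is still reading the first page.
feed = [{"id": 8}, {"id": 7}] + feed

second_page, _ = page(feed, first=3, after=cursor)
offset_page = feed[3:6]  # what "skip 3, take 3" would return instead

print([item["id"] for item in second_page])
print([item["id"] for item in offset_page])
```

The cursor-based second page continues right after the last item the user saw, while the offset-based page is shifted by the two new items and repeats part of the first page. When items are removed instead of added, the same shift makes offset pagination skip items.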
Cons
- Pontoon's data (projects, locales, entities) doesn't change quickly enough to actually require a solution this powerful.
- Translations and suggestions may change more quickly, however.
- graphene_django doesn't handle ManyToMany fields well with Relay enabled; by default the through table adds another layer of edges to the graph, which becomes verbose very quickly
  - Suffers from the N+1 queries problem for ForeignKey and ManyToMany relationships
  - De-optimizes prefetch_related and select_related
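For readers unfamiliar with the N+1 queries problem, the sketch below reproduces it with the standard library's sqlite3 module instead of Django. The schema, table names and sample rows are hypothetical and only loosely modelled on the project, locale and through-table relationship discussed above. The second function fetches the related rows in one extra query and joins them in Python, which is similar in spirit to what prefetch_related does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE locale (id INTEGER PRIMARY KEY, code TEXT);
    CREATE TABLE project_locale (project_id INTEGER, locale_id INTEGER);
    INSERT INTO project VALUES (1, 'Firefox'), (2, 'Thunderbird'), (3, 'Pontoon');
    INSERT INTO locale VALUES (1, 'sl'), (2, 'pl');
    INSERT INTO project_locale VALUES (1, 1), (1, 2), (2, 1), (3, 2);
    """
)

executed = []  # every SQL statement sent through run()


def run(sql, params=()):
    """Execute a query, record it, and return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def locales_n_plus_one():
    """One query for the projects, then one more query per project."""
    result = {}
    for project_id, name in run("SELECT id, name FROM project ORDER BY id"):
        rows = run(
            "SELECT l.code FROM locale l "
            "JOIN project_locale pl ON pl.locale_id = l.id "
            "WHERE pl.project_id = ? ORDER BY l.code",
            (project_id,),
        )
        result[name] = [code for (code,) in rows]
    return result


def locales_prefetched():
    """Two queries in total; related rows are matched up in Python."""
    projects = run("SELECT id, name FROM project ORDER BY id")
    placeholders = ",".join("?" for _ in projects)
    rows = run(
        "SELECT pl.project_id, l.code FROM locale l "
        "JOIN project_locale pl ON pl.locale_id = l.id "
        f"WHERE pl.project_id IN ({placeholders}) ORDER BY l.code",
        [project_id for project_id, _ in projects],
    )
    names = dict(projects)
    result = {name: [] for name in names.values()}
    for project_id, code in rows:
        result[names[project_id]].append(code)
    return result


executed.clear()
slow = locales_n_plus_one()
slow_queries = len(executed)

executed.clear()
fast = locales_prefetched()
fast_queries = len(executed)

print(slow == fast, slow_queries, fast_queries)
```

With three projects, the first approach issues four queries and the second issues two, and the gap grows with the number of projects. "De-optimizing" prefetch_related means falling back from the second pattern to the first.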