The rise and fall of GraphQL at sennder
We started using GraphQL in the sennder engineering team in late 2016. Three years later, we have decided to stop using it.
Why we chose GraphQL in 2016
When we introduced GraphQL at sennder about three years ago, we had been running into performance problems with our REST API: When querying resources and related resources with REST, we needed to make a high number of requests even for simple data needs. This is a basic trade-off in any API where the response structure of endpoints is static. You either need a high number of endpoints for all the different kinds of data needs, or you need a high number of requests to piece together the data on the front end, or you accept that you are significantly over-fetching data via very verbose endpoints.
GraphQL promised to side-step this conundrum: We only had to define a single GraphQL “node” for each resource, and then the front end could combine these nodes into a single query depending on its exact data needs.
It worked great.
The shortcomings of GraphQL
But then the company grew, and our business got more complex.
The amount of data in our database increased by multiple orders of magnitude. And with more employees, we needed to refine our access control mechanisms.
Thus, we ran into the two biggest shortcomings of GraphQL: Query performance and access control.
GraphQL queries tend to translate to inefficient database queries. By definition, the back end does not know which relations the front end might want to follow. Any naive implementation of translating the GraphQL query into database queries runs into the N+1 query problem: one query fetches a list of N resources, and then one additional query per resource fetches its related data. This can quickly cause performance issues, in particular when working with an ORM. Therefore, most optimization approaches try to dynamically analyze each GraphQL query. This is difficult, especially when filters are applied on the edges between resources.
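The N+1 pattern can be illustrated with a minimal sketch. It is not our actual resolver code: the `orders` and `addresses` tables, the `CountingDB` wrapper, and both resolver functions are hypothetical names made up for this example, and an in-memory SQLite database stands in for a real one.

```python
import sqlite3


def make_db():
    # Hypothetical schema: every order references exactly one address.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE addresses (id INTEGER PRIMARY KEY, city TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, address_id INTEGER);
        INSERT INTO addresses VALUES (1, 'Berlin'), (2, 'Hamburg'), (3, 'Munich');
        INSERT INTO orders VALUES (10, 1), (11, 2), (12, 3);
    """)
    return conn


class CountingDB:
    """Thin wrapper that counts how many SQL statements are issued."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def resolve_naive(db):
    # Resolve the list first, then follow the relation once per row:
    # 1 query for the orders + N queries for their addresses.
    orders = db.query("SELECT id, address_id FROM orders ORDER BY id")
    result = []
    for order_id, address_id in orders:
        [(city,)] = db.query(
            "SELECT city FROM addresses WHERE id = ?", (address_id,)
        )
        result.append({"id": order_id, "address": {"city": city}})
    return result


def resolve_joined(db):
    # The same data in a single query, which requires knowing up front
    # that the caller wants to follow the order -> address relation.
    rows = db.query(
        "SELECT o.id, a.city FROM orders o "
        "JOIN addresses a ON a.id = o.address_id ORDER BY o.id"
    )
    return [{"id": oid, "address": {"city": city}} for oid, city in rows]


naive_db = CountingDB(make_db())
joined_db = CountingDB(make_db())
assert resolve_naive(naive_db) == resolve_joined(joined_db)
print(naive_db.queries, joined_db.queries)  # 4 1
```

With three orders, the naive resolver issues four queries where the join issues one. The gap grows linearly with the number of rows and multiplies with every additional level of nesting in the query.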
The other problem with GraphQL is even harder to solve: Access control. Since the API user sends the complete GraphQL query to a single endpoint, this endpoint can only check if the user has access to the API as a whole. Detailed access control has to be enforced when resolving relationships within the GraphQL query: When going from entity A to entity B, the API needs to check if the logged-in user is allowed to access entity B from entity A.
This is tough, because just knowing resources A and B often isn’t enough information to decide if the user should have access. Consider an “address” resource that is referenced by an “order” resource. What if a user should be able to see an address only if it belongs to a company that is referenced in an order that is associated with this user? Encoding such complex conditions on the resource level quickly turns into a significant performance problem in itself.
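To make the address example concrete, here is a rough sketch of such a rule. The `Address` and `Order` classes, their fields, and `can_see_address` are all hypothetical and only illustrate the shape of the check, not our real data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    id: int
    company_id: int


@dataclass(frozen=True)
class Order:
    id: int
    company_id: int
    user_ids: frozenset  # users associated with this order


def can_see_address(user_id, address, orders):
    # The address alone does not tell us whether the user may see it.
    # We have to search for an order that links this user to the
    # company that owns the address.
    return any(
        user_id in order.user_ids and order.company_id == address.company_id
        for order in orders
    )


orders = [Order(1, company_id=7, user_ids=frozenset({42}))]
print(can_see_address(42, Address(100, company_id=7), orders))  # True
print(can_see_address(43, Address(100, company_id=7), orders))  # False
```

Note that the check needs a third data set (the orders) in addition to the user and the address. If it runs once per resolved address inside a nested query, every permission check becomes another database lookup.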
What comes after GraphQL?
We realized that there are more aspects to good API design than we had originally considered and created a new, extended list of requirements:
- Easy to reason about and define access control
- Easy to optimize database queries
- Front end is decoupled from back end
- Small number of API requests
- Concise API in terms of lines of code
- Ability to abstract API away from database structure
After weighing our options, we decided to switch to a command-query style API. It resembles a REST API in some aspects, but it distinguishes between commands, which change state, and queries, which only read it. It uses a one-endpoint-per-business-use-case strategy instead of the one-endpoint-per-resource approach in REST. This implies an increased number of endpoints and lines of code for us, which we accept as a necessary trade-off.
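As a rough illustration of the one-endpoint-per-business-use-case idea, the sketch below registers one handler per use case and keeps commands and queries in separate registries. Everything in it is an assumption made for the example: the `Store` class, the `query` and `command` decorators, and the use-case names `orders-for-user` and `assign-carrier` are hypothetical and do not describe our production API.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    # order_id -> {"owner": user name, "carrier": carrier name or None}
    orders: dict = field(default_factory=dict)


QUERIES = {}   # read-only use cases
COMMANDS = {}  # state-changing use cases


def query(name):
    def register(handler):
        QUERIES[name] = handler
        return handler
    return register


def command(name):
    def register(handler):
        COMMANDS[name] = handler
        return handler
    return register


@query("orders-for-user")
def orders_for_user(store, user):
    # Access control is part of the use case itself: a user only ever
    # receives their own orders, so no per-relation check is needed.
    return sorted(oid for oid, order in store.orders.items()
                  if order["owner"] == user)


@command("assign-carrier")
def assign_carrier(store, user, order_id, carrier):
    order = store.orders.get(order_id)
    if order is None or order["owner"] != user:
        raise PermissionError("user may not modify this order")
    order["carrier"] = carrier


store = Store({
    1: {"owner": "alice", "carrier": None},
    2: {"owner": "bob", "carrier": None},
})
COMMANDS["assign-carrier"](store, "alice", 1, "ACME Trucking")
print(QUERIES["orders-for-user"](store, "alice"))  # [1]
```

Because each handler serves exactly one use case, its access rule and its data access can be written and optimized by hand. The price is one handler per use case, which is the increase in endpoints and lines of code mentioned above.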