Tuesday, February 28, 2023

How to perform rate limiting in GraphQL APIs by calculating query complexity


Rate limiting is the process of controlling the rate at which clients can access an API (Application Programming Interface). 

Overall, rate limiting is an essential technique for managing API usage: it ensures that an API remains available to all clients in a fair and secure manner, which makes it a core feature of modern software development, where APIs run the world!

Why rate limit APIs in my applications?

The main purpose of rate limiting is to prevent abusive or excessive usage of an API by limiting the number of requests made by a client within a specific period of time. 

But broadly the reasons for limiting API access rates generally fall into these categories:

  • Preventing Overload: Without rate limiting, an API can be overwhelmed by a large number of requests from a client. This can cause the API to slow down or even crash, affecting other clients that rely on it.

  • Ensuring Fair Usage: Rate limiting ensures that all clients have equal access to an API. This is important when the API has limited resources or when the resources are expensive to maintain.

  • Protecting Against Malicious Queries: Rate limiting can protect an API against DDoS (Distributed Denial of Service) attacks by limiting the number of requests a client can make within a specific period of time.

  • Protecting Your Intellectual Property: Scrapers can abuse your APIs to steal your data. Why let them?

  • Reducing Costs: APIs can incur costs based on the number of requests made by clients. By limiting the number of requests, rate limiting can help reduce these costs.

What are the traditional API rate limiting techniques?

The traditional API rate limiting techniques include:

  1. IP-based rate limiting: This technique limits the number of requests that can be made from a specific IP address within a certain period of time. This is a simple way to prevent abusive requests from a single IP address, but it may not be effective against distributed attacks or when multiple users share the same IP address.

  2. Token-based rate limiting: This technique limits the number of requests that can be made by a specific user or application within a certain period of time. Each user or application is assigned a unique token that is used to track their request rate.

  3. Quota-based rate limiting: This technique limits the total number of requests that can be made by all users or applications within a certain period of time. This can be useful for preventing overall API usage from exceeding a maximum number, but it may not be as effective for preventing abusive usage from individual users or applications.

  4. Time-based rate limiting: This technique limits the number of requests that can be made within a certain period of time, such as per second, per minute, or per hour. This is a simple and effective rate limiting technique, but does not take into account the specific needs of different users or applications.
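As a concrete illustration of time-based rate limiting (technique 4 above), here is a minimal fixed-window limiter sketch in Node.js. The class name, method names, and limits are illustrative assumptions, not a production implementation:

```javascript
// Minimal fixed-window rate limiter: allows `limit` requests per client
// per `windowMs` milliseconds. All names here are illustrative.
class FixedWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.hits = new Map(); // clientId -> { count, windowStart }
  }

  allow(clientId, now = Date.now()) {
    const entry = this.hits.get(clientId);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First request, or the previous window has expired: start a new window.
      this.hits.set(clientId, { count: 1, windowStart: now });
      return true;
    }
    if (entry.count < this.limit) {
      entry.count += 1;
      return true;
    }
    return false; // over the limit for this window
  }
}
```

The same fixed-window idea underpins IP-based and token-based limiting too; only the key used for `clientId` changes (an IP address versus an API token).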

Certain frameworks have their own built-in mechanisms for limiting request rates. For example, in Flutter apps rapid-fire user input is commonly throttled with a debounce helper before a request is ever sent.

Because GraphQL allows your users to access more complex data than a traditional REST API, these standard rate limiting techniques don't always work.

This is because different queries carry very different costs for you to access, organise and serve the underlying data to your users.

For instance, as Shopify notes:

POST, PUT, PATCH and DELETE requests produce side effects that demand more load on servers than GET requests, which only read existing data. Despite the difference in resource usage, all these requests consume the same amount of credits in the request-based model.

So is there a better way? Yes, it's called the Calculated Query Cost Method (CQCM).

What is the Calculated Query Cost Method for GraphQL API rate limiting?

The CQCM is a type of dynamic rate limiting. It's a technique that limits the rate of requests made to a GraphQL API based on the computational cost of each query.

In this method, each query is assigned a cost value based on the complexity of the query, such as the number of fields requested or the depth of the query. The cost of each query is then aggregated to calculate the total cost of the request. The API then compares the total cost to a predefined rate limit and either allows or denies the request based on the result.

The advantage of this method is that it allows for more fine-grained control over API usage, as it takes into account the actual computational cost of each request rather than simply counting the number of requests.

This can help prevent abusive or inefficient queries that consume a lot of resources, while still allowing clients to make a reasonable number of queries.
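The aggregation described above can be sketched in a few lines. A real implementation would walk the GraphQL AST (for example via graphql-js); the simplified node shape and per-kind costs below are illustrative assumptions:

```javascript
// Sketch of the calculated-query-cost idea: walk a (simplified) query tree,
// sum per-field costs, and reject the request if the total exceeds the
// client's remaining budget. Each node is assumed to look like:
//   { kind: 'scalar' | 'object' | 'connection', children: [...] }
const FIELD_COSTS = { scalar: 0, object: 1, connection: 2 };

function queryCost(node) {
  const own = FIELD_COSTS[node.kind] ?? 0;
  const children = (node.children || [])
    .reduce((sum, child) => sum + queryCost(child), 0);
  return own + children;
}

function shouldAllow(node, remainingBudget) {
  // Compare the aggregated cost of the whole query to the client's budget.
  return queryCost(node) <= remainingBudget;
}
```

The key point is that the decision is made on the summed cost of the whole query tree, not on a simple request counter.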

What are the difficulties with the Calculated Query Cost Method for a GraphQL API rate limit?

No one solution is a silver bullet to your API security, access control issues or cost minimization goals.

So naturally, the CQCM approach has some drawbacks:

  • Implementing this method can be complex, as it requires tracking the cost of each query and aggregating it to calculate the total cost.

  • Calculating the cost of a query can be subjective and may require some tuning to ensure that it accurately reflects the computational cost of the query.

  • Calculating the cost of a query requires additional processing, which can add overhead to the API and affect performance. This overhead may be significant in scenarios where queries are highly complex or the rate of requests is high.

  • Setting appropriate rate limits can be difficult, as it requires balancing the need to prevent abusive or inefficient queries with the need to allow legitimate queries.

How can I define the query cost for types based on the amount of data it requests?

It can be done by assigning a cost value to each field in a GraphQL schema based on the amount of data it requests.

This cost value is typically defined based on factors such as the complexity of the field, the number of database queries required to retrieve the data, or the amount of processing required to compute the field.

Let's classify the data into 5 types:

  1. Scalar & Enums - these are strings, integers, IDs, and booleans. Simple pieces of data that have little overhead.

  2. Objects - these usually consist of multiple scalars/enums and require a database query or a request to an internal service.

  3. Connections - a list of objects returned by your GraphQL API requests.

  4. Interfaces & unions - these are similar to objects, so treat them accordingly.

  5. Mutations - these not only return an object or connection, but also trigger a workflow which has a much higher cost because of the resource intensive nature of the query and the load placed on your servers.
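One way to encode these five categories in code is a simple cost table. The numeric values below are illustrative assumptions in line with the suggestions this article makes (scalars near 0, the object as a 5-point base unit, mutations at a multiple of an object); you would tune them against your own schema:

```javascript
// Illustrative per-kind cost table for the five data categories.
const TYPE_COSTS = {
  scalar: 0,     // strings, integers, IDs, booleans
  enum: 0,       // treated the same as scalars
  object: 5,     // the "base unit": typically one DB or service lookup
  interface: 5,  // interfaces & unions: treat like objects
  union: 5,
  mutation: 15,  // a multiple of an object, reflecting triggered workflows
};

function fieldCost(kind) {
  if (!(kind in TYPE_COSTS)) {
    throw new Error(`unknown field kind: ${kind}`);
  }
  return TYPE_COSTS[kind];
}
```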

What value should be assigned to a scalar/enum?

A simple scalar field may be assigned a cost value of 1 point, or even 0, because querying such types usually has very little cost associated with it.

What value should be assigned to objects and interfaces/unions?

An object will generally be your "base unit" and therefore you can assign it a cost of 5 points.

This reflects that objects are significantly more costly to access, transform and present than scalars and enums.

The following is an example of how Shopify costs their GraphQL API queries using the CQCM:

	
    query {
      shop {                  # Object  - 1 point
        id                    # ID      - 0 points
        name                  # String  - 0 points
        timezoneOffsetMinutes # Int     - 0 points
        customerAccounts      # Enum    - 0 points
      }
    }

What value should be assigned to a connection?

Because connections represent a one-to-many relationship in GraphQL, you should charge the cost per object plus a premium for accessing the connection itself.

For example, if your premium is 2 points and your request returns 4 objects, then your cost for this connection request is 22 points, which consists of:

  • 5 points per object = 5 x 4 = 20 points

  • 2 points for the connection premium
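That arithmetic can be captured in a one-line helper. The 5-point per-object cost and 2-point premium are the example values from above, exposed here as defaults:

```javascript
// Cost of a connection: per-object cost times the number of objects
// returned, plus a flat premium for accessing the connection itself.
// The default point values are the article's example figures.
function connectionCost(objectCount, objectCost = 5, premium = 2) {
  return objectCount * objectCost + premium;
}
```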

What value should be assigned to a mutation?

The added advantage of using the CQCM is that you can set more appropriate costs based on query depth, because you already know which other workflows will be triggered by the mutation requested.

Because of the server load generated by mutations, you should generally cost mutations at 2-5x the cost of an object.

Shopify, for example, places a 10x cost on their mutations, which shows that this is a really subjective area where you'll need to do your own tinkering and analysis.

Communicating query cost information in a GraphQL query

If you're implementing the CQCM in your application, then it makes sense to let your users know exactly where they stand and how much they have left in their points bank.

Otherwise your GraphQL rate limiting efforts may reduce your costs and the chance of a denial-of-service attack, but they will also end up frustrating your users.

You should include an extensions object in the responses returned for GraphQL requests.

You don't have to mention the individual costs of each component of the GraphQL API request (you could if you wanted to be that transparent), but it is logical to provide at least this data:

  1. Requested query cost - how much that GraphQL API request would have cost.

  2. Actual query cost - how much it actually cost the user.

  3. Maximum points available - how many total points a user can accumulate (this is usually capped at whatever you analyse to be a reasonable limit).

  4. Current points bank - how many points are available to them after running this query, to help users understand how many other queries they can make in the current rate limit window.

  5. Restore rate - how quickly the points are restored to the user's points bank.

Shopify presents this information in this format as part of one GraphQL query response:

	
    {
      "data": {
        "products": {
          "edges": [
            {
              "node": {
                "title": "Low inventory product"
              }
            }
          ]
        }
      },
      "extensions": {
        "cost": {
          "requestedQueryCost": 7,
          "actualQueryCost": 3,
          "throttleStatus": {
            "maximumAvailable": 1000.0,
            "currentlyAvailable": 997,
            "restoreRate": 50.0
          }
        }
      }
    }

What is the difference between requested and actual query cost?

The most common reason the actual query cost differs from the requested cost is that fewer records were returned from the GraphQL server than expected for that GraphQL query.

In such instances you should reimburse the user's points bank with the difference, and this schema allows you to communicate to your users that you are doing just that.
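A sketch of that reimbursement step, assuming the bank was debited the requested cost up front. The field names mirror Shopify's extensions object, but the function itself is illustrative:

```javascript
// Refund the difference between requested and actual query cost back to
// the client's points bank, capped at the maximum available points.
function settleCost(bank, requestedCost, actualCost, maxAvailable) {
  // The bank was already debited requestedCost; reimburse the difference
  // once the actual cost of serving the query is known.
  const refund = requestedCost - actualCost;
  return Math.min(bank + refund, maxAvailable);
}
```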

How can you measure the effectiveness of the calculated query cost model for rate limiting GraphQL API?

In essence the reason you are implementing the CQCM is to align GraphQL queries with query complexity.

So the best way to understand if your cost model is working is to map query execution time (y-axis) against the cost you place on your GraphQL queries (x-axis).

Shopify again presents this in a really useable and understandable format using a scatterplot like this:

[Scatterplot of query cost vs. execution time, from Shopify's "Rate Limiting GraphQL APIs by Calculating Query Complexity"]

Like the Shopify example above, you should see a strong trendline resembling a y = x line.

Don't forget to examine outlier points to understand what's happening, because that is how you can optimise your rate limits and cost model.
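To gather the data for such a scatterplot, you can record a (cost, execution time) pair for every query served and export the samples for plotting. The wrapper below is an illustrative sketch, not part of any GraphQL library:

```javascript
// Collect (cost, execution time) samples so query cost can be plotted
// against actual execution time. `executeQuery` stands in for whatever
// function actually resolves the GraphQL request.
const samples = [];

function measure(cost, executeQuery) {
  const start = process.hrtime.bigint();
  const result = executeQuery();
  const durationMs = Number(process.hrtime.bigint() - start) / 1e6;
  samples.push({ cost, durationMs });
  return result;
}
```

A well-calibrated cost model shows `durationMs` growing roughly linearly with `cost`; outlier samples point at fields whose cost is mispriced.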

How to test security of GraphQL APIs?

There are very few tools that allow you to protect your GraphQL APIs against hackers. However, Cyber Chief is one such tool that allows you to run regular vulnerability scans with a GraphQL vulnerability scanning tool.

Start your free trial of Cyber Chief now to see not only how it can help to keep attackers out, but also to see how you can ensure that you ship every release with zero known vulnerabilities. 

Or, if you prefer to have an expert-vetted vulnerability assessment performed on your Node.js application you can order the vulnerability assessment from here. Each vulnerability assessment report comes with:

  • Results from scanning your application for the presence of OWASP Top 10 + SANS CWE 25 + thousands of other vulnerabilities.
  • A detailed description of the vulnerabilities found.
  • A risk level for each vulnerability, so you know which GraphQL endpoints to fix first.
  • Best-practice fixes for each vulnerability, including code snippets where relevant.
  • One-month free access to our Cyber Chief API & web application security testing tool.
  • Email support from our application security experts.

Which option do you prefer?