Rate Limit

There are two ways of applying a rate limit using the Client.

  1. Set the prefetch argument in the Client class constructor. It defaults to 1,000 requests.

  2. Set the prefetch argument when calling the post() method. If it is not provided, the default value of 1,000 requests is used. The method argument overrides the value provided in the Client class constructor.
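The precedence between the two settings can be sketched in plain Python. This is only an illustration of the rule described above, not the library's code: resolve_prefetch is a hypothetical helper, and it assumes that the constructor value applies whenever post() receives no value of its own.

```python
from typing import Optional

DEFAULT_PREFETCH = 1000  # default stated in the text above


def resolve_prefetch(client_prefetch: Optional[int] = None,
                     post_prefetch: Optional[int] = None) -> int:
    """Hypothetical helper: pick the limit that applies to one post() call."""
    if post_prefetch is not None:
        # The method argument overrides the constructor argument.
        return post_prefetch
    if client_prefetch is not None:
        # Assumption: the constructor value is used when post() gets none.
        return client_prefetch
    return DEFAULT_PREFETCH


print(resolve_prefetch())                                     # neither set
print(resolve_prefetch(client_prefetch=50))                   # constructor only
print(resolve_prefetch(client_prefetch=50, post_prefetch=5))  # method wins
```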

The prefetch argument controls the number of in-flight requests made by the post() method. Using the default value might overload the Gateway or Executor, especially if the operating characteristics of the Deployment or Flow are unknown. Furthermore, the Client can send various types of requests, which can vary in resource usage.
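To make the idea of capping in-flight requests concrete, here is a minimal conceptual sketch using asyncio.Semaphore from the Python standard library. It does not show how the Client implements prefetch internally. The names send_all and fake_request are hypothetical, and the sleep stands in for a real network round trip.

```python
import asyncio


async def send_all(num_requests: int, prefetch: int) -> int:
    """Send fake requests, never more than `prefetch` at once.

    Returns the peak number of requests that were in flight together.
    """
    slots = asyncio.Semaphore(prefetch)
    in_flight = 0
    peak = 0

    async def fake_request() -> None:
        nonlocal in_flight, peak
        async with slots:              # wait until a slot is free
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for network and server time
            in_flight -= 1

    await asyncio.gather(*(fake_request() for _ in range(num_requests)))
    return peak


peak = asyncio.run(send_all(num_requests=20, prefetch=5))
print(peak)  # never exceeds the prefetch value passed in
```

A smaller cap means later requests wait on the client side instead of piling up on the server, which is the trade-off the prefetch argument exposes.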

For example, a large number of index requests can each carry a large data payload that requires heavy input/output. This increases CPU consumption and eventually leads to a buildup of requests on the Flow. If the queue of in-flight requests is already large, a very lightweight search request, such as one that returns the total number of Documents in the index, might be blocked until the queue of index requests has been completely processed. To prevent such a scenario, set the prefetch value on the post() method to limit the rate of requests for expensive operations.

Apply the prefetch argument on the post() method to dynamically improve server responsiveness for customer-facing requests, which require fast response times, compared with background requests such as cron jobs or analytics requests, which can be processed more slowly.

from jina import Client

client = Client()

# uses the default limit of 1,000 requests
search_responses = client.post(...)

# sets a hard limit of 5 in-flight requests
index_responses = client.post(..., prefetch=5)

A rate limit can also be set on the Gateway using the prefetch option in the Flow. This option, however, is a global rate limit and cannot be customized per request workload. The prefetch argument for the Client serves as a class-level rate limit for all requests made from that client. The prefetch argument for the post() method serves as a method-level rate limit that overrides the values set at the Client and the Flow.