Kubernetes Support#

Jina is a cloud-native framework and therefore runs natively and easily on Kubernetes. Deploying a Jina Deployment or Flow on Kubernetes is the recommended way to use Jina in production.

Deployments and Flows are services composed of one or more microservices, called Executors and Gateways, which natively run in containers. This means that Kubernetes can take over the lifecycle management of Executors.

Deploying a Deployment or Flow on Kubernetes means wrapping the containers of these services in the appropriate K8s abstraction (Deployment, StatefulSet, and so on), exposing them internally via a K8s Service, and connecting them together by passing the right set of parameters.


This documentation is designed for users who want to manually deploy a Jina project on Kubernetes.

Check out Jina AI Cloud Hosting if you want a one-click solution to deploy and host Jina, leveraging a cloud-native stack of Kubernetes, Prometheus and Grafana, without worrying about provisioning.

Automatically translate a Deployment or Flow to Kubernetes concepts#


Manually building these Kubernetes YAML objects is long and cumbersome. Therefore we provide a helper function, to_kubernetes_yaml(), that does most of this translation work automatically.

This helper function can be called from:

  • Jina’s Python interface to translate a Flow defined in Python to K8s YAML files

  • Jina’s CLI interface to export a YAML Flow to K8s YAML files
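As a minimal sketch of the Python route, the helper below calls to_kubernetes_yaml() on a Flow (or Deployment) and lists the files it wrote. The function name export_to_k8s, the output directory, the namespace and the image name are hypothetical, and the file extension of the generated manifests is an assumption, so the glob accepts both .yml and .yaml.

```python
from pathlib import Path


def export_to_k8s(flow, output_dir, namespace="default"):
    """Export a Flow or Deployment to K8s YAML and return the generated file paths."""
    # to_kubernetes_yaml() writes the manifests under output_dir.
    flow.to_kubernetes_yaml(output_dir, k8s_namespace=namespace)
    # Collect whatever YAML files were produced, relative to output_dir.
    return sorted(
        p.relative_to(output_dir).as_posix()
        for p in Path(output_dir).rglob("*.y*ml")
    )


# Usage (requires Jina to be installed; the image name is a placeholder):
# from jina import Flow
# flow = Flow(port=8080).add(name="encoder", uses="docker://my-registry/encoder:latest")
# print(export_to_k8s(flow, "./k8s_flow", namespace="my-namespace"))
```

The generated directory can then be reviewed, edited and applied with the usual Kubernetes tooling.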

Extra Kubernetes options#

In general, Jina follows a single principle when it comes to deploying in Kubernetes: You, the user, know your use case and requirements the best. This means that, while Jina generates configurations for you that run out of the box, as a professional user you should always see them as just a starting point to get you off the ground.


The export functions Deployment.to_kubernetes_yaml() and Flow.to_kubernetes_yaml() are helper functions to get you started. They are meant to be updated and adapted to every use case.

Matching Jina versions

If you change the Docker images for Executor and Gateway in your Kubernetes-generated file, ensure that all of them are built with the same Jina version to guarantee compatibility.

You can’t add basic Kubernetes features like Secrets, ConfigMap or Labels via the Pythonic or YAML interface. This is intentional and doesn’t mean that we don’t support these features. On the contrary, we let you fully express your Kubernetes configuration by using the Kubernetes API to add your own Kubernetes standard to Jina.


We recommend you dump the Kubernetes configuration files and then edit them to suit your needs.

Here are possible configuration options you may need to add or change:

  • Add label selectors to the Deployments to suit your case

  • Add requests and limits for the resources of the different Pods

  • Set up persistent volume storage to save your data on disk

  • Pass custom configuration to your Executor with ConfigMap

  • Manage the credentials of your Executor with Kubernetes Secrets. You can use f.add(..., env_from_secret={'SECRET_PASSWORD': {'name': 'mysecret', 'key': 'password'}}) to map them to Pod environment variables

  • Edit the default rolling update configuration
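Some of the edits listed above can be scripted instead of made by hand. The sketch below patches a Deployment manifest, represented as a plain Python dict, with extra labels and resource requests and limits. The function patch_deployment and the sample manifest are illustrative assumptions, not Jina output; the selector is deliberately left as generated.

```python
import copy


def patch_deployment(manifest, labels=None, requests=None, limits=None):
    """Return a copy of a Deployment manifest with extra labels and resource settings."""
    patched = copy.deepcopy(manifest)
    if patched.get("kind") != "Deployment":
        return patched  # leave Services, ConfigMaps, etc. untouched

    template = patched["spec"]["template"]
    if labels:
        # Label the Deployment object and its Pods; the selector stays as generated.
        patched.setdefault("metadata", {}).setdefault("labels", {}).update(labels)
        template.setdefault("metadata", {}).setdefault("labels", {}).update(labels)

    for container in template["spec"].get("containers", []):
        resources = container.setdefault("resources", {})
        if requests:
            resources.setdefault("requests", {}).update(requests)
        if limits:
            resources.setdefault("limits", {}).update(limits)
    return patched


# A trimmed-down stand-in for one generated manifest (hypothetical, not actual Jina output).
manifest = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "encoder"},
    "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {"app": "encoder"}},
        "template": {
            "metadata": {"labels": {"app": "encoder"}},
            "spec": {"containers": [{"name": "executor", "image": "my-registry/encoder:latest"}]},
        },
    },
}

patched = patch_deployment(
    manifest,
    labels={"team": "search"},
    requests={"cpu": "500m", "memory": "1Gi"},
    limits={"memory": "2Gi"},
)
print(patched["spec"]["template"]["spec"]["containers"][0]["resources"])
```

In practice you would load each generated file with a YAML library (for example PyYAML's yaml.safe_load_all), apply the patch to every document, and write the result back before running kubectl apply.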

Required service mesh#


A service mesh must be installed and correctly configured in the K8s cluster in which you deploy your Flow.

Service meshes work by attaching a tiny proxy to each of your Kubernetes Pods, allowing for smart rerouting, load balancing, request retrying, and a host of other features.

Jina relies on a service mesh to load balance requests between replicas of the same Executor. You can use your favourite Kubernetes service mesh in combination with your Jina services, but the configuration files generated by to_kubernetes_yaml() already include all necessary annotations for the Linkerd service mesh.


You can use any service mesh with Jina, but Jina Kubernetes configurations come with Linkerd annotations out of the box.

To use Linkerd, follow the "Install the Linkerd CLI" guide.


Many service meshes can perform retries themselves. Be careful about setting up service-mesh-level retries in combination with Jina, as they may interact with Jina’s own retry policy in unwanted ways.

Instead, you can disable Jina-level retries by setting Flow(retries=0) in Python, or retries: 0 in the Flow YAML’s with block.

Scaling Executors: Replicas and shards#

Jina supports two types of scaling:

  • Replicas can be used with any Executor type and are typically used for performance and availability.

  • Shards are used for partitioning data and should only be used with indexers since they store state.

Check here for more information about these scaling mechanisms.

For shards, Jina creates one separate Kubernetes Deployment per shard. Setting Deployment(..., shards=num_shards) is sufficient to create a corresponding Kubernetes configuration.

For replicas, Jina uses Kubernetes native replica scaling and relies on a service mesh to load-balance requests between replicas of the same Executor. Without a service mesh installed in your Kubernetes cluster, all traffic will be routed to the same replica.
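The two scaling modes can be combined in one Flow. The following is a minimal sketch, assuming Jina is installed; the Executor names, port and image names are placeholders, and build_scaled_flow is a hypothetical helper.

```python
def build_scaled_flow():
    """Build a Flow with a replicated encoder and a sharded indexer."""
    # Imported lazily so that this sketch can be loaded without Jina installed.
    from jina import Flow

    return (
        Flow(port=8080)
        # Replicas: stateless copies of the same Executor, for throughput and availability.
        .add(name="encoder", uses="docker://my-registry/encoder:latest", replicas=3)
        # Shards: each shard holds a partition of the indexed data.
        .add(name="indexer", uses="docker://my-registry/indexer:latest", shards=2)
    )


# Usage (requires Jina):
# build_scaled_flow().to_kubernetes_yaml("./k8s_flow")
```

Following the description above, the exported configuration should contain one Deployment per indexer shard, while the encoder replicas are expressed through Kubernetes native replica scaling.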

See Also

The impossibility of load balancing between different replicas is a limitation of Kubernetes in combination with gRPC. If you want to learn more about this limitation, see this Kubernetes Blog post.
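The limitation comes from gRPC keeping one long-lived HTTP/2 connection and sending every request over it, while a plain Kubernetes Service picks a backend once per connection. The toy simulation below contrasts that with the per-request balancing a service mesh proxy provides. It is only a model: the round-robin choice and the replica names are assumptions made for illustration.

```python
from collections import Counter
from itertools import cycle


def connection_level_balancing(replicas, num_clients, requests_per_client):
    """Toy model of a plain K8s Service: a replica is picked once per connection."""
    hits = Counter({r: 0 for r in replicas})
    picker = cycle(replicas)
    for _ in range(num_clients):
        replica = next(picker)  # chosen when the connection is opened
        hits[replica] += requests_per_client  # every request reuses that connection
    return dict(hits)


def request_level_balancing(replicas, num_clients, requests_per_client):
    """Toy model of a service mesh proxy: a replica is picked once per request."""
    hits = Counter({r: 0 for r in replicas})
    picker = cycle(replicas)
    for _ in range(num_clients * requests_per_client):
        hits[next(picker)] += 1
    return dict(hits)


replicas = ["encoder-0", "encoder-1", "encoder-2"]
# A single client (think of one Gateway) sending 300 requests:
print(connection_level_balancing(replicas, num_clients=1, requests_per_client=300))
# {'encoder-0': 300, 'encoder-1': 0, 'encoder-2': 0}
print(request_level_balancing(replicas, num_clients=1, requests_per_client=300))
# {'encoder-0': 100, 'encoder-1': 100, 'encoder-2': 100}
```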

Scaling the Gateway#

The Gateway is responsible for providing the API of the Flow. If you have a large Flow with many Clients and many replicated Executors, the Gateway can become the bottleneck. In this case you can also scale up the Gateway deployment to be backed by multiple Kubernetes Pods. To do so, add a replicas parameter to your Gateway before converting the Flow to Kubernetes. This can be done in a Pythonic way or in YAML:

You can use config_gateway() to add the replicas parameter:

from jina import Flow

f = Flow().config_gateway(replicas=3).add()


You can add replicas in the gateway section of your Flow YAML

jtype: Flow
gateway:
  replicas: 3
executors:
  - name: encoder

Alternatively, this can be done by the regular means of Kubernetes: either increase the number of replicas in the generated YAML configuration files or add replicas while running. To expose your Gateway replicas outside Kubernetes, you can add a load balancer as described here.


You can use a custom Docker image for the Gateway deployment by setting the environment variable JINA_GATEWAY_IMAGE to the desired image before generating the configuration.
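One way to keep that setting scoped to the export step is a small context manager, sketched below. Only the variable name JINA_GATEWAY_IMAGE comes from the text above; the helper gateway_image and the image name in the usage comment are hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def gateway_image(image):
    """Temporarily set JINA_GATEWAY_IMAGE, restoring the previous value afterwards."""
    previous = os.environ.get("JINA_GATEWAY_IMAGE")
    os.environ["JINA_GATEWAY_IMAGE"] = image
    try:
        yield
    finally:
        if previous is None:
            os.environ.pop("JINA_GATEWAY_IMAGE", None)
        else:
            os.environ["JINA_GATEWAY_IMAGE"] = previous


# Usage (requires Jina; the image name is a placeholder):
# from jina import Flow
# with gateway_image("my-registry/custom-gateway:latest"):
#     Flow().config_gateway(replicas=3).add().to_kubernetes_yaml("./k8s_flow")
```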

See also#