Kubernetes Support#

Jina is a cloud-native framework and therefore runs natively and easily on Kubernetes. Deploying a Jina Flow on Kubernetes is actually the recommended way to use Jina in production.

A Flow is composed of different microservices called Executors which natively run in containers. This means that Kubernetes can natively take over the lifetime management of Executors.

Deploying a Flow on Kubernetes means wrapping these microservice containers in the appropriate K8s abstraction (Deployment, StatefulSet, and so on), exposing them internally via a K8s Service, and connecting them together by passing the right set of parameters.

Automatically translate a Flow to Kubernetes concepts#

Hint

Manually building these Kubernetes YAML objects is long and cumbersome. Therefore, we provide a helper function, to_kubernetes_yaml(), that does most of this translation work automatically.

This helper function can be called from:

  • Jina’s Python interface to translate a Flow defined in Python to K8s YAML files

  • Jina’s CLI interface to export a YAML Flow to K8s YAML files
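As a minimal sketch of the Python route, the function below builds a small Flow and exports it with to_kubernetes_yaml(). The image name, Executor name, port, and the wrapper function export_flow_to_k8s are hypothetical placeholders chosen for illustration. The Jina import is done inside the function so that the snippet can be loaded on a machine without Jina installed.

```python
def export_flow_to_k8s(output_dir: str, namespace: str) -> None:
    """Build a small Flow and write its Kubernetes manifests to output_dir."""
    # Imported lazily so this module can be loaded without Jina installed.
    from jina import Flow

    # 'docker://my-registry/encoder:latest' is a hypothetical image name.
    flow = Flow(port=8080).add(
        name="encoder",
        uses="docker://my-registry/encoder:latest",
        replicas=2,
    )
    # Writes the generated Gateway and Executor manifests under output_dir.
    flow.to_kubernetes_yaml(output_dir, k8s_namespace=namespace)
```

Calling export_flow_to_k8s("./k8s_flow", "my-namespace") should leave a directory of YAML files that can then be reviewed, edited, and applied with the usual Kubernetes tooling.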

See also

See the Flow export documentation for more detail.

Extra Kubernetes options#

In general, Jina follows a single principle when it comes to deploying in Kubernetes: You, the user, know your use case and requirements the best. This means that, while Jina generates configurations for you that run out of the box, as a professional user you should always see them as just a starting point to get you off the ground.

Hint

The export function to_kubernetes_yaml() is a helper to get you off the ground. The files it generates are meant to be updated and adapted to each use case.

Matching Jina versions

If you change the Docker images for Executor and Gateway in your Kubernetes-generated file, ensure that all of them are built with the same Jina version to guarantee compatibility.

You can’t add basic Kubernetes features like Secrets, ConfigMaps or Labels via the Pythonic or YAML interface. This is intentional and doesn’t mean that we don’t support these features. On the contrary, we let you fully express your Kubernetes configuration by using the Kubernetes API to apply your own Kubernetes standards to Jina.

Hint

We recommend you dump the Kubernetes configuration files and then edit them to suit your needs.

Here are possible configuration options you may need to add or change:

  • Add label selectors to the Deployments to suit your case

  • Add requests and limits for the resources of the different Pods

  • Set up persistent volume storage to save your data on disk

  • Pass custom configuration to your Executor with ConfigMap

  • Manage credentials of your Executor with Secrets

  • Edit the default rolling update configuration
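To make the first two items concrete, here is a minimal sketch of editing a dumped Deployment once it has been loaded into a Python dictionary (for example with a YAML library, which is an assumption and not shown). The manifest below is a trimmed, hypothetical stand-in for a generated file, and customize_deployment is an illustrative helper, not part of Jina. The sketch adds labels and resource requests and limits, and leaves the selector untouched because it has to keep matching the Pod template labels.

```python
import copy


def customize_deployment(manifest, labels, requests, limits):
    """Return a copy of a Deployment manifest with extra labels and resources."""
    patched = copy.deepcopy(manifest)
    pod_template = patched["spec"]["template"]
    # Add labels to the Deployment object and to the Pods it creates.
    patched.setdefault("metadata", {}).setdefault("labels", {}).update(labels)
    pod_template.setdefault("metadata", {}).setdefault("labels", {}).update(labels)
    # Apply the same requests and limits to every container in the Pod.
    for container in pod_template["spec"]["containers"]:
        container["resources"] = {"requests": dict(requests), "limits": dict(limits)}
    return patched


# Trimmed, hypothetical stand-in for a generated Executor Deployment.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "encoder"},
    "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {"app": "encoder"}},
        "template": {
            "metadata": {"labels": {"app": "encoder"}},
            "spec": {
                "containers": [
                    {"name": "executor", "image": "my-registry/encoder:latest"}
                ]
            },
        },
    },
}

patched = customize_deployment(
    deployment,
    labels={"team": "search"},
    requests={"cpu": "500m", "memory": "512Mi"},
    limits={"cpu": "1", "memory": "1Gi"},
)
```

Working on a deep copy keeps the original manifest available for comparison before the edited version is written back to disk.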

Required service mesh#

Caution

A service mesh must be installed and correctly configured in the K8s cluster in which you deploy your Flow.

Service meshes work by attaching a tiny proxy to each of your Kubernetes Pods, allowing for smart rerouting, load balancing, request retrying, and a host of other features.

Jina relies on a service mesh to load balance requests between replicas of the same Executor. You can use your favourite Kubernetes service mesh in combination with your Jina Flow, but the configuration files generated by to_kubernetes_yaml() already include all necessary annotations for the Linkerd service mesh.

Hint

You can use any service mesh with Jina, but Jina Kubernetes configurations come with Linkerd annotations out of the box.

To use Linkerd, follow the "install the Linkerd CLI" guide.

Caution

Many service meshes can perform retries themselves. Be careful about setting up service-mesh-level retries alongside Jina, as they may interact with Jina’s own retry policy and lead to unwanted behaviour.

Instead, you can disable Jina level retries by setting Flow(retries=0) in Python, or retries: 0 in the Flow YAML’s with block.
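To see why stacked retries deserve care, consider a simplified model. It assumes that each layer independently retries a failing request a fixed number of times, which is an illustration only and not a description of Jina's actual retry policy. Under that assumption, every Jina-level attempt can itself be retried by the mesh proxy, so the worst-case number of attempts multiplies:

```python
def worst_case_attempts(jina_retries: int, mesh_retries: int) -> int:
    """Upper bound on attempts when both layers retry independently."""
    if jina_retries < 0 or mesh_retries < 0:
        raise ValueError("retry counts must be non-negative in this model")
    # Each Jina-level attempt can itself be retried by the mesh proxy.
    return (1 + jina_retries) * (1 + mesh_retries)


for jina_retries in (0, 3):
    attempts = worst_case_attempts(jina_retries, mesh_retries=3)
    print(f"jina_retries={jina_retries}: up to {attempts} attempts")
```

With three retries at each level, a single failing request could be attempted up to 16 times in this model, compared with 4 when Jina-level retries are disabled.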

Scaling Executors: Replicas and shards#

Jina supports two types of scaling:

  • Replicas can be used with any Executor type and are typically used for performance and availability.

  • Shards are used for partitioning data and should only be used with indexers since they store state.

Check here for more information about these scaling mechanisms.

For shards, Jina creates one separate Deployment in Kubernetes per shard. Setting f.add(..., shards=num_shards) is sufficient to create a corresponding Kubernetes configuration.

For replicas, Jina uses Kubernetes native replica scaling and relies on a service mesh to load-balance requests between replicas of the same Executor. Without a service mesh installed in your Kubernetes cluster, all traffic will be routed to the same replica.
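The difference can be illustrated with a small, self-contained simulation. It is a conceptual sketch, not Jina or Kubernetes code: the replica names are hypothetical, and round-robin stands in for whatever policy a mesh proxy actually applies. A gRPC client keeps one long-lived connection, so the backend is chosen once when the connection is opened, whereas a mesh proxy can choose a backend for every request.

```python
from collections import Counter
from itertools import cycle

# Hypothetical Pod names for three replicas of one Executor.
REPLICAS = ["executor-0", "executor-1", "executor-2"]


def connection_level(num_requests: int) -> Counter:
    """One long-lived connection: the backend is chosen once, at connect time."""
    backend = REPLICAS[0]  # whichever Pod was picked when connecting
    return Counter(backend for _ in range(num_requests))


def request_level(num_requests: int) -> Counter:
    """A mesh proxy picks a backend per request (round-robin in this sketch)."""
    backends = cycle(REPLICAS)
    return Counter(next(backends) for _ in range(num_requests))


print(connection_level(9))
print(request_level(9))
```

In the first case all nine requests land on a single replica. In the second they are spread evenly, three per replica.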

See also

The inability to load balance between different replicas is a limitation of Kubernetes in combination with gRPC. If you want to learn more about this limitation, see this Kubernetes Blog post.

Scaling the Gateway#

The Gateway is responsible for providing the API of the Flow. If you have a large Flow with many Clients and many replicated Executors, the Gateway can become the bottleneck. In this case you can also scale up the Gateway Deployment so that it is backed by multiple Kubernetes Pods. This is done by the regular means of Kubernetes: either increase the number of replicas in the generated YAML configuration files, or add replicas while the Flow is running. To expose your Gateway replicas outside Kubernetes, you can add a load balancer as described here.

Hint

You can use a custom Docker image for the Gateway deployment by setting the environment variable JINA_GATEWAY_IMAGE to the desired image before generating the configuration.
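A minimal sketch of setting that variable from Python, assuming the export runs later in the same process. The image name and the helper use_custom_gateway_image are hypothetical; the variable can equally be set in the shell before running the export.

```python
import os


def use_custom_gateway_image(image: str) -> None:
    """Set JINA_GATEWAY_IMAGE so that later exports pick up the custom image."""
    os.environ["JINA_GATEWAY_IMAGE"] = image


# Hypothetical image name; build it with the same Jina version as the Executors.
use_custom_gateway_image("my-registry/custom-gateway:latest")
# Generate the Kubernetes configuration only after this point.
```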

See also#