How to deploy Jina on Kubernetes#
Jina natively supports deploying your Flow and Executors on Kubernetes. A Flow is composed of Executors, which run Python code defined to operate on a DocumentArray. These Executors live in different runtimes depending on how you want to deploy your Flow.
When deployed in Kubernetes, these Executors run inside Kubernetes Pods as containers, and their lifetime is handled by Kubernetes.
Deploying a Flow in Kubernetes is the recommended way of using Jina in production.
Jina relies on Service Mesh#
Jina’s Kubernetes support comes with a built-in integration of Linkerd as the service mesh connecting all the Jina services in Kubernetes. All Jina deployment resources are annotated with the necessary Linkerd annotations, so we recommend that every Jina user install Linkerd into the Kubernetes cluster they run Jina in.
Linkerd automatically load balances traffic between all replicas of an Executor, and it offers additional features on top of that.
It is also generally possible to use other service mesh providers (like Istio) with Jina. To do so, you need to manually change Jina’s deployment configuration in the generated YAML files.
Caution
If you don’t install a third-party service mesh (like Linkerd), you will not be able to scale the number of replicas per Executor beyond one: a single replica will always handle all the traffic, no matter how many replicas are running.
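For illustration, the generated Deployments carry the standard Linkerd injection annotation on their Pod template, which is what makes the mesh pick them up. A minimal sketch (the Deployment name is hypothetical; the exact set of annotations may vary between Jina versions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: encoder            # hypothetical Executor name
spec:
  template:
    metadata:
      annotations:
        linkerd.io/inject: enabled   # tells Linkerd to inject its sidecar proxy
```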
Deploy your Flow#
To deploy a Flow on Kubernetes, you first have to generate Kubernetes YAML configuration files from the Jina Flow. Then you can use the kubectl apply command to create or update your Flow resources within your cluster.
Caution
All Executors in the Flow should be used with jinahub+docker://... or docker://....
To generate YAML configurations for Kubernetes from a Jina Flow, one just needs to call:
flow.to_k8s_yaml('flow_k8s_configuration')
This will create a folder ‘flow_k8s_configuration’ with a set of Kubernetes YAML configurations for all the Deployments composing the Flow.
Hint
You can use a custom Docker image for the Gateway deployment. Just set the environment variable JINA_GATEWAY_IMAGE to the desired image before generating the configuration.
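As a minimal sketch (the image name here is hypothetical; replace it with your own registry and tag):

```python
import os

# Hypothetical image name -- replace with your own registry and tag.
os.environ['JINA_GATEWAY_IMAGE'] = 'my-registry/custom-gateway:1.0'

# The variable must be set before generating the configuration, e.g.:
# from jina import Flow
# Flow().to_k8s_yaml('flow_k8s_configuration')
```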
Caution
The default Deployment configurations generated by the Flow contain no special configuration objects; for instance, no Persistent Volume object is added. You may want to adapt them to your own needs.
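For example, if an Executor needs to persist its data, you could add a PersistentVolumeClaim yourself and mount it into that Executor's Deployment. A hedged sketch (the claim name and storage size are hypothetical):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: indexer-workspace    # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi           # illustrative size
```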
Scaling Executors on Kubernetes#
In Jina we support two ways of scaling:
Replicas can be used with any Executor type and is typically used for performance and availability.
Shards are used for partitioning data and should only be used with indexers since they store state.
Check here for more information.
Jina creates a separate Deployment in Kubernetes per Shard and uses Kubernetes native replica scaling to create multiple Replicas of a Deployment.
Once the Flow is deployed on Kubernetes, you can use all the native Kubernetes tools, like kubectl, to perform operations on the Pods and Deployments. For example, you can use this to add or remove replicas, or to run rolling update operations.
Scaling the Gateway#
The Gateway is responsible for providing the API of the Flow. If you have a large Flow with many Clients and many replicated Executors, the Gateway can become the bottleneck. In this case, you can scale up the Gateway deployment to be backed by multiple Kubernetes Pods. This is done by the regular means of Kubernetes: either increase the number of replicas in the generated YAML configuration files, or add replicas while the Flow is running. To expose your Gateway replicas outside Kubernetes, you can add a load balancer as described here.
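As a sketch of the first option, you could raise the replica count in the generated gateway Deployment file (the value 3 is illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gateway
spec:
  replicas: 3   # scale the Gateway from 1 to 3 Pods
```

Alternatively, while the Flow is running, the same effect can be achieved with kubectl scale on the gateway Deployment in your namespace.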
Extra Kubernetes options#
You may notice that you can’t add basic Kubernetes features like Secrets, ConfigMaps or Labels via the Pythonic interface. This is intentional and does not mean that Jina doesn’t support these features. On the contrary, we allow you to fully express your Kubernetes configuration by using the Kubernetes API, so that you can apply your own Kubernetes standards to Jina.
Hint
We recommend dumping the Kubernetes configuration files and then editing the files to suit your needs.
Here are possible configuration options you may need to add or change:
- Add labels and selectors to the Deployments to suit your case
- Add requests and limits for the resources of the different Pods
- Set up persistent volume storage to save your data on disk
- Pass custom configuration to your Executor with a ConfigMap
- Manage the credentials of your Executor with Secrets
- Edit the default rolling update configuration
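As a sketch of what the first two kinds of edits could look like in a generated Deployment file (all names and values here are hypothetical, not part of the generated defaults):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: encoder
  labels:
    app.kubernetes.io/part-of: my-flow   # hypothetical extra label
spec:
  template:
    spec:
      containers:
        - name: executor
          resources:
            requests:            # scheduler guarantees at least this much
              cpu: 500m
              memory: 512Mi
            limits:              # container is throttled / killed beyond this
              cpu: "1"
              memory: 1Gi
```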
Example: Deploying a Flow with Kubernetes#
Preliminaries#
Set up a Kubernetes cluster and configure cluster access locally.
Tip
For local testing, minikube is recommended.
See also
There are also various managed Kubernetes cluster solutions you could use.
Indexing and searching images using CLIP image encoder and PQLiteIndexer#
This example shows how to build and deploy a Flow in Kubernetes with CLIPImageEncoder as encoder and PQLiteIndexer as indexer.
from jina import Flow
f = (
    Flow(port=8080)
    .add(name='encoder', uses='jinahub+docker://CLIPEncoder', replicas=2)
    .add(
        name='indexer',
        uses='jinahub+docker://PQLiteIndexer',
        uses_with={'dim': 512},
        shards=2,
    )
)
Now, we can generate Kubernetes YAML configs from the Flow:
f.to_k8s_yaml('./k8s_flow', k8s_namespace='custom-namespace')
You should expect the following file structure generated:
.
└── k8s_flow
    ├── gateway
    │   └── gateway.yml
    ├── encoder
    │   ├── encoder.yml
    │   └── encoder-head.yml
    └── indexer
        ├── indexer-0.yml
        ├── indexer-1.yml
        └── indexer-head.yml
You may need to edit these files to add your custom configuration. As you can see, the generated configuration covers the Gateway as well as all the Executors of the Flow.
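The file layout above mirrors a simple naming convention: one Deployment per Executor (or one per shard when shards > 1), plus a head per Executor that routes requests to its shards, plus the Gateway. A small illustrative sketch of that convention (a hypothetical helper, not part of Jina's API):

```python
def k8s_deployment_names(executors):
    """Expected Deployment names for a Flow, mirroring the generated file layout.

    `executors` maps an Executor name to its number of shards.
    """
    names = ['gateway']
    for name, shards in executors.items():
        if shards > 1:
            # one Deployment per shard ...
            names += [f'{name}-{i}' for i in range(shards)]
        else:
            names.append(name)
        # ... plus a head that routes requests to the Executor
        names.append(f'{name}-head')
    return names

# The example Flow above: encoder (1 shard, 2 replicas), indexer (2 shards)
print(k8s_deployment_names({'encoder': 1, 'indexer': 2}))
```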
Let’s create a Kubernetes namespace for our Flow:
kubectl create namespace custom-namespace
Now, you can deploy this Flow to your cluster in the following way:
kubectl apply -R -f ./k8s_flow
We can check that the pods were created:
kubectl get pods -n custom-namespace
NAME READY STATUS RESTARTS AGE
encoder-8b5575cb9-bh2x8 1/1 Running 0 60m
encoder-8b5575cb9-gx78g 1/1 Running 0 60m
encoder-head-55bbb477ff-p2bmk 1/1 Running 0 60m
gateway-7df8765bd9-xf5tf 1/1 Running 0 60m
indexer-0-8f676fc9d-4fh52 1/1 Running 0 60m
indexer-1-55b6cc9dd8-gtpf6 1/1 Running 0 60m
indexer-head-6fcc679d95-8mrm6 1/1 Running 0 60m
Note that the Jina gateway was deployed with name gateway-7df8765bd9-xf5tf.
Once we see that all the Deployments in the Flow are ready, we can start indexing documents.
import portforward

from jina.clients import Client
from docarray import DocumentArray

with portforward.forward('custom-namespace', 'gateway-7df8765bd9-xf5tf', 8080, 8080):
    client = Client(host='localhost', port=8080)
    client.show_progress = True
    docs = client.post(
        '/index',
        inputs=DocumentArray.from_files('./imgs/*.png').apply(
            lambda d: d.convert_uri_to_datauri()
        ),
    )
    print(f'Indexed documents: {len(docs)}')
Exposing your Flow#
The previous example used port-forwarding to index documents in the Flow. Thinking about real-world applications, you might want to expose your service to make it reachable by users, so that you can serve search requests.
Caution
Exposing your Flow only works if the environment of your Kubernetes cluster supports external load balancers.
Once the Flow is deployed, you can expose a service.
kubectl expose deployment gateway --name=gateway-exposed --type LoadBalancer --port 80 --target-port 8080 -n custom-namespace
sleep 60 # wait until the external ip is configured
Export the external IP; it is needed by the client in the next section when sending documents to the Flow.
export EXTERNAL_IP=`kubectl get service gateway-exposed -n custom-namespace -o=jsonpath='{.status.loadBalancer.ingress[0].ip}'`
Client#
The client sends an image to the exposed Flow on $EXTERNAL_IP and retrieves the matches from the Flow. Finally, it prints the uri of the closest matches.
import os
from jina.clients import Client
from docarray import DocumentArray
host = os.environ['EXTERNAL_IP']
port = 80
client = Client(host=host, port=port)
client.show_progress = True
docs = DocumentArray.from_files("./imgs/*.png").apply(
lambda d: d.convert_uri_to_datauri()
)
queried_docs = client.post("/search", inputs=docs)
matches = queried_docs[0].matches
print(f"Matched documents: {len(matches)}")