Deploy on Kubernetes#
This how-to will go through deploying a simple Flow using Kubernetes, customizing the Kubernetes configuration to your needs, and scaling Executors using replicas and shards.
Deploying a Flow in Kubernetes is the recommended way to use Jina in production.
Since a Flow is composed of Executors, which can run in different runtimes depending on how you deploy the Flow, Kubernetes can easily take over the lifetime management of Executors.
Hint
This guide is designed for users who want to manually deploy a Jina project on Kubernetes.
Check out Jina AI Cloud Hosting if you want a one-click solution to deploy and host Jina, leveraging a cloud-native stack of Kubernetes, Prometheus and Grafana, without worrying about provisioning.
Preliminaries#
To follow this how-to, you need access to a Kubernetes cluster.
You can either set up minikube or use one of the many managed Kubernetes solutions in the cloud.
You also need Linkerd in your K8s cluster: install the Linkerd CLI and its control plane. Linkerd then automatically sets up and manages the service mesh proxies when you deploy the Flow.
To understand why you need a service mesh like Linkerd, refer to this section.
Deploy a simple Flow#
By simple in this context we mean a Flow without replicated or sharded Executors - you can see how to use those in Kubernetes later on.
For now, define a Flow, either in YAML or directly in Python, as we do here:
from jina import Flow

f = (
    Flow(port=8080)
    .add(name='encoder', uses='jinaai+docker://jina-ai/CLIPEncoder')
    .add(
        name='indexer',
        uses='jinaai+docker://jina-ai/AnnLiteIndexer',
        uses_with={'dim': 512},
    )
)
You can define essentially any Flow you like. Just ensure that all Executors are containerized, either by using a Docker-based Hub scheme (such as ‘jinaai+docker’ in the example above, or ‘jinahub+docker’), or by containerizing your local Executors.
The example Flow here simply encodes and indexes text or image data using two Executors from Executor Hub.
Next, generate Kubernetes YAML configs from the Flow. It’s good practice to define a new Kubernetes namespace for that purpose:
f.to_kubernetes_yaml('./k8s_flow', k8s_namespace='custom-namespace')
The following file structure is generated. Don’t worry if yours is slightly different, since the layout can change from one Jina version to another:
.
└── k8s_flow
    ├── gateway
    │   └── gateway.yml
    ├── encoder
    │   └── encoder.yml
    └── indexer
        └── indexer.yml
You can inspect these files to see how Flow concepts are mapped to Kubernetes entities. And as always, feel free to modify these files as you see fit for your use case.
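To get a quick overview before opening each file, a minimal sketch like the following lists which Kubernetes object kinds each generated file declares. It uses only the standard library and assumes each manifest declares its kind on a top-level `kind:` line; the function name `summarize_manifests` is hypothetical and not part of Jina.

```python
import re
from pathlib import Path

# Matches a top-level "kind: Something" line in a Kubernetes manifest.
# Assumption: the generated manifests put "kind:" at the start of a line.
KIND_PATTERN = re.compile(r"^kind:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def summarize_manifests(root):
    """Map each .yml file under `root` to the list of kinds it declares."""
    root = Path(root)
    summary = {}
    for path in sorted(root.rglob("*.yml")):
        kinds = KIND_PATTERN.findall(path.read_text())
        # Use forward slashes so the keys look the same on every platform.
        summary[path.relative_to(root).as_posix()] = kinds
    return summary
```

Calling `summarize_manifests('./k8s_flow')` after generating the files returns a dictionary keyed by relative file path, which you can print or compare between Jina versions.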
Caution: Executor YAML configurations
As a general rule, the configuration files produced by to_kubernetes_yaml() should run out of the box, and if you strictly follow this how-to they will.
However, there is one exception: if you use a local dockerized Executor, and this Executor’s configuration is stored in a file other than config.yaml, you have to adapt this Executor’s Kubernetes YAML.
To do this, open the file and replace config.yaml with the actual path to the Executor configuration.
This is because when a Flow contains a Docker image, it can’t see what Executor configuration was used to create that image.
Since all of our tutorials use config.yaml for that purpose, the Flow uses this as a best guess.
Please adapt this if you named your Executor configuration file differently.
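If you prefer to script this replacement rather than edit the file by hand, the following is a minimal sketch. It assumes the generated container arguments list `--uses` on one line followed by the configuration file name on the next, as in the Deployment excerpt later in this how-to; the helper name `point_to_executor_config` and the example path are hypothetical.

```python
def point_to_executor_config(manifest_text, actual_path, default="config.yaml"):
    """Return the manifest with the value following `--uses` replaced.

    Only a `- <default>` line that directly follows a `- --uses` line is
    changed, so unrelated mentions of the file name are left alone.
    """
    lines = manifest_text.splitlines(keepends=True)
    replaced = 0
    for i in range(1, len(lines)):
        follows_uses = lines[i - 1].strip() == "- --uses"
        if follows_uses and lines[i].strip() == f"- {default}":
            lines[i] = lines[i].replace(default, actual_path, 1)
            replaced += 1
    if replaced == 0:
        raise ValueError(f"no '--uses {default}' argument found in manifest")
    return "".join(lines)
```

Read the Executor's YAML file, pass its text through this function, and write the result back. If your generated file names the default differently, pass that name as `default`.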
Next, apply these configuration files to your cluster using kubectl.
This launches all Flow microservices.
First, create the namespace you defined earlier:
kubectl create namespace custom-namespace
Now, deploy this Flow to your cluster:
kubectl apply -R -f ./k8s_flow
Check that the Pods were created:
kubectl get pods -n custom-namespace
NAME                            READY   STATUS    RESTARTS   AGE
encoder-8b5575cb9-bh2x8         1/1     Running   0          60m
gateway-7df8765bd9-xf5tf        1/1     Running   0          60m
indexer-8f676fc9d-4fh52         1/1     Running   0          60m
indexer-head-6fcc679d95-8mrm6   1/1     Running   0          60m
Note that the Jina gateway was deployed with the name gateway-7df8765bd9-xf5tf. The suffix is generated by Kubernetes, so it will differ on your cluster.
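Because the Pod name suffix changes on every deployment, it can help to look the gateway Pod up instead of copying it by hand. The sketch below parses the text output of `kubectl get pods` (assuming the default column layout shown above), checks that every Pod is ready, and finds a Pod by name prefix. The helper names are hypothetical, and the sample listing is the one from this how-to.

```python
SAMPLE_LISTING = """\
NAME                            READY   STATUS    RESTARTS   AGE
encoder-8b5575cb9-bh2x8         1/1     Running   0          60m
gateway-7df8765bd9-xf5tf        1/1     Running   0          60m
indexer-8f676fc9d-4fh52         1/1     Running   0          60m
indexer-head-6fcc679d95-8mrm6   1/1     Running   0          60m
"""


def parse_pods(listing):
    """Turn `kubectl get pods` text output into a list of dicts, one per Pod."""
    lines = [line for line in listing.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def all_ready(pods):
    """True when every Pod reports READY as n/n and STATUS as Running."""
    for pod in pods:
        ready, total = pod["READY"].split("/")
        if ready != total or pod["STATUS"] != "Running":
            return False
    return True


def find_pod(pods, prefix):
    """Return the name of the first Pod whose name starts with `prefix`."""
    for pod in pods:
        if pod["NAME"].startswith(prefix):
            return pod["NAME"]
    raise LookupError(f"no Pod name starts with {prefix!r}")
```

On a real cluster you would feed these functions the output of `kubectl get pods -n custom-namespace`, then pass the result of `find_pod(pods, 'gateway-')` to the port-forwarding call.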
Once you see that all the Deployments in the Flow are ready, you can start indexing documents:
import portforward

from jina.clients import Client
from docarray import DocumentArray

with portforward.forward('custom-namespace', 'gateway-7df8765bd9-xf5tf', 8080, 8080):
    client = Client(host='localhost', port=8080)
    client.show_progress = True
    docs = client.post(
        '/index',
        inputs=DocumentArray.from_files('./imgs/*.png').apply(
            lambda d: d.convert_uri_to_datauri()
        ),
    )

    print(f' Indexed documents: {len(docs)}')
Deploy Flow with shards and replicas#
After your service mesh is installed, your cluster is ready to run a Flow with scaled Executors. You can adapt the Flow from above to work with two replicas for the encoder, and two shards for the indexer:
from jina import Flow

f = (
    Flow(port=8080)
    .add(name='encoder', uses='jinaai+docker://jina-ai/CLIPEncoder', replicas=2)
    .add(
        name='indexer',
        uses='jinaai+docker://jina-ai/AnnLiteIndexer',
        uses_with={'dim': 512},
        shards=2,
    )
)
Again, you can generate your Kubernetes configuration:
f.to_kubernetes_yaml('./k8s_flow', k8s_namespace='custom-namespace')
Now you should see the following file structure:
.
└── k8s_flow
    ├── gateway
    │   └── gateway.yml
    ├── encoder
    │   └── encoder.yml
    └── indexer
        ├── indexer-0.yml
        ├── indexer-1.yml
        └── indexer-head.yml
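The two listings in this how-to suggest a simple naming pattern: replicas do not add files (the encoder still has a single encoder.yml), while shards produce one file per shard plus a head. The sketch below encodes only that observed pattern so you can check a generated directory against it. It is not Jina's own logic, the function name is hypothetical, and the layout can change between Jina versions.

```python
def expected_manifests(name, shards=1):
    """File names observed in this how-to for an Executor called `name`.

    Assumption: mirrors the listings shown above; not taken from Jina's code.
    """
    if shards <= 1:
        # Unsharded Executors (with or without replicas) get a single file.
        return [f"{name}/{name}.yml"]
    files = [f"{name}/{name}-{i}.yml" for i in range(shards)]
    files.append(f"{name}/{name}-head.yml")
    return files
```

For the Flow above, `expected_manifests('indexer', shards=2)` yields the three indexer files from the listing.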
Hint: Cluster cleanup
If you already have the simple Flow from the first example running on your cluster, delete it first using kubectl delete -R -f ./k8s_flow.
Apply your configuration as usual:
kubectl apply -R -f ./k8s_flow
Deploy Flow with custom environment variables and secrets#
You can customize the environment variables that are available inside the Executor runtime, either by defining them directly or by reading them from a Kubernetes secret:
from jina import Flow

f = (
    Flow(port=8080)
    .add(
        name='indexer',
        uses='jinaai+docker://jina-ai/AnnLiteIndexer',
        uses_with={'dim': 512},
        env={'k1': 'v1', 'k2': 'v2'},
        env_from_secret={
            'SECRET_USERNAME': {'name': 'mysecret', 'key': 'username'},
            'SECRET_PASSWORD': {'name': 'mysecret', 'key': 'password'},
        },
    )
)

f.to_kubernetes_yaml('./k8s_flow', k8s_namespace='custom-namespace')
Alternatively, define the same Flow in a flow.yml file:
jtype: Flow
version: '1'
with:
  protocol: http
executors:
- name: indexer
  uses: jinaai+docker://jina-ai/AnnLiteIndexer
  uses_with:
    dim: 512
  env:
    k1: v1
    k2: v2
  env_from_secret:
    SECRET_USERNAME:
      name: mysecret
      key: username
    SECRET_PASSWORD:
      name: mysecret
      key: password
You can then generate Kubernetes YAML configs using jina export:
jina export kubernetes flow.yml ./k8s_flow --k8s-namespace custom-namespace
After creating the namespace, you need to create the secrets mentioned above:
kubectl -n custom-namespace create secret generic mysecret --from-literal=username=jina --from-literal=password=123456
Then you can apply your configuration.
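Inside the running container, these values arrive as ordinary environment variables. The following minimal sketch shows how Executor code might read them. The variable names are the ones configured above; the function name `read_indexer_settings` is hypothetical, and accepting an `environ` mapping is only there to make the function easy to test.

```python
import os


def read_indexer_settings(environ=None):
    """Collect the plain and secret-backed variables configured for the indexer."""
    environ = os.environ if environ is None else environ
    required = ("SECRET_USERNAME", "SECRET_PASSWORD")
    missing = [key for key in required if key not in environ]
    if missing:
        # Fail early with a clear message if the secret was not created.
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {
        "k1": environ.get("k1"),
        "k2": environ.get("k2"),
        "username": environ["SECRET_USERNAME"],
        "password": environ["SECRET_PASSWORD"],
    }
```

Failing early when a secret-backed variable is absent makes a forgotten `kubectl create secret` step easier to diagnose than a later authentication error.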
Exposing the Flow#
The previous examples use port-forwarding to send documents to the Flow. In real-world applications, you may want to expose your service so that users can reach it and you can serve search requests.
Caution
Exposing the Flow only works if the environment of your Kubernetes cluster supports External Loadbalancers.
Once the Flow is deployed, you can expose a service:
kubectl expose deployment gateway --name=gateway-exposed --type LoadBalancer --port 80 --target-port 8080 -n custom-namespace
sleep 60 # wait until the external ip is configured
Export the external IP address. This is needed for the client when sending Documents to the Flow in the next section.
export EXTERNAL_IP=`kubectl get service gateway-exposed -n custom-namespace -o=jsonpath='{.status.loadBalancer.ingress[0].ip}'`
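A fixed `sleep 60` can be too short or needlessly long. As an alternative, the sketch below polls until the address is assigned or a timeout expires. It shells out to the same kubectl query as above and assumes that kubectl prints an empty string while the address is still pending; the function names and the default timeout values are hypothetical.

```python
import subprocess
import time


def get_external_ip(service="gateway-exposed", namespace="custom-namespace"):
    """Run the kubectl query from above; assumed to return '' while pending."""
    result = subprocess.run(
        [
            "kubectl", "get", "service", service, "-n", namespace,
            "-o", "jsonpath={.status.loadBalancer.ingress[0].ip}",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def wait_for_external_ip(fetch=get_external_ip, timeout=120.0, interval=5.0,
                         sleep=time.sleep):
    """Call `fetch` until it returns a non-empty address or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        ip = fetch()
        if ip:
            return ip
        if time.monotonic() >= deadline:
            raise TimeoutError("external IP was not assigned in time")
        sleep(interval)
```

You can set `os.environ['EXTERNAL_IP'] = wait_for_external_ip()` before creating the client, which keeps the client code in the next section unchanged.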
Client#
The client:
Sends an image to the exposed Flow on $EXTERNAL_IP.
Retrieves the matches from the Flow.
Prints the number of matches found for the first Document.
You should configure your Client to connect to the Flow via the external IP address as follows:
import os

from jina.clients import Client
from docarray import DocumentArray

# EXTERNAL_IP must be exported in the shell beforehand (see the previous section)
host = os.environ['EXTERNAL_IP']
port = 80

client = Client(host=host, port=port)
client.show_progress = True

docs = DocumentArray.from_files("./imgs/*.png").apply(
    lambda d: d.convert_uri_to_datauri()
)
queried_docs = client.post("/search", inputs=docs)

# Inspect the matches returned for the first query Document
matches = queried_docs[0].matches
print(f"Matched documents: {len(matches)}")
Update your Executor in Kubernetes#
In Kubernetes, you can update your Executors by patching the Deployment corresponding to your Executor.
For instance, in the example above, you can change the CLIPEncoder’s batch_size parameter by editing the Deployment inside the Executor’s YAML file (encoder.yml in this example) dumped by .to_kubernetes_yaml().
Add --uses-with and pass the batch size argument to it as a JSON string. This is passed to the container inside the Deployment:
spec:
  containers:
  - args:
    - executor
    - --name
    - encoder
    - --k8s-namespace
    - custom-namespace
    - --uses
    - config.yml
    - --port
    - '8080'
    - --uses-metas
    - '{}'
    - --uses-with
    - '{"batch_size": 64}'
    - --native
    command:
    - jina
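The value after `--uses-with` is a JSON object written as a single string. If you generate or patch these arguments from a script, a small helper like the sketch below keeps the quoting correct. It works on a plain Python list of arguments and leaves loading and saving the YAML file to you; the name `set_uses_with` is hypothetical.

```python
import json


def set_uses_with(args, params):
    """Return a copy of the container args with --uses-with set to `params`."""
    value = json.dumps(params)
    updated = list(args)
    if "--uses-with" in updated:
        index = updated.index("--uses-with")
        if index + 1 < len(updated):
            # Overwrite the existing JSON value.
            updated[index + 1] = value
        else:
            # The flag was last and had no value yet.
            updated.append(value)
    else:
        updated.extend(["--uses-with", value])
    return updated
```

For example, `set_uses_with(['executor', '--name', 'encoder'], {'batch_size': 64})` appends the flag together with the same JSON string used in the excerpt above.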
After doing so, re-apply your configuration so the new Executor will be deployed without affecting the other unchanged Deployments:
kubectl apply -R -f ./k8s_flow
Other patching options
In Kubernetes, Executors are ordinary Deployments, so you can use other patching options provided by Kubernetes:
kubectl replace to replace an Executor using a complete configuration file
kubectl patch to patch an Executor using only a partial configuration file
kubectl edit to edit an Executor configuration on the fly in your editor
You can find more information about these commands in the official Kubernetes documentation.
Key takeaways#
In short, there are just three key steps to deploy a Jina Flow on Kubernetes:
Use f.to_kubernetes_yaml() to generate Kubernetes configuration files from a Jina Flow object.
Apply the generated files via kubectl (modify the generated files if necessary).
Expose your Flow outside the K8s cluster.