How to run Jina with Docker Compose
Jina is a cloud-native neural search framework, so one of the simplest ways to prototype or serve in production is to run your Flow with docker-compose.
A Flow is composed of Executors, which run Python code that operates on a DocumentArray. These Executors live in different runtimes depending on how you deploy your Flow.
By default, if you serve your Flow locally, the Executors live in processes. Because Jina is cloud native, however, your Flow can just as easily manage Executors that live in containers and are orchestrated by your favorite tools. One of the simplest is Docker Compose, which is supported out of the box.
You can export your Flow with one line:
flow.to_docker_compose_yaml('docker-compose.yml')
Jina will generate a docker-compose.yml configuration file that corresponds to your Flow and that you can use directly with docker-compose, avoiding the overhead of manually defining all the services the Flow needs.
Caution
All Executors in the Flow should use the jinahub+docker://... or docker://... scheme.
Caution
If you use Executors that rely on Docker images built with a Jina version prior to 3.1.3, remove the health checks from the dumped YAML file, as they are only compatible with 3.1.3+. Otherwise, your Docker Compose services will always be reported as `unhealthy`.
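If many services are affected, removing the health checks by hand gets tedious. The following is a minimal sketch of one way to do it, assuming the compose file has already been parsed into a Python dict and that health checks sit under the standard Compose `healthcheck` key of each service. The helper name `strip_healthchecks` and the sample data are hypothetical and not part of Jina.

```python
import copy


def strip_healthchecks(compose: dict) -> dict:
    """Return a copy of a parsed compose config without per-service health checks."""
    cleaned = copy.deepcopy(compose)
    for service in cleaned.get('services', {}).values():
        # 'healthcheck' is the standard Compose key for a service health check.
        service.pop('healthcheck', None)
    return cleaned


# Hypothetical, trimmed-down stand-in for a parsed docker-compose.yml.
example = {
    'version': '3.3',
    'services': {
        'encoder-head': {
            'image': 'hypothetical/encoder',
            'healthcheck': {'test': ['CMD', 'true']},
        },
        'gateway': {'image': 'hypothetical/gateway', 'ports': ['8080:8080']},
    },
}

cleaned = strip_healthchecks(example)
print(sorted(cleaned['services']['encoder-head']))
```

To apply this to a real file, one option (assuming PyYAML is installed) is to load the file with `yaml.safe_load`, pass the result through the helper, and write it back with `yaml.safe_dump`.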
Example: Indexing and searching images using CLIP image encoder and PQLiteIndexer
Deploy your Flow
Caution
First ensure Docker Compose is installed locally.
Caution
Before starting this example, make sure that the CLIPImageEncoder and PQLiteIndexer images are already pulled to your local machine.
You can use:
jina hub pull jinahub+docker://CLIPImageEncoder
jina hub pull jinahub+docker://PQLiteIndexer
This example shows how to build and deploy a Flow with Docker Compose, using CLIPImageEncoder as an image encoder and PQLiteIndexer as an indexer to perform fast nearest neighbor retrieval on image embeddings.
from jina import Flow

f = (
    Flow(port=8080, protocol='http')
    .add(name='encoder', uses='jinahub+docker://CLIPImageEncoder', replicas=2)
    .add(
        name='indexer',
        uses='jinahub+docker://PQLiteIndexer',
        uses_with={'dim': 512},
        shards=2,
    )
)
Now, we can generate the Docker Compose YAML configuration from the Flow:
f.to_docker_compose_yaml('docker-compose.yml')
Hint
You can use a custom Docker image for the Gateway service. Just set the environment variable JINA_GATEWAY_IMAGE to the desired image before generating the configuration.
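As a small illustration of the hint, the variable can be set from Python before the export call. This is only a sketch: the image name below is a hypothetical placeholder, and the export line is left as a comment because it needs the Flow `f` defined earlier.

```python
import os

# Hypothetical image name; replace it with the Gateway image you actually want.
os.environ['JINA_GATEWAY_IMAGE'] = 'my-registry/my-gateway:latest'

# With the variable set, generate the configuration as before
# (requires the Flow `f` defined earlier):
# f.to_docker_compose_yaml('docker-compose.yml')
```

Setting the variable in the shell before running your script should work equally well, since only the process environment matters.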
Let’s take a look at the generated compose file:
version: '3.3'
...
services:
  encoder-head:    # # # # # # # # # # #
                   #                   #
  encoder-rep-0:   #      Encoder      #
                   #                   #
  encoder-rep-1:   # # # # # # # # # # #

  indexer-head:    # # # # # # # # # # #
                   #                   #
  indexer-0:       #      Indexer      #
                   #                   #
  indexer-1:       # # # # # # # # # # #

  gateway:
    ...
    ports:
    - 8080:8080
Caution
The default compose file generated by the Flow contains no special configuration or settings. You may want to adapt it to your own needs.
Here you can see that 7 services will be created:
1 for the gateway, which is the entrypoint of the Flow.
3 associated with the encoder: one for the Head and two for the Replicas.
3 associated with the indexer: one for the Head and two for the Shards.
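The service count can be reproduced with a short sketch. It only mirrors the naming visible in the compose file of this example (`<name>-head`, `<name>-rep-<i>` for Replicas, `<name>-<i>` for Shards) and is not Jina's actual generation logic. The helper `expected_services` is hypothetical, and other configurations, such as an Executor with both Shards and Replicas, are not covered.

```python
def expected_services(executors: dict) -> list:
    """List the service names expected for the example Flow layout."""
    names = []
    for name, cfg in executors.items():
        # Each replicated or sharded Executor in the example gets a Head service.
        names.append(f'{name}-head')
        if 'replicas' in cfg:
            names += [f'{name}-rep-{i}' for i in range(cfg['replicas'])]
        elif 'shards' in cfg:
            names += [f'{name}-{i}' for i in range(cfg['shards'])]
    # The gateway is the single entrypoint of the Flow.
    names.append('gateway')
    return names


services = expected_services({'encoder': {'replicas': 2}, 'indexer': {'shards': 2}})
print(len(services), services)
```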
Now, you can deploy this Flow:
docker-compose -f docker-compose.yml up
Use your search engine and query your Flow
Now that your Flow is up and running with docker-compose, you can query it.
Once all the services in the Flow are ready, we can start sending index and search requests.
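If you would rather not watch the logs, a script can at least wait until the gateway port accepts connections. The sketch below uses only the Python standard library, and the helper `wait_for_port` is hypothetical and not part of Jina. Note that an open port only shows that the gateway is listening, not that every Executor behind it is ready, so treat it as a rough check.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connection means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait a little and retry.
            time.sleep(interval)
    return False


# Example usage for the Flow above, which exposes the gateway on port 8080:
# if not wait_for_port('localhost', 8080):
#     raise RuntimeError('Gateway did not become reachable in time')
```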
First let’s define a client:
from jina.clients import Client
client = Client(host='localhost', protocol='http', port=8080)
client.show_progress = True
Then let’s index the set of images we want to search:
Caution
Before using your Flow, ensure you have several .jpg images in the ./imgs folder.
from docarray import DocumentArray

indexing_documents = DocumentArray.from_files('./imgs/*.jpg').apply(
    lambda d: d.load_uri_to_image_tensor()
)
indexed_docs = client.post('/index', inputs=indexing_documents)
print(f'Indexed Documents: {len(indexed_docs)}')
Then let’s search for the closest image to our query image:
query_doc = indexing_documents[0]
queried_docs = client.post("/search", inputs=[query_doc])
matches = queried_docs[0].matches
print(f'Matched documents: {len(matches)}')