How to run Jina with Docker Compose

Jina is a cloud-native neural search framework, so one of the simplest ways to prototype a Flow or serve it in production is to run it with Docker Compose.

A Flow is composed of Executors, which run Python code that operates on a DocumentArray. These Executors live in different runtimes depending on how you deploy your Flow.

By default, when you serve your Flow locally, its Executors live in processes. Because Jina is cloud-native, however, your Flow can just as easily manage Executors that live in containers and are orchestrated by your favorite tools. One of the simplest is Docker Compose, which is supported out of the box.

You can export a Flow to Docker Compose with one line:

flow.to_docker_compose_yaml('docker-compose.yml')

Jina generates a docker-compose.yml configuration file that corresponds to your Flow and can be used directly with docker-compose, so you do not have to define all the services needed for the Flow by hand.

Caution

All Executors in the Flow should be used with jinahub+docker://... or docker://....
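
As a quick pre-flight check before exporting, the `uses` values of a Flow can be tested against the two prefixes named above. This is a minimal sketch: the helper name `is_containerized` and the sample `uses` values are illustrative and not part of Jina's API.

```python
# Hypothetical helper: not part of Jina, it only encodes the rule stated above.
CONTAINER_PREFIXES = ('jinahub+docker://', 'docker://')


def is_containerized(uses: str) -> bool:
    """Return True if a `uses` string points at a Docker image."""
    return uses.startswith(CONTAINER_PREFIXES)


# Sample values for illustration. The last one is a plain Hub reference
# without the `+docker` part, so it would not run in a container.
candidates = [
    'jinahub+docker://CLIPImageEncoder',
    'docker://my-org/my-executor:latest',  # hypothetical image name
    'jinahub://CLIPImageEncoder',
]
not_containerized = [u for u in candidates if not is_containerized(u)]
print(not_containerized)
```

Any entry that ends up in `not_containerized` would need to be switched to one of the two supported prefixes before generating the compose file.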

Caution

If you are using Executors that rely on Docker images built with a Jina version prior to 3.1.3, remove the health checks from the generated YAML file. Health checks are only compatible with Jina 3.1.3+; with older images, your Docker Compose services will always be reported as `unhealthy`.
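
One way to drop the health checks is to remove the `healthcheck` entry from every service once the file has been parsed. The sketch below works on a plain dictionary and assumes that the generated file stores each check under the standard Compose `healthcheck` key. Loading and saving the YAML (for example with a YAML library of your choice) is left out, and the sample data is made up for illustration.

```python
import copy


def strip_healthchecks(compose: dict) -> dict:
    """Return a copy of a parsed compose file without per-service health checks."""
    cleaned = copy.deepcopy(compose)
    for service in cleaned.get('services', {}).values():
        # Assumption: health checks live under the Compose `healthcheck` key.
        service.pop('healthcheck', None)
    return cleaned


# Made-up stand-in for a parsed docker-compose.yml.
sample = {
    'version': '3.3',
    'services': {
        'encoder': {'image': 'example/encoder', 'healthcheck': {'interval': '2s'}},
        'gateway': {'image': 'example/gateway', 'ports': ['8080:8080']},
    },
}
cleaned = strip_healthchecks(sample)
print(sorted(cleaned['services']['encoder']))
```

The function returns a copy, so the original dictionary stays available for comparison.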

Example: Indexing and searching images using CLIP image encoder and PQLiteIndexer

Deploy your Flow

Caution

First ensure Docker Compose is installed locally.

Caution

Before starting this example, make sure that the CLIPImageEncoder and PQLiteIndexer images have already been pulled to your local machine.

You can use:

jina hub pull jinahub+docker://CLIPImageEncoder
jina hub pull jinahub+docker://PQLiteIndexer

This example shows how to build and deploy a Flow with Docker Compose, using CLIPImageEncoder as an image encoder and PQLiteIndexer as an indexer to perform fast nearest neighbor retrieval on image embeddings.

from jina import Flow

f = (
    Flow(port=8080, protocol='http')
    .add(name='encoder', uses='jinahub+docker://CLIPImageEncoder', replicas=2)
    .add(
        name='indexer',
        uses='jinahub+docker://PQLiteIndexer',
        uses_with={'dim': 512},
        shards=2,
    )
)

Now we can generate the Docker Compose YAML configuration from the Flow:

f.to_docker_compose_yaml('docker-compose.yml')

Hint

You can use a custom Docker image for the Gateway service. Just set the environment variable JINA_GATEWAY_IMAGE to the desired image before generating the configuration.
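
For instance, the variable can be set from Python before the export call. This is only a sketch: the image name below is a placeholder rather than a real image, and the variable must be present in the environment of the process that generates the configuration.

```python
import os

# Placeholder image name for illustration; replace it with your own Gateway image.
os.environ['JINA_GATEWAY_IMAGE'] = 'my-registry/my-gateway:latest'

# Generate the compose file (f.to_docker_compose_yaml('docker-compose.yml'))
# only after this point, so that the variable is already set.
print(os.environ['JINA_GATEWAY_IMAGE'])
```

Exporting the variable in the shell before starting Python has the same effect.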

Let’s take a look at the generated compose file:

version: '3.3'
...
services:
  encoder-head:   # # # # # # # # # # # 
                  #                   #   
  encoder-rep-0:  #   Encoder         #
                  #                   #
  encoder-rep-1:  # # # # # # # # # # #

  indexer-head:   # # # # # # # # # # # 
                  #                   #   
  indexer-0:      #   Indexer         #
                  #                   #
  indexer-1:      # # # # # # # # # # #

  gateway: 
    ...
    ports:
    - 8080:8080

Caution

The default compose file generated by the Flow contains no special configuration or settings. You may want to adapt it to your own needs.

Here you can see that 7 services will be created:

  • 1 for the gateway, which is the entrypoint of the Flow.

  • 3 associated with the encoder: one for the Head and two for the Replicas.

  • 3 associated with the indexer: one for the Head and two for the Shards.
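
The count can be reproduced with a few lines of Python. The sketch below only mirrors the service names visible in the compose file above for this particular Flow. The helper `example_service_names` is hypothetical and does not describe how Jina names services in general.

```python
def example_service_names(deployments):
    """Build the service names seen in this example.

    `deployments` is a list of (name, kind, count) tuples, where kind is
    'replicas' or 'shards'. Only the two patterns shown above are covered.
    """
    names = []
    for name, kind, count in deployments:
        names.append(f'{name}-head')  # one Head per Executor in this example
        for i in range(count):
            # Replicas appear as '<name>-rep-<i>', Shards as '<name>-<i>'.
            suffix = f'rep-{i}' if kind == 'replicas' else str(i)
            names.append(f'{name}-{suffix}')
    names.append('gateway')  # the single entrypoint of the Flow
    return names


names = example_service_names([('encoder', 'replicas', 2), ('indexer', 'shards', 2)])
print(len(names))  # prints 7
```

Comparing this list with the output of `docker-compose ps` is one way to spot a service that failed to start.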

Now you can deploy this Flow with Docker Compose:

docker-compose -f docker-compose.yml up

Use your search engine and query your Flow

Now that your Flow is up and running with Docker Compose, you can query it. Once all the services in the Flow are ready, you can start sending index and search requests.
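
A simple way to wait for the Gateway from a script is to poll its port until it accepts TCP connections. This is a rough sketch using only the standard library. The helper name `wait_for_port` is made up, and an open port only shows that the Gateway is listening, not that every Executor behind it has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the Flow above, `wait_for_port('localhost', 8080)` returns `True` as soon as the Gateway port is open, and `False` if it is still closed after the timeout.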

First let’s define a client:

from jina.clients import Client

client = Client(host='localhost', protocol='http', port=8080)
client.show_progress = True

Then let’s index the set of images we want to search:

Caution

Before using your Flow, ensure you have several .jpg images in the ./imgs folder.
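
A quick check along these lines can confirm that the folder is populated before indexing. It is a small sketch with a made-up helper name, `list_jpgs`, and it only looks at the file extension, not at whether each file is a valid image.

```python
from pathlib import Path


def list_jpgs(folder: str = './imgs'):
    """Return the sorted .jpg files directly inside `folder` (empty list if it is missing)."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(p for p in root.glob('*.jpg') if p.is_file())


images = list_jpgs('./imgs')
print(f'Found {len(images)} .jpg file(s) in ./imgs')
```

An empty result means there is nothing to index yet.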

from docarray import DocumentArray

indexing_documents = DocumentArray.from_files('./imgs/*.jpg').apply(
    lambda d: d.load_uri_to_image_tensor()
)

indexed_docs = client.post('/index', inputs=indexing_documents)

print(f'Indexed Documents: {len(indexed_docs)}')

Then let’s search for the closest image to our query image:

query_doc = indexing_documents[0]
queried_docs = client.post('/search', inputs=[query_doc])

matches = queried_docs[0].matches
print(f'Matched documents: {len(matches)}')