How to Add New Executors

Note

This guide assumes you have a basic understanding of Jina. If you haven’t yet, please check out Jina 101 first.

Motivation

As a Jina user, you might have already noticed that Jina Hub is the open registry for hosting Jina Executors. These Executors are categorized into folders by their types, such as Encoder, Ranker, Crafter, etc.

However, when the existing Executors do not fit your specific use case, you might be curious about how to extend Jina: for example, to integrate a new deep learning model, add a new indexing algorithm, or create your own evaluation metric.

In this tutorial, we’ll guide you through the steps. First, we introduce the general steps for customizing a Jina Executor. At the end, we give a concrete example: building a text encoder and an image encoder on top of OpenAI’s recently published CLIP model.

Overview

To make an extension of a Jina Executor, please follow the steps listed below:

  1. Decide which Executor class to inherit from.

  2. Override __init__() and post_init().

  3. Override the core method of the Executor.

  4. (Optional) Implement the save logic.
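The four steps above can be sketched as a single skeleton class. Everything here is illustrative: the stand-in BaseEncoder merely mimics the hooks (post_init(), __getstate__()) that Jina’s real base classes provide, and MyEncoder is a made-up name.

```python
class BaseEncoder:                      # step 1: pick a base class (stand-in, not Jina's)
    def __init__(self, *args, **kwargs):
        pass

    def post_init(self):
        pass


class MyEncoder(BaseEncoder):
    def __init__(self, dim: int = 4, *args, **kwargs):
        super().__init__(*args, **kwargs)   # step 2: keep simple-typed attrs in __init__
        self.dim = dim

    def post_init(self):                    # step 2: build non-picklable state here
        self._model = lambda text: [float(len(text))] * self.dim

    def encode(self, data, *args, **kwargs):  # step 3: override the core method
        return [self._model(t) for t in data]

    def __getstate__(self):                 # step 4 (optional): the save logic
        state = self.__dict__.copy()
        state.pop('_model', None)           # drop what post_init() can rebuild
        return state


enc = MyEncoder(dim=2)
enc.post_init()
print(enc.encode(['hello', 'hi']))  # → [[5.0, 5.0], [2.0, 2.0]]
```

The sections below walk through each of these hooks in detail.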

Implementation

Decide which Executor class to inherit from

When adding a customized Executor, the first step is to inherit from the “correct” class based on your use case. The built-in Executor types are:

  1. Encoder: encode Documents as vector embeddings.

  2. Indexer: save and retrieve vectors and key-value pairs from storage.

  3. Crafter: transform the content of Documents.

  4. Segmenter: segment a Document into smaller Documents.

  5. Ranker: calculate scores of Documents.

  6. Classifier: enrich Documents with a classification model.

  7. Evaluator: evaluate a score based on the output and the ground truth.

  8. CompoundExecutor: combine multiple Executors into one.

As a rule of thumb, always inherit from the Executor that shares the most similar logic with yours.

Note

If your algorithm is unique and does not fit any of the categories above, you may want to submit an issue for discussion before you start.

Built-in Executors to Inherit

| Name | Base Class | Description |
| --- | --- | --- |
| BaseEncoder | BaseExecutor | Represent Documents as vector embeddings. |
| BaseNumericEncoder | BaseEncoder | Represent a numpy array object (e.g. image, video, audio) as vector embeddings. |
| BaseTextEncoder | BaseEncoder | Represent a string object as vector embeddings. |
| BaseMultimodalEncoder | BaseExecutor | Encode input from different modalities. |
| BaseIndexer | BaseExecutor | Save and retrieve vectors and key-value pairs from storage. |
| BaseVectorIndexer | BaseIndexer | Save and retrieve vectors from storage. |
| NumpyIndexer | BaseVectorIndexer | Use a numpy array for storage. |
| BaseKVIndexer | BaseIndexer | Save and retrieve key-value pairs from storage. |
| BaseCrafter | BaseExecutor | Transform the content of Documents. |
| BaseSegmenter | BaseExecutor | Segment a Document into smaller Documents. |
| BaseRanker | BaseExecutor | Calculate scores of Documents. |
| Chunk2DocRanker | BaseRanker | Translate chunk-wise scores (distances) to doc-wise scores. |
| Match2DocRanker | BaseRanker | Re-score the matches for a Document. |
| BaseClassifier | BaseExecutor | Enrich Documents with a classifier. |
| BaseEvaluator | BaseExecutor | Evaluate a score based on the output and the ground truth. |
| CompoundExecutor | BaseExecutor | Combine multiple Executors into one. |

Override __init__() and post_init()

You can put simple-typed attributes that define the behavior of your Executor into __init__(). Simple types are all picklable types, including: integers, bools, strings, and tuples, lists, and dicts of simple types. For example,

from jina.executors.crafters import BaseSegmenter

class GifPreprocessor(BaseSegmenter):
  def __init__(self, img_shape: int = 96, every_k_frame: int = 1, max_frame: int = None, from_bytes: bool = False, *args, **kwargs):
      super().__init__(*args, **kwargs)
      self.img_shape = img_shape
      self.every_k_frame = every_k_frame
      self.max_frame = max_frame
      self.from_bytes = from_bytes

Remember to add super().__init__(*args, **kwargs) to your __init__(). Only then can you enjoy many magic features, e.g. YAML support and persistence, from the base class (BaseExecutor).

Note

All attributes declared in __init__() will be persisted during save() and load().

What if the data you need to load cannot be stored in a simple type? For example, a deep learning graph, a big pre-trained model, a gRPC stub, a TensorFlow session, a thread? Then you can put them into post_init().

It is also worth overriding post_init() when there is a better persistence method than pickle. For example, your hyperparameter matrix in a numpy ndarray is certainly picklable. However, you can simply read and write it via standard file IO, which is likely more efficient than pickle. In this case, you do the data loading in post_init().
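To make this concrete, the sketch below round-trips a small weight matrix both through pickle and through numpy’s own .npy format; the matrix is a stand-in for a real hyperparameter array, and the point is only that numpy’s native file IO is a drop-in alternative to pickling.

```python
import io
import pickle

import numpy as np

# stand-in for a hyperparameter matrix held by an Executor
weights = np.arange(12, dtype='float32').reshape(3, 4)

# option 1: let pickle serialize the array
pickled = pickle.dumps(weights)

# option 2: numpy's native .npy format via standard file IO
buf = io.BytesIO()
np.save(buf, weights)
restored = np.load(io.BytesIO(buf.getvalue()))

# both options round-trip the data losslessly
assert np.array_equal(restored, weights)
assert np.array_equal(pickle.loads(pickled), weights)
```

In a real Executor, option 2 would live in post_init() (for loading) and in the save logic described later (for writing).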

Please check the example below:

from jina.executors.encoders import BaseTextEncoder

class TextPaddlehubEncoder(BaseTextEncoder):

    def __init__(self,
                 model_name: str = 'ernie_tiny',
                 max_length: int = 128,
                 *args,
                 **kwargs):
        super().__init__(*args, **kwargs)
        self.model_name = model_name
        self.max_length = max_length


    def post_init(self):
        import paddlehub as hub
        self.model = hub.Module(name=self.model_name)
        self.model.MAX_SEQ_LEN = self.max_length

Note

post_init() is also a good place to introduce package dependencies, e.g. import x or from x import y. Naively, you could put all imports upfront at the top of the file. However, that raises a ModuleNotFoundError when the package is not installed locally, and sometimes that one missing dependency can break the whole system.

As a rule of thumb, only import packages where you really need them. Often these dependencies are only required in post_init() and the core method, which we shall see later.
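One common pattern for such lazy imports is a small helper that turns a missing dependency into an actionable error message. The helper name load_backend below is hypothetical, not a Jina API:

```python
import importlib


def load_backend(name: str):
    """Import a dependency only when it is actually needed."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as e:
        raise ModuleNotFoundError(
            f'{name!r} is required by this Executor; please install it first'
        ) from e


json_mod = load_backend('json')  # a stdlib module: the import succeeds

try:
    load_backend('surely_not_installed_pkg')
except ModuleNotFoundError as e:
    print(e)  # tells the user exactly which dependency is missing
```

Calling such a helper inside post_init() keeps the module importable even on machines where the heavy dependency is absent.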

Override the core method of the base class

Each Executor has a core method, which defines its algorithmic behavior. To make your own extension, you have to override the core method. The following table lists the core methods you may want to override. Note that some Executors have multiple core methods.

| Base class | Core method(s) |
| --- | --- |
| BaseEncoder | encode() |
| BaseCrafter | craft() |
| BaseSegmenter | segment() |
| BaseIndexer | add(), query() |
| BaseRanker | score() |
| BaseClassifier | predict() |
| BaseEvaluator | evaluate() |

Feel free to override other methods/properties as you need, but most extensions can be done by simply overriding the core methods listed above.

Implement the persistence logic

If you don’t override post_init(), then you don’t need to implement persistence logic: you get YAML and persistence support off the shelf thanks to BaseExecutor. Simple crafters and rankers fall into this category.

If you override post_init() but don’t care about persisting its state for the next run (when the Executor process is restarted), or the state simply does not change during the run, then you don’t need to implement persistence logic either. Loading a fixed pre-trained deep learning model falls into this category.

Persistence logic is only required when you implement customized loading logic in post_init() and the state changes during the run. In that case, you need to override __getstate__(). Many of the indexers fall into this category.

In the example below, the tokenizer is loaded in post_init() and saved in __getstate__(), which completes the persistence cycle.

class CustomizedEncoder(BaseEncoder):

    def post_init(self):
        self.tokenizer = tokenizer_dict[self.model_name].from_pretrained(self._tmp_model_path)
        self.tokenizer.padding_side = 'right'

    def __getstate__(self):
        self.tokenizer.save_pretrained(self.model_abspath)
        return super().__getstate__()
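The same cycle can be exercised end to end with a toy, self-contained indexer; TinyIndexer and its methods are made-up illustrations, not Jina classes. Mutable run-time state is flushed to disk in __getstate__(), excluded from the pickle, and rebuilt in post_init():

```python
import os
import pickle
import tempfile


class TinyIndexer:
    """Toy illustration of the persistence cycle (not a Jina class)."""

    def __init__(self, index_path: str):
        self.index_path = index_path  # simple-typed attribute: pickled as-is
        self.post_init()

    def post_init(self):
        # heavy/mutable state is rebuilt from disk instead of being unpickled
        self._store = {}
        if os.path.exists(self.index_path):
            with open(self.index_path) as f:
                self._store = dict(line.split('\t', 1) for line in f.read().splitlines())

    def add(self, key: str, value: str):
        self._store[key] = value

    def __getstate__(self):
        # flush the run-time state to disk, then exclude it from the pickle
        with open(self.index_path, 'w') as f:
            f.writelines(f'{k}\t{v}\n' for k, v in self._store.items())
        state = self.__dict__.copy()
        state.pop('_store')
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.post_init()  # rebuild the state from disk after unpickling


path = os.path.join(tempfile.mkdtemp(), 'tiny.idx')
idx = TinyIndexer(path)
idx.add('doc1', 'hello world')
restored = pickle.loads(pickle.dumps(idx))  # full save + load cycle
print(restored._store)  # → {'doc1': 'hello world'}
```

Only the simple-typed index_path travels through pickle; the key-value store itself is persisted through plain file IO, mirroring the tokenizer example above.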

How Can I Use My Extension

You can use the extension by specifying py_modules in the YAML file. For example, your extension Python file is called my_encoder.py, which describes MyEncoder. Then you can define a YAML file (say my.yml) as follows:

!MyEncoder
with:
  greetings: hello im external encoder
metas:
  py_modules: my_encoder.py

Note

You can also assign a list of files to metas.py_modules if your Python logic is split over multiple files. The YAML file and all Python extension files should be put in the same directory.

Then simply use it in the Jina CLI by specifying jina pod --uses=my.yml, or Flow().add(uses='my.yml') in the Flow API.

Warning

If you use a customized Executor inside a jina.executors.CompoundExecutor, you only need to set metas.py_modules at the root level, not at the sub-component level.

Customize Executor in Action: CLIP Encoder

CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. It can be instructed in natural language to predict the most relevant text snippet given an image.

The pre-trained CLIP model is able to transform both images and text into the same latent space, where image and text embeddings can be compared using a similarity measure. We will use CLIP as an example of how to create Encoders powered by the CLIP model for text-to-image search. You can refer to our cross-modal search example for the complete code.
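As a quick illustration of “comparing in the same latent space”, cosine similarity is a typical choice of similarity measure; the vectors below are tiny toy stand-ins for real CLIP embeddings.

```python
import numpy as np


def cosine_sim(a, b) -> float:
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# toy stand-ins for CLIP embeddings of one caption and two images
text_emb = [0.1, 0.9, 0.0]
cat_img = [0.2, 0.8, 0.1]   # similar direction → high similarity
car_img = [0.9, 0.0, 0.4]   # different direction → low similarity

print(cosine_sim(text_emb, cat_img) > cosine_sim(text_emb, car_img))  # → True
```

In a text-to-image search Flow, the text query and the indexed images are embedded by the two encoders and then ranked by exactly this kind of score.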

Since CLIP maps images and text into a common latent space, its objective is to represent Documents as vector embeddings, so we need to inherit from the BaseEncoder class. To encode a piece of text using CLIP, we create a CLIPTextEncoder that inherits from BaseTextEncoder. To encode an image using CLIP, we create a CLIPImageEncoder that inherits from BaseNumericEncoder.

The next step is to override __init__() and post_init(). In __init__(), we specify a new parameter called model_name, since CLIP ships two families of pre-trained models, i.e. ResNet-50 and ViT-B/32. As mentioned before, it is good practice to load the pre-trained model inside post_init(). Now we have an Encoder like this:

class CLIPTextEncoder(BaseTextEncoder):
    """Encode text into vector embeddings powered by OpenAI's CLIP model."""

    def __init__(
        self,
        model_name: str = 'ViT-B/32',
        *args, **kwargs
    ):
        super().__init__(*args, **kwargs)
        self.model_name = model_name

    def post_init(self):
        """Load the pre-trained CLIP model."""
        import clip
        import torch
        # pick the device here; heavy, non-picklable state belongs in post_init()
        self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
        model, _ = clip.load(self.model_name, self.device)
        self.model = model

    # the rest of the code

Finally, we need to override the core method of the Executor. Since it is an Encoder, we need to override encode().

class CLIPTextEncoder(BaseTextEncoder):
    """Encode text into vector embeddings powered by OpenAI's CLIP model."""

    def __init__(
        self,
        model_name: str = 'ViT-B/32',
        *args, **kwargs
    ):
        super().__init__(*args, **kwargs)
        self.model_name = model_name

    def post_init(self):
        """Load the pre-trained CLIP model."""
        import clip
        import torch
        # pick the device here; heavy, non-picklable state belongs in post_init()
        self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
        model, _ = clip.load(self.model_name, self.device)
        self.model = model

    def encode(self, data: 'np.ndarray', *args, **kwargs) -> 'np.ndarray':
        # import dependencies where they are needed, per the rule of thumb above
        import clip
        import torch
        tensor = clip.tokenize(data).to(self.device)
        with torch.no_grad():
            encoded_data = self.model.encode_text(tensor)
        return encoded_data.cpu().numpy()
In the code sample above, we call CLIP’s encode_text() to encode the input data into vector embeddings with the pre-trained CLIP model.

Note

The example above is a minimal working example of a CLIPTextEncoder. For full features such as GPU support, batching, and dockerization, please check out Jina Hub.

The same applies to CLIPImageEncoder; the only difference is to use self.model.encode_image() in encode(). Last but not least, create the YAML configuration for the encoder and use it with the Jina CLI or Flow API.

!CLIPTextEncoder
metas:
  py_modules:
    - __init__.py

Then use it in the Jina CLI by specifying jina pod --uses=config.yml, or Flow().add(uses='config.yml') in the Flow API. Now you have a good foundation to build your index/query Flows powered by CLIP.

Share Your Work!

If you would like to share your customized Executor with the community, you are more than welcome to! We use cookiecutter to create Jina Executors from a template.

Note

Install Docker and run pip install "jina[devel]" before you start.

To make sure your work is in good shape, Jina provides a wizard to help you create an Executor. Start it with jina hub new --type pod. It will generate a standard Executor project like this:

CLIPTextEncoder/
├── Dockerfile
├── manifest.yml
├── README.md
├── config.yml
├── requirements.txt
├── __init__.py
└── tests/
    ├── test_CLIPTextEncoder.py
    └── __init__.py

You can put your customized Encoder, such as CLIPTextEncoder, inside __init__.py. The YAML configuration should be placed in config.yml.

To ensure your customized Executor, such as CLIPTextEncoder, performs exactly the same as the original CLIP model, please add tests inside the tests/ folder. For example, encode some text data with the raw CLIP model and assert that CLIPTextEncoder gives the same result.
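A minimal shape for such a parity test looks like the sketch below; a tiny stand-in model replaces the real CLIP model (which is heavy to load), and FakeTextEncoder is a made-up name standing in for your Executor.

```python
import numpy as np


def raw_model(texts):
    """Stand-in for the raw CLIP text encoder (real tests would call CLIP)."""
    return np.array([[len(t), t.count(' ')] for t in texts], dtype=float)


class FakeTextEncoder:
    """Stand-in for CLIPTextEncoder; wraps the same underlying model."""

    def encode(self, data, *args, **kwargs):
        return raw_model(data)


def test_encoder_matches_raw_model():
    texts = ['a cat', 'a very fast car']
    expected = raw_model(texts)        # reference output from the raw model
    result = FakeTextEncoder().encode(texts)
    np.testing.assert_allclose(result, expected)


test_encoder_matches_raw_model()
print('parity test passed')
```

In the real test you would load the pre-trained CLIP model directly, encode the same texts through both paths, and compare the embeddings.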

Please build and test your Encoder locally with:

jina hub build -t jinahub/type.kind.jina-image-name:image_version-jina_version <your_folder>

Once tested, log in to Jina Hub with jina hub login and copy/paste the token into GitHub to verify your account. Now you are able to push your work to Jina Hub:

jina hub push jinahub/type.kind.jina-image-name:image-jina_version

In our example, the type is pod, the kind is encoders, and jina-image-name is cliptextencoder or clipimageencoder.

What’s next

Thanks for your time and effort while reading this guide!

Please check out Jina Hub to explore the Executors. If you still have questions, feel free to submit an issue or post a message in our community Slack channel.

To gain a deeper understanding of the implementation of Jina Executors, you can find the source code here.