Jina YAML Syntax Reference

Jina configurations use YAML syntax, and must have either a .yml or .yaml file extension. If you’re new to YAML and want to learn more, see Learn YAML in five minutes.

Executor YAML Syntax

All executors defined in jina.executors can be loaded from a YAML config via jina.executors.BaseExecutor.load_config() or via the CLI jina pod --exec_yaml-path.

The executor YAML config follows the syntax below.

!BasePbIndexer
with:
  index_filename: doc.gzip
metas:  # <- metas defined in jina.executors.metas
  name: doc_indexer  # a customized name
  workspace: $TEST_WORKDIR

!SomeExecutorClass

The class of the executor. It can be any class inherited from jina.executors.BaseExecutor. Note that it must start with ! to tell the YAML parser that the section below describes this class.

with

A map of the arguments accepted by the __init__() function of this executor. Environment variables can be used in the values and are expanded when the config is loaded.

metas

A map of meta arguments defined in jina.executors.metas, such as name and workspace.

If an executor has no __init__(), or its __init__() requires no arguments, then with can be omitted entirely.
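How the class tag, with and metas relate to each other can be sketched in plain Python. This is a minimal illustration and not Jina's loader: REGISTRY, build_executor and the AwesomeExecutor class below are hypothetical names, and the config is given as an already-parsed dict instead of YAML text.

```python
# Hypothetical sketch: how a class tag, `with` and `metas` could map onto an object.
# None of these names come from Jina; they only illustrate the config layout.

class AwesomeExecutor:
    def __init__(self, index_filename=None):
        self.index_filename = index_filename


# maps the name written after '!' to a Python class
REGISTRY = {'AwesomeExecutor': AwesomeExecutor}


def build_executor(config):
    """Build an object from {'cls': ..., 'with': {...}, 'metas': {...}}."""
    cls = REGISTRY[config['cls']]
    # `with` supplies keyword arguments for __init__(); it may be omitted
    obj = cls(**config.get('with', {}))
    # `metas` supplies meta attributes such as name and workspace
    for key, value in config.get('metas', {}).items():
        setattr(obj, key, value)
    return obj


executor = build_executor({
    'cls': 'AwesomeExecutor',
    'with': {'index_filename': 'doc.gzip'},
    'metas': {'name': 'doc_indexer'},
})
minimal = build_executor({'cls': 'AwesomeExecutor'})  # neither with nor metas
```

The second call mirrors the minimum case: when __init__() needs no arguments, the class tag alone is enough.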

In the minimum case, if you don’t want to specify any with and metas, you can simply write:

# encoder.yml
!AwesomeExecutor

Or skip the YAML file altogether and simply write:

from jina.executors import BaseExecutor

a = BaseExecutor.load_config('AwesomeExecutor')

CompoundExecutor YAML Syntax

A compound executor is a set of executors bundled together, as defined in jina.executors.compound. It follows the syntax above with an additional feature: routing.

!CompoundExecutor
components:
- !NumpyIndexer
  with:
    num_dim: -1
    index_key: HNSW32
    index_filename: vec.idx
  metas:
    name: my_vec_indexer
- !BasePbIndexer
  with:
    index_filename: chunk.gzip
  metas:
    name: chunk_meta_indexer
with:
  routes:
    meta_add:
      chunk_meta_indexer: add
    meta_query:
      chunk_meta_indexer: query
    query:
      my_vec_indexer: query
    add:
      my_vec_indexer: add
metas:
  name: chunk_compound_indexer
  workspace: $TEST_WORKDIR

components

A list of executors, each written in the executor YAML syntax above. Note that metas.name must be specified if you want to refer to this executor later in with.routes.

with.routes

A:
    B: C

It defines a function mapping: a new function A() is created on the compound executor and points to B.C(). Note that B must be a valid name defined in the metas.name of one of the components.
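The mapping can be pictured with a short Python sketch. It is only an illustration of the idea under simplifying assumptions, not how jina.executors.compound is implemented: VecIndexer, MetaIndexer and CompoundSketch are hypothetical stand-ins, and each component carries its name as a plain attribute.

```python
# Hypothetical sketch of `routes`: A -> {B: C} makes compound.A() call B.C().

class VecIndexer:
    """Stand-in component; only the name and the two methods matter here."""
    name = 'my_vec_indexer'

    def add(self, item):
        return f'vec add: {item}'

    def query(self, item):
        return f'vec query: {item}'


class MetaIndexer:
    name = 'chunk_meta_indexer'

    def add(self, item):
        return f'meta add: {item}'


class CompoundSketch:
    def __init__(self, components, routes):
        by_name = {c.name: c for c in components}
        for new_func, target in routes.items():
            # every route holds a single {component_name: function_name} pair
            comp_name, func_name = next(iter(target.items()))
            if comp_name not in by_name:
                raise ValueError(f'{comp_name!r} is not a component name')
            # A() on the compound now points to B.C()
            setattr(self, new_func, getattr(by_name[comp_name], func_name))


compound = CompoundSketch(
    components=[VecIndexer(), MetaIndexer()],
    routes={
        'add': {'my_vec_indexer': 'add'},
        'query': {'my_vec_indexer': 'query'},
        'meta_add': {'chunk_meta_indexer': 'add'},
    },
)
```

Calling compound.meta_add(...) here forwards to the add() method of the component named chunk_meta_indexer, which is why a route that names an unknown component is rejected up front.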

Referencing Variables in Executor and CompoundExecutor YAML

In the YAML config, one can reference environment variables with $ENV, and reference variables defined inside the YAML itself with {path.variable}. For example,

components:
  - with:
      index_filename: metaproto
    metas:
      name: test_meta
      good_var:
        - 1
        - 2
      bad_var: '{root.metas.name}'  # expand to the string 'real-compound'
  - with:
      index_filename: npidx
    metas:
      name: test_numpy
      bad_var: '{root.components[0].metas.good_var[1]}'  # expand to the integer 2
      float_var: '{root.float.val}'  # expand to the float 0.232
      mixed: '{root.float.val}-{root.components[0].metas.good_var[1]}-{root.metas.name}'  # expand to the string '0.232-2-real-compound'
      mixed_env: '{root.float.val}-$ENV1'  # expand to the string '0.232-a' when ENV1=a
      name_shortcut: '{this.name}'  # expand to the string 'test_numpy'
metas:
  name: real-compound
rootvar: 123
float:
  val: 0.232

root.var

Refers to a variable defined at the top level (the root) of the YAML.

this.var

Refers to a variable defined at the same level as the referencing entry.

Note

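One must quote the string when using referenced values, i.e. write '{root.metas.name}' and not {root.metas.name}. Unquoted, the curly braces are parsed by YAML as an inline (flow) mapping instead of a string.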
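The reference rules can be sketched with the standard library alone. The code below is an illustrative approximation, not Jina's resolver: lookup and expand are hypothetical helpers, "this" is assumed to mean the map that contains the referencing entry, and the rule that a value consisting of a single reference keeps the referenced type is inferred from the comments in the example. Environment variables are left out.

```python
import re

# one reference such as {root.metas.name} or {this.name}
_REF = re.compile(r'\{(root|this)\.([^{}]+)\}')
# path tokens: a key name, or a list index written as [n]
_TOKEN = re.compile(r'([^.\[\]]+)|\[(\d+)\]')


def lookup(node, path):
    """Follow a path such as 'components[0].metas.good_var[1]' through dicts and lists."""
    for key, index in _TOKEN.findall(path):
        node = node[int(index)] if index else node[key]
    return node


def expand(value, root, this):
    """Expand references in one value; non-string values are returned unchanged."""
    if not isinstance(value, str):
        return value
    scopes = {'root': root, 'this': this}
    whole = _REF.fullmatch(value)
    if whole:
        # assumption: a value that is exactly one reference keeps the referenced type
        return lookup(scopes[whole.group(1)], whole.group(2))
    # otherwise every reference is substituted as text
    return _REF.sub(lambda m: str(lookup(scopes[m.group(1)], m.group(2))), value)


config = {
    'components': [
        {'metas': {'name': 'test_meta', 'good_var': [1, 2]}},
        {'metas': {'name': 'test_numpy'}},
    ],
    'metas': {'name': 'real-compound'},
    'float': {'val': 0.232},
}
this = config['components'][1]['metas']  # the map holding the referencing entries

float_var = expand('{root.float.val}', config, this)
mixed = expand('{root.float.val}-{root.components[0].metas.good_var[1]}-{root.metas.name}',
               config, this)
name_shortcut = expand('{this.name}', config, this)
```

The values passed to expand() are already strings, which corresponds to the quoting requirement: an unquoted {root.metas.name} would never reach the resolver as text.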

Driver YAML Syntax

jina.drivers.Driver connects jina.peapods.pea.BasePod and jina.executors. A driver map is a collection of driver groups, each of which can be referred to by the BasePod via the CLI (jina pod --driver_yaml-path --driver-group).

# this YAML file is a "Driver Map"
drivers:
  encode:  # <== this is a "Driver Group"
    handlers:
      /:
        - handler_encode_doc: encode   # this is a "Driver" attached to an Executor function

  segment:
    handlers:
      /:
        - handler_segment: craft

  index-chunk-and-meta:
    handlers:
      QueryRequest:
        - handler_chunk_search: query
        - handler_meta_search_chunk: meta_query
      IndexRequest:
        - handler_chunk_index: add
        - handler_prune_chunk
        - handler_meta_index_chunk: meta_add

drivers

A map from driver group names to their handlers. A group name can be referred to in jina pod --driver-group.

handlers

A map from request types to a list of handlers:

request_type:
    - handler: executor_func

request_type

Possible values are QueryRequest, IndexRequest, TrainRequest and / representing all requests.

handler

Any handler function defined in jina.drivers.handlers.

(optional) executor_func

If the handler is paired with a certain executor function, its name goes here. A handler that needs no executor function is written as a bare list item.
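The two forms a handler entry can take (a bare name, or a name paired with an executor function) can be normalized as in the sketch below. It works on an already-parsed driver group and is not taken from jina.drivers: normalize_handlers and handlers_for are hypothetical helpers, and treating / as a fallback when a request type has no entry of its own is an assumption about how "all requests" is resolved.

```python
def normalize_handlers(entries):
    """Turn parsed YAML list entries into (handler, executor_func) pairs."""
    pairs = []
    for entry in entries:
        if isinstance(entry, str):
            # bare item, e.g. `- handler_prune_chunk`: no executor function
            pairs.append((entry, None))
        else:
            # mapping item, e.g. `- handler_chunk_index: add`
            for handler, executor_func in entry.items():
                pairs.append((handler, executor_func))
    return pairs


def handlers_for(group, request_type):
    """Return the handler pairs of a driver group for one request type."""
    handlers = group['handlers']
    # assumption: '/' acts as a catch-all when the type has no entry of its own
    entries = handlers.get(request_type, handlers.get('/', []))
    return normalize_handlers(entries)


encode_group = {'handlers': {'/': [{'handler_encode_doc': 'encode'}]}}
index_group = {
    'handlers': {
        'QueryRequest': [
            {'handler_chunk_search': 'query'},
            {'handler_meta_search_chunk': 'meta_query'},
        ],
        'IndexRequest': [
            {'handler_chunk_index': 'add'},
            'handler_prune_chunk',
            {'handler_meta_index_chunk': 'meta_add'},
        ],
    }
}

index_pairs = handlers_for(index_group, 'IndexRequest')
encode_pairs = handlers_for(encode_group, 'QueryRequest')
```

Keeping the pairs in list order matters, since the handlers of a request type are listed as a sequence in the YAML.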

Flow YAML Syntax

jina.flow.Flow can be loaded from a YAML config file. It follows the syntax shown in the example below:

!Flow
with:
  sse_logger: true
pods:
  chunk_seg:
    driver_group: segment
    replicas: 3
  encode1:
    driver_group: index-meta-doc
    replicas: 2
    needs: chunk_seg
  encode2:
    driver_group: index-meta-doc
    replicas: 2
    needs: chunk_seg
  join_all:
    needs: [encode1, encode2]

A valid Flow specification starts with !Flow as the first line.

with

A map of the arguments accepted by the jina.flow.Flow.__init__() function.

pods

A map of the jina.peapods.pod.BasePod objects contained in the flow. The key is the name of the pod and the value is a map of arguments accepted by jina pod. The needs argument refers to one or more other pods by name.
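Since needs can hold either a single pod name or a list of names, a reader of the pods map has to handle both. The sketch below shows one way to do so on an already-parsed pods map; dependency_map is a hypothetical helper and not part of jina.flow, and it only records needs that are written out explicitly.

```python
def dependency_map(pods):
    """Return {pod_name: [names it needs]} for explicitly declared needs."""
    deps = {}
    for name, args in pods.items():
        needs = (args or {}).get('needs', [])
        if isinstance(needs, str):
            # a single name, e.g. `needs: chunk_seg`
            needs = [needs]
        unknown = [n for n in needs if n not in pods]
        if unknown:
            raise ValueError(f'pod {name!r} needs unknown pod(s): {unknown}')
        deps[name] = list(needs)
    return deps


pods = {
    'chunk_seg': {'driver_group': 'segment', 'replicas': 3},
    'encode1': {'driver_group': 'index-meta-doc', 'replicas': 2, 'needs': 'chunk_seg'},
    'encode2': {'driver_group': 'index-meta-doc', 'replicas': 2, 'needs': 'chunk_seg'},
    'join_all': {'needs': ['encode1', 'encode2']},
}
deps = dependency_map(pods)
```

Checking that every name in needs is also a key of pods catches typos before any pod is started.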

The flows given by the following Python code and the YAML config are identical.

f = (Flow(driver_yaml_path='my-driver.yml')
     .add(name='chunk_seg', driver_group='segment',
          exec_yaml_path='preprocess/gif2chunk.yml',
          replicas=3)
     .add(name='doc_idx', driver_group='index-meta-doc',
          exec_yaml_path='index/doc.yml')
     .add(name='tf_encode', driver_group='encode',
          exec_yaml_path='encode/encode.yml',
          replicas=3, needs='chunk_seg')
     .add(name='chunk_idx', driver_group='index-chunk-and-meta',
          exec_yaml_path='index/npvec.yml')
     .join(['doc_idx', 'chunk_idx'])
     )

!Flow  # my-flow.yml
with:
  driver_yaml_path: my-driver.yml
pods:
  chunk_seg:
    driver_group: segment
    exec_yaml_path: preprocess/gif2chunk.yml
    replicas: 3
  doc_idx:
    driver_group: index-meta-doc
    exec_yaml_path: index/doc.yml
  tf_encode:
    driver_group: encode
    exec_yaml_path: encode/encode.yml
    needs: chunk_seg
    replicas: 3
  chunk_idx:
    driver_group: index-chunk-and-meta
    exec_yaml_path: index/npvec.yml
  join_all:
    driver_group: merge
    needs: [doc_idx, chunk_idx]

from jina.flow import Flow
g = Flow.load_config('my-flow.yml')

assert f == g  # passes, because the two flows are equal

Note that you can replace the value of replicas with an environment variable such as $REPLICAS in the YAML, and it will be expanded during load_config().
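The effect of such an expansion can be shown with the standard library. This is only a sketch of the idea using os.path.expandvars on the raw YAML text; how load_config() performs the expansion internally is not specified here, and the REPLICAS value is set in the snippet purely for demonstration.

```python
import os

# set only for this demonstration; normally the shell provides the variable
os.environ['REPLICAS'] = '3'

yaml_text = (
    'pods:\n'
    '  chunk_seg:\n'
    '    driver_group: segment\n'
    '    replicas: $REPLICAS\n'
)

# replaces $REPLICAS with the value from the environment
expanded = os.path.expandvars(yaml_text)
```

After the substitution the line reads replicas: 3, which a YAML parser interprets as an integer.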