YAML specification#
To generate a YAML configuration from a Flow Python object, use save_config().
YAML completion in IDE#
We provide a JSON Schema for your IDE to enable code completion, syntax validation, member listing, and help text display.
PyCharm users#
Click menu Preferences -> JSON Schema mappings;
Add a new schema: in Schema File or URL, write https://api.jina.ai/schemas/latest.json; select JSON Schema Version 7;
Add a file path pattern and link it to *.jaml or *.jina.yml, or any suffix you commonly use for Jina Flow’s YAML.
VSCode users#
Install the extension: YAML Language Support by Red Hat;
In IDE-level settings.json add:
"yaml.schemas": {
    "https://api.jina.ai/schemas/latest.json": ["/*.jina.yml", "/*.jaml"]
}
You can bind the schema to any file suffix you commonly use for Jina Flow’s YAML.
Example YAML#
jtype: Flow
version: '1'
with:
  protocol: http
executors:
# inline Executor YAML
- name: firstexec
  uses:
    jtype: MyExec
    py_modules:
      - executor.py
# reference to Executor YAML
- name: secondexec
  uses: indexer.yml
  workspace: /home/my/workspace
# reference to Executor Python class
- name: thirdexec
  uses: CustomExec  # located in executor.py
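The three forms of `uses` in the example can be told apart by the shape of the value. The following is a minimal sketch, not Jina's own resolution logic: `classify_uses` is a hypothetical helper, and the file-extension check is an assumption made for illustration. Other forms of `uses` that Jina may accept are not covered.

```python
def classify_uses(uses):
    """Label a `uses` value by its shape (illustrative helper, not part of Jina)."""
    if isinstance(uses, dict):
        # A nested mapping such as {jtype: MyExec, py_modules: [...]}
        return "inline Executor YAML"
    if isinstance(uses, str) and uses.endswith((".yml", ".yaml")):
        # Assumption: a string ending in a YAML extension points to a file
        return "reference to Executor YAML"
    # Anything else is treated here as a class name
    return "reference to Executor Python class"


# The `executors` list from the example, written as plain Python data.
executors = [
    {"name": "firstexec", "uses": {"jtype": "MyExec", "py_modules": ["executor.py"]}},
    {"name": "secondexec", "uses": "indexer.yml", "workspace": "/home/my/workspace"},
    {"name": "thirdexec", "uses": "CustomExec"},
]

kinds = {item["name"]: classify_uses(item["uses"]) for item in executors}
for name, kind in kinds.items():
    print(f"{name}: {kind}")
```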
Fields#
jtype#
String that is always set to “Flow”, indicating the corresponding Python class.
version#
String indicating the version of the Flow.
with#
Keyword arguments are passed to a Flow’s __init__() method. You can set Flow-specific arguments and Gateway-specific arguments here:
Flow arguments#
| Name | Description | Type | Default |
|---|---|---|---|
| | The name of this object. | | |
| | The working directory for any IO operations in this object. If not set, then derive from its parent | | |
| | The YAML config of the logger used in this object. | | |
| | If set, then no log will be emitted from this object. | | |
| | If set, then exception stack information will not be added to the log | | |
| | The YAML path representing a Flow. It can be either a local file path or a URL. | | |
| | If set, auto-reloading on file changes is enabled: the Flow will restart while blocked if the YAML configuration source is changed. This also applies to underlying Executors, if their source code or YAML configuration has changed. | | |
| | The map of environment variables that are available inside the runtime | | |
| | The strategy for handling inspect deployments in the Flow. | | |
Gateway arguments#
| Name | Description | Type | Default |
|---|---|---|---|
| | The name of this object. | | |
| | The working directory for any IO operations in this object. If not set, then derive from its parent | | |
| | The YAML config of the logger used in this object. | | |
| | If set, then no log will be emitted from this object. | | |
| | If set, then exception stack information will not be added to the log | | |
| | The timeout in milliseconds of the control request, -1 for waiting forever | | |
| | The entrypoint command overrides the ENTRYPOINT in the Docker image. When not set, the Docker image ENTRYPOINT takes effect. | | |
| | Dictionary of kwargs arguments that will be passed to Docker SDK when starting the docker ‘ | | |
| | Number of requests fetched from the client before feeding into the first Executor. | | |
| | The title of this HTTP server. It will be used in automatic docs such as Swagger UI. | | |
| | The description of this HTTP server. It will be used in automatic docs such as Swagger UI. | | |
| | If set, a CORS middleware is added to FastAPI frontend to allow cross-origin access. | | |
| | If set, | | |
| | If set, | | |
| | A JSON string that represents a map from executor endpoints ( | | |
| | Dictionary of kwargs arguments that will be passed to Uvicorn server when starting the server | | |
| | The path to the certificate file | | |
| | The path to the key file | | |
| | If set, /graphql endpoint is added to HTTP interface. | | |
| | Communication protocol of the server exposed by the Gateway. This can be a single value or a list of protocols, depending on your chosen Gateway. Choose the convenient protocols from: [‘GRPC’, ‘HTTP’, ‘WEBSOCKET’]. | | |
| | The host address of the runtime, by default it is 0.0.0.0. | | |
| | If set, respect the http_proxy and https_proxy environment variables. Otherwise, these proxy variables are unset before start. gRPC seems to prefer no proxy | | |
| | The config of the gateway, it could be one of the following: | | |
| | Dictionary of keyword arguments that will override the | | |
| | The customized Python modules that need to be imported before loading the gateway | | |
| | Dictionary of kwargs arguments that will be passed to the grpc server as options when starting the server, for example: {‘grpc.max_send_message_length’: -1} | | |
| | Routing graph for the gateway | | |
| | Dictionary stating which filtering conditions each Executor in the graph requires to receive Documents. | | |
| | JSON dictionary with the input addresses of each Deployment | | |
| | JSON dictionary with the request metadata for each Deployment | | |
| | JSON list disabling the built-in merging mechanism for each Deployment listed | | |
| | The compression mechanism used when sending requests from the Head to the WorkerRuntimes. For more details, check https://grpc.github.io/grpc/python/grpc.html#compression. | | |
| | The timeout in milliseconds used when sending data requests to Executors, -1 means no timeout, disabled by default | | |
| | The runtime class to run inside the Pod | | |
| | The timeout in milliseconds that a Pod waits for the runtime to be ready, -1 for waiting forever | | |
| | The map of environment variables that are available inside the runtime | | |
| | If set, the current Pod/Deployment cannot be further chained, and the next | | |
| | If set, the Gateway will restart while serving if the YAML configuration source is changed. | | |
| | The port for input data to bind the gateway server to. By default, random ports in the range [49152, 65535] are assigned. The port argument can be either a single value when only one protocol is used, or multiple values when several protocols are used. | | |
| | If set, spawn an HTTP server with a Prometheus endpoint to expose metrics | | |
| | The port on which the Prometheus server is exposed, default is a random port between [49152, 65535] | | |
| | Number of retries per gRPC call. If <0 it defaults to max(3, num_replicas) | | |
| | If set, the SDK implementation of the OpenTelemetry tracer will be available and will be enabled for automatic tracing of requests and custom span creation. Otherwise a no-op implementation will be provided. | | |
| | If tracing is enabled, this hostname will be used to configure the trace exporter agent. | | |
| | If tracing is enabled, this port will be used to configure the trace exporter agent. | | |
| | If set, the SDK implementation of the OpenTelemetry metrics will be available for default monitoring and custom measurements. Otherwise a no-op implementation will be provided. | | |
| | If tracing is enabled, this hostname will be used to configure the metrics exporter agent. | | |
| | If tracing is enabled, this port will be used to configure the metrics exporter agent. | | |
executors#
Collection of Executors used in the Flow.
Each item in the collection corresponds to one add() call and specifies one Executor.
All keyword arguments passed to the Flow add() method can be used here.
| Name | Description | Type | Default |
|---|---|---|---|
| | The name of this object. | | |
| | The working directory for any IO operations in this object. If not set, then derive from its parent | | |
| | The YAML config of the logger used in this object. | | |
| | If set, then no log will be emitted from this object. | | |
| | If set, then exception stack information will not be added to the log | | |
| | The timeout in milliseconds of the control request, -1 for waiting forever | | |
| | The polling strategy of the Deployment and its endpoints (when | | |
| | The number of shards in the deployment running at the same time. For more details check https://docs.jina.ai/concepts/flow/create-flow/#complex-flow-topologies | | |
| | The number of replicas in the deployment | | |
| | If set, only native Executors are allowed, and the Executor is always run inside WorkerRuntime. | | |
| | The config of the executor, it could be one of the following: | | |
| | Dictionary of keyword arguments that will override the | | |
| | Dictionary of keyword arguments that will override the | | |
| | Dictionary of keyword arguments that will override the | | |
| | Dictionary of keyword arguments that will override the | | |
| | The customized Python modules that need to be imported before loading the executor | | |
| | The type of array | | |
| | List of exceptions that will cause the Executor to shut down. | | |
| | Disable the built-in reduction mechanism. Set this if the reduction is to be handled by the Executor itself by operating on a | | |
| | Dictionary of kwargs arguments that will be passed to the grpc server as options when starting the server, for example: {‘grpc.max_send_message_length’: -1} | | |
| | The entrypoint command overrides the ENTRYPOINT in the Docker image. When not set, the Docker image ENTRYPOINT takes effect. | | |
| | Dictionary of kwargs arguments that will be passed to Docker SDK when starting the docker ‘ | | |
| | The path on the host to be mounted inside the container. | | |
| | This argument allows dockerized Jina Executors to discover local GPU devices. | | |
| | Do not automatically mount a volume for dockerized Executors. | | |
| | The host of the Gateway, which the client should connect to, by default it is 0.0.0.0. In the case of an external Executor ( | | |
| | The runtime class to run inside the Pod | | |
| | The timeout in milliseconds that a Pod waits for the runtime to be ready, -1 for waiting forever | | |
| | The map of environment variables that are available inside the runtime | | |
| | If set, the current Pod/Deployment cannot be further chained, and the next | | |
| | If set, the Executor will restart while serving if the YAML configuration source or Executor modules are changed. If the YAML configuration is changed, the whole deployment is reloaded and new processes will be restarted. If only Python modules of the Executor have changed, they will be reloaded into the interpreter without restarting the process. | | |
| | If set, try to install | | |
| | The port for input data to bind to, default is a random port between [49152, 65535]. In the case of an external Executor ( | | |
| | If set, spawn an HTTP server with a Prometheus endpoint to expose metrics | | |
| | The port on which the Prometheus server is exposed, default is a random port between [49152, 65535] | | |
| | Number of retries per gRPC call. If <0 it defaults to max(3, num_replicas) | | |
| | If set, the SDK implementation of the OpenTelemetry tracer will be available and will be enabled for automatic tracing of requests and custom span creation. Otherwise a no-op implementation will be provided. | | |
| | If tracing is enabled, this hostname will be used to configure the trace exporter agent. | | |
| | If tracing is enabled, this port will be used to configure the trace exporter agent. | | |
| | If set, the SDK implementation of the OpenTelemetry metrics will be available for default monitoring and custom measurements. Otherwise a no-op implementation will be provided. | | |
| | If tracing is enabled, this hostname will be used to configure the metrics exporter agent. | | |
| | If tracing is enabled, this port will be used to configure the metrics exporter agent. | | |
| | If set, always pull the latest Hub Executor bundle even if it exists locally | | |
| | The compression mechanism used when sending requests from the Head to the WorkerRuntimes. For more details, check https://grpc.github.io/grpc/python/grpc.html#compression. | | |
| | The address of the uses-before runtime | | |
| | The address of the uses-after runtime | | |
| | JSON dictionary with a list of connections to configure | | |
| | The timeout in milliseconds used when sending data requests to Executors, -1 means no timeout, disabled by default | | |
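The mapping from YAML fields to Python calls can be sketched as follows: the `with` field supplies keyword arguments to `__init__()`, and each item under `executors` becomes one `add()` call. This is only an illustration under stated assumptions: `StandInFlow` is a hypothetical stand-in class that records its arguments, not the real Flow, and the configuration is written as a plain dictionary rather than loaded from a file.

```python
class StandInFlow:
    """Hypothetical stand-in for a Flow; it only records how it was built."""

    def __init__(self, **kwargs):
        self.init_kwargs = kwargs  # what the `with` field supplies
        self.add_calls = []        # one entry per item under `executors`

    def add(self, **kwargs):
        self.add_calls.append(kwargs)
        return self  # allow chaining, as in flow.add(...).add(...)


# A configuration in the shape described on this page, as a plain dictionary.
config = {
    "jtype": "Flow",
    "version": "1",
    "with": {"protocol": "http"},
    "executors": [
        {"name": "secondexec", "uses": "indexer.yml", "workspace": "/home/my/workspace"},
        {"name": "thirdexec", "uses": "CustomExec"},
    ],
}

# `with` -> keyword arguments of __init__()
flow = StandInFlow(**config.get("with", {}))

# each `executors` item -> one add() call with that item's keys as keyword arguments
for item in config.get("executors", []):
    flow = flow.add(**item)

print(flow.init_kwargs)
print([call["name"] for call in flow.add_calls])
```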
Variables#
Jina Flow YAML supports variables and variable substitution according to the GitHub Actions syntax:
Environment variables#
Use ${{ ENV.VAR }} to refer to the environment variable VAR. You can find all Jina environment variables here.
Context variables#
Use ${{ CONTEXT.VAR }} to refer to the context variable VAR.
Context variables can be passed to f.load_config(..., context=...) in the form of a Python dictionary.
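To make the ENV and CONTEXT placeholder forms concrete, here is a minimal sketch of how such substitution could work on a YAML string. It is not Jina's implementation: `substitute` is a hypothetical helper, `MY_PORT` and `workdir` are made-up variable names, and leaving unknown placeholders untouched is a design choice of this sketch.

```python
import os
import re

# Matches ${{ ENV.NAME }} and ${{ CONTEXT.NAME }}; whitespace inside the braces is required here.
_PLACEHOLDER = re.compile(r"\$\{\{\s+(ENV|CONTEXT)\.(\w+)\s+\}\}")


def substitute(text, context=None, environ=None):
    """Replace ENV and CONTEXT placeholders in a string (illustrative only)."""
    context = {} if context is None else context
    environ = os.environ if environ is None else environ

    def _lookup(match):
        scope, name = match.group(1), match.group(2)
        source = environ if scope == "ENV" else context
        if name not in source:
            return match.group(0)  # leave unknown placeholders untouched
        return str(source[name])

    return _PLACEHOLDER.sub(_lookup, text)


raw = "port: ${{ ENV.MY_PORT }}\nworkspace: ${{ CONTEXT.workdir }}"
resolved = substitute(raw, context={"workdir": "/tmp/ws"}, environ={"MY_PORT": "12345"})
print(resolved)
```

Passing `environ` explicitly keeps the example deterministic; omitting it makes the helper read from the process environment instead.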
Relative paths#
Use ${{root.path.to.var}} to refer to the variable var within the same YAML file, found at the provided path in the file’s structure.
Note that the only difference between environment variable syntax and relative path syntax is the omission of spaces in the latter.
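A relative path can be read as a walk through the nested structure of the file. The sketch below resolves such a placeholder against a dictionary; it is not Jina's implementation. `resolve_relative` is a hypothetical helper, the `meta.label` key is invented for the example, and treating `root` as the top level of the document is an assumption.

```python
import re

# Matches ${{root.some.path}} with no spaces inside the braces.
_RELATIVE = re.compile(r"\$\{\{root\.([\w.]+)\}\}")


def resolve_relative(value, root):
    """Replace ${{root.a.b}} placeholders with the value found at a -> b in `root`."""

    def _lookup(match):
        node = root
        for key in match.group(1).split("."):
            node = node[key]  # raises KeyError if the path does not exist
        return str(node)

    return _RELATIVE.sub(_lookup, value)


config = {
    "with": {"protocol": "http"},
    "meta": {"label": "gateway-${{root.with.protocol}}"},  # hypothetical field
}

label = resolve_relative(config["meta"]["label"], config)
print(label)
```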