YAML specification#
JCloud extends Jina’s Flow YAML specification by introducing the special field jcloud. This lets you define resources and scaling policies for each Executor and gateway.
Here’s a Flow with two Executors that have specific resource needs: indexer demands a 10G ebs disk, whereas encoder demands two cores, 8G RAM and two dedicated GPUs.
jtype: Flow
executors:
  - name: encoder
    uses: jinaai+docker://<username>/Encoder
    jcloud:
      resources:
        cpu: 2
        memory: 8G
        gpu: 2
  - name: indexer
    uses: jinaai+docker://<username>/Indexer
    jcloud:
      resources:
        storage:
          type: ebs
          size: 10G
Allocate Executor resources#
Since each Executor has its own business logic, it may require different cloud resources. One Executor might need more RAM, whereas another might need a bigger disk.
In JCloud, you can pass highly customizable, fine-grained resource requests for each Executor using the jcloud.resources argument in your Flow YAML.
CPU#
By default, 0.1 (1/10 of a core) CPU is allocated to each Executor. You can use the cpu argument under resources to change it.
JCloud offers the general Intel Xeon processor (Skylake 8175M or Cascade Lake 8259CL) by default.
Hint
A maximum of 16 cores is allowed per Executor.
jtype: Flow
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
    jcloud:
      resources:
        cpu: 0.5
GPU#
JCloud supports GPU workloads with two different usages: shared or dedicated.
If GPU is enabled, JCloud will provide NVIDIA A10G Tensor Core GPUs with 24G memory for workloads in both usage types.
Hint
When using GPU resources, it may take a few extra minutes before all Executors are ready to serve traffic.
Dedicated GPU#
Using a dedicated GPU is the default way to provision GPU for an Executor. JCloud automatically creates a GPU node or assigns the Executor to an existing one. In this case, the Executor owns the whole GPU. You can assign between 1 and 4 GPUs.
jtype: Flow
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
    jcloud:
      resources:
        gpu: 2
Spot vs on-demand instance#
For cost optimization, JCloud tries to deploy all Executors on spot capacity. This is ideal for stateless Executors, which can withstand interruptions and restarts. For stateful Executors (e.g. indexers), however, on-demand capacity is recommended.
jtype: Flow
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
    jcloud:
      resources:
        capacity: on-demand
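The spot versus on-demand guidance can be sketched in a single Flow. This is a minimal illustration rather than a recommended layout: it reuses the encoder and indexer names from the introductory example and assumes that capacity and storage can sit side by side under the same resources block.

```yaml
jtype: Flow
executors:
  # Stateless Executor: no capacity set, so JCloud tries spot capacity.
  - name: encoder
    uses: jinaai+docker://<username>/Encoder
  # Stateful Executor: pinned to on-demand capacity.
  - name: indexer
    uses: jinaai+docker://<username>/Indexer
    jcloud:
      resources:
        capacity: on-demand
        storage:        # assumption: storage may be combined with capacity
          type: ebs
          size: 10G
```

Leaving capacity unset on the encoder keeps the cost benefit of spot capacity, while the indexer avoids interruptions that could affect its state.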
Memory#
By default, 100M of RAM is allocated to each Executor. You can use the memory argument under resources to change it.
Hint
A maximum of 16G RAM is allowed per Executor.
jtype: Flow
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
    jcloud:
      resources:
        memory: 8G
Storage#
JCloud supports two storage types: efs (default) and ebs. The former is network file storage, whereas the latter is a block device.
Hint
By default, we attach an efs to all Executors in a Flow. This lets the efs resize dynamically, so you don’t need to shrink/grow volumes manually.
If your Executor needs high IO, you can use ebs instead. Please note that:
The disk cannot be shared with other Executors or Flows.
You must pass a storage size parameter (default: 1G, max: 10G).
jtype: Flow
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
    jcloud:
      resources:
        storage:
          type: ebs
          size: 10G
  - name: executor2
    uses: jinaai+docker://<username>/Executor2
    jcloud:
      resources:
        storage:
          type: efs
Scale out Executors#
On JCloud, demand-based autoscaling is available thanks to the underlying Kubernetes architecture. This means you can maintain serverless deployments cost-effectively, without having to work out the right number of replicas yourself.
Autoscaling with jinaai+serverless://#
The easiest way to scale out your Executor is to use a Serverless Executor. You can enable this by using jinaai+serverless:// instead of jinaai+docker:// in the Executor’s uses, such as:
jtype: Flow
executors:
  - name: executor1
    uses: jinaai+serverless://<username>/Executor1
JCloud autoscaling leverages Knative behind the scenes, and jinaai+serverless uses a set of Knative configurations as defaults.
Hint
For more information about the Knative autoscaling configurations, please visit Knative autoscaling.
Autoscaling with custom args#
If jinaai+serverless:// doesn’t meet your requirements, you can further customize autoscaling configurations by using the autoscale argument on a per-Executor basis in the Flow YAML, such as:
jtype: Flow
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
    jcloud:
      autoscale:
        min: 1
        max: 2
        metric: rps
        target: 50
Below are the defaults and requirements for the configurations:
Name | Default | Allowed | Description |
---|---|---|---|
min | 1 | int | Minimum number of replicas ( |
max | 2 | int, up to 5 | Maximum number of replicas |
metric | concurrency | | Metric for scaling |
target | 100 | int | Target number after which replicas autoscale |
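As a rough sketch of how the table maps onto a Flow, the following block writes out the documented defaults explicitly and raises max to its documented ceiling of 5. The Executor name and image are the same placeholders used elsewhere on this page; only max differs from the defaults.

```yaml
jtype: Flow
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
    jcloud:
      autoscale:
        min: 1               # default minimum number of replicas
        max: 5               # documented upper bound (default is 2)
        metric: concurrency  # default scaling metric
        target: 100          # default target
```

Spelling out the defaults is optional, but it can make the scaling behaviour easier to review alongside the rest of the Flow.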
After you deploy to JCloud with an autoscaling configuration, serving the Flow works just as before. The only difference you may notice is that the initial requests take a few extra seconds, since JCloud needs to scale the deployments behind the scenes. From then on, JCloud handles the scaling, so you only need to worry about your code.
Configure gateway#
JCloud supports Ingress gateways to expose your Flows to the public internet with TLS.
JCloud uses Let’s Encrypt for TLS.
Hint
The JCloud gateway is different from Jina’s gateway. In JCloud, a gateway works as a proxy to distribute internet traffic between Flows, each of which has a Jina gateway (which is responsible for managing external gRPC/HTTP/Websocket traffic to your Executors).
Set timeout#
By default, the JCloud gateway will close connections that have been idle for over 600 seconds. If you want a longer connection timeout threshold, change the timeout parameter under gateway.jcloud.
jtype: Flow
gateway:
  jcloud:
    timeout: 600
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
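To actually lengthen the idle timeout, the value has to exceed the 600-second default. A minimal sketch follows; 900 is an illustrative figure only, and this section does not state an upper bound, so check the limit that applies to your deployment.

```yaml
jtype: Flow
gateway:
  jcloud:
    timeout: 900   # seconds; illustrative value above the 600-second default
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
```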
Control gateway resources#
To customize the gateway’s CPU or memory, specify the memory and/or cpu arguments under gateway.jcloud.resources:
jtype: Flow
gateway:
  jcloud:
    resources:
      memory: 800M
      cpu: 0.4
executors:
  - name: encoder
    uses: jinaai+docker://<username>/Encoder
Expose Executors#
A Flow deployment without a Gateway is often used to provide external Executors, which can be shared between different Flows. You can expose an Executor by setting expose: true (and un-expose the Gateway by setting expose: false):
jtype: Flow
gateway:
  jcloud:
    expose: false # don't expose the Gateway
executors:
  - name: custom
    uses: jinaai+docker://<username>/CustomExecutor
    jcloud:
      expose: true # expose the Executor
You can expose the Gateway along with Executors:
jtype: Flow
gateway:
  jcloud:
    expose: true
executors:
  - name: custom1
    uses: jinaai+docker://<username>/CustomExecutor1
    jcloud:
      expose: true # expose the Executor
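Once an Executor is exposed, another Flow can reference it as an external Executor. The sketch below is an assumption-laden illustration: the host, port, external and tls keys come from Jina's general support for external Executors rather than from this page, and <executor-endpoint> is a hypothetical placeholder for whatever address JCloud reports after deployment.

```yaml
jtype: Flow
executors:
  - name: shared_custom          # hypothetical name for the shared Executor
    host: <executor-endpoint>    # placeholder: address of the exposed Executor
    port: 443                    # assumption: endpoint is served over TLS
    tls: true                    # assumption: matches the TLS-enabled endpoint
    external: true               # treat as external; this Flow does not start it
```

Verify the exact endpoint format and connection settings against the output of your own deployment before relying on this layout.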
Other deployment options#
Specify Jina version#
To control Jina’s version while deploying a Flow to JCloud, you can pass the version argument in the Flow YAML:
jtype: Flow
jcloud:
  version: 3.10.0
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
Add Labels#
You can use labels (as key-value pairs) to attach metadata to your Flows:
jtype: Flow
jcloud:
  labels:
    username: johndoe
    app: fashion-search
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
Hint
Keys in labels have the following restrictions:
Must be 63 characters or fewer.
Must begin and end with an alphanumeric character ([a-z0-9A-Z]) with dashes (-), underscores (_), dots (.), and alphanumerics between.
The following keys are skipped if passed in the Flow YAML:
user
jina-version
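A minimal sketch of label keys that respect these restrictions, combined with a pinned Jina version. The keys and values below are hypothetical, and the example assumes that version and labels can share the same top-level jcloud block.

```yaml
jtype: Flow
jcloud:
  version: 3.10.0          # same version as the earlier example
  labels:
    team: search           # hypothetical key: plain alphanumerics
    app.tier: backend      # hypothetical key: dot between alphanumerics
    cost_center-id: cc42   # hypothetical key: underscore and dash inside
executors:
  - name: executor1
    uses: jinaai+docker://<username>/Executor1
```

Each key starts and ends with an alphanumeric character and stays well under 63 characters. Keys named user or jina-version are left out because they would be skipped.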