Basics#
A Flow defines how your Executors are connected together and how your data flows through them.
Create#
The most trivial Flow is the empty Flow. It can be defined purely in Python or from a YAML file:
Python:
from jina import Flow
f = Flow()
YAML:
jtype: Flow
Tip
An empty Flow contains only the Gateway.
For production, you should define your Flows with YAML. This is because YAML files are independent of the Python logic code and easier to maintain.
Conversion between Python and YAML#
A Python Flow definition can be easily converted to/from a YAML definition.
To load a Flow from a YAML file, use load_config():
from jina import Flow
f = Flow.load_config('flow.yml')
To export an existing Flow definition to a YAML file, use save_config():
from jina import Flow
f = Flow().add().add() # Create a Flow with two Executors
f.save_config('flow.yml')
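As a minimal sketch of the loading direction, the helpers below write the empty-Flow YAML shown earlier to disk and load it back with load_config(). The names write_minimal_flow_yaml and load_flow are hypothetical, and load_flow assumes Jina is installed; its import is deferred so the first helper runs without it.

```python
from pathlib import Path


def write_minimal_flow_yaml(path):
    """Write the empty-Flow YAML definition ('jtype: Flow') to `path`."""
    target = Path(path)
    target.write_text("jtype: Flow\n", encoding="utf-8")
    return target


def load_flow(path):
    """Load a Flow from a YAML file (assumes Jina is installed)."""
    from jina import Flow  # deferred import: only needed when actually loading

    return Flow.load_config(str(path))
```

Calling load_flow(write_minimal_flow_yaml('flow.yml')) should then give a Flow equivalent to Flow().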
Start and stop#
When a Flow starts, all its added Executors start as well, making it possible to reach the service through its API.
There are three ways to start a Flow: In Python, from a YAML file, or from the terminal.
Generally in Python: use Flow as a context manager.
As an entrypoint from the terminal: use the Jina CLI and a Flow YAML file.
As an entrypoint from Python code: use Flow as a context manager inside if __name__ == '__main__'.
Without a context manager: call start() and close() manually.
from jina import Flow
f = Flow()
with f:
    pass
jina flow --uses flow.yml
from jina import Flow
f = Flow()
if __name__ == '__main__':
    with f:
        pass
from jina import Flow
f = Flow()
f.start()  # start the Flow manually, without a context manager
f.close()  # stop the Flow and all its Executors
The statement with f: starts the Flow, and exiting the indented with block stops the Flow, including all Executors defined in it.
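When start() and close() are called manually, an exception raised between the two calls would leave the Flow running. One way to get the same guarantee as the with block is a try/finally wrapper. The sketch below is illustrative only: run_started is a hypothetical helper, not part of Jina, and it works with any object that exposes start() and close().

```python
def run_started(service, work):
    """Start `service`, run `work(service)`, and close it afterwards.

    `service` is any object with start() and close() methods, such as a Flow.
    close() runs even if `work` raises; it is not called if start() itself fails.
    """
    service.start()
    try:
        return work(service)
    finally:
        service.close()


# Hypothetical usage with a Flow (requires Jina):
# from jina import Flow
# run_started(Flow(), lambda f: None)
```

In most code the context manager remains the simpler choice; a wrapper like this only matters when start and stop have to be handled explicitly.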
A successful start of a Flow looks like this:
Your addresses and entrypoints can be found in the output. When you enable more features, such as monitoring, an HTTP gateway, or TLS encryption, this display expands to contain more information.
Set multiprocessing spawn#
Some corner cases require forcing the spawn start method for multiprocessing, for example if you encounter “Cannot re-initialize CUDA in forked subprocess”.
To enable this, set JINA_MP_START_METHOD=spawn before starting the Python script:
JINA_MP_START_METHOD=spawn python app.py
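If the script is launched from another Python process rather than from a shell, the same variable can be passed through the child's environment. This is a minimal sketch using only the standard library; spawn_env and run_script_with_spawn are hypothetical helper names, and app.py is the script from the command above.

```python
import os
import subprocess
import sys


def spawn_env(base_env=None):
    """Return a copy of the environment with JINA_MP_START_METHOD=spawn."""
    env = dict(os.environ if base_env is None else base_env)
    env["JINA_MP_START_METHOD"] = "spawn"
    return env


def run_script_with_spawn(script_path):
    """Run `script_path` with the current interpreter and the spawn setting."""
    return subprocess.run([sys.executable, script_path], env=spawn_env(), check=True)


# Hypothetical usage:
# run_script_with_spawn("app.py")
```

The launched script still needs to start its Flow inside if __name__ == '__main__'.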
Caution
If you set JINA_MP_START_METHOD=spawn, make sure to use Flow as a context manager inside if __name__ == '__main__'.
The script entrypoint (starting the Flow) needs to be protected when using the spawn start method.
Hint
There’s no need to set this on Windows, as it only supports the spawn method for multiprocessing.
Serve forever#
In most scenarios, a Flow should remain reachable for prolonged periods of time.
This can be achieved by running jina flow --uses flow.yml from the terminal.
Or if you are serving a Flow from Python:
from jina import Flow
f = Flow()
with f:
    f.block()
The .block() method blocks the execution of the current thread or process, enabling external clients to access the Flow.
In this case, the Flow can be stopped by interrupting the thread or process.
Serve until an event#
Alternatively, a multiprocessing or threading Event object can be passed to .block(), which stops the Flow once set.
from jina import Flow
import threading
def start_flow(stop_event):
    """Start a blocking Flow."""
    with Flow() as f:
        f.block(stop_event=stop_event)
e = threading.Event()  # create new Event
t = threading.Thread(name='Blocked-Flow', target=start_flow, args=(e,))
t.start()  # start Flow in new Thread
# do some stuff
e.set()  # set event and stop (unblock) the Flow
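The event does not have to be set by hand. As one possible variation, a standard-library timer can set it after a fixed delay, which stops a Flow blocked on that event. stop_after is a hypothetical helper name; the sketch uses only threading and makes no assumptions about Jina beyond the stop_event argument shown above.

```python
import threading


def stop_after(stop_event, seconds):
    """Set `stop_event` after `seconds` and return the started timer."""
    timer = threading.Timer(seconds, stop_event.set)
    timer.daemon = True  # do not keep the interpreter alive just for the timer
    timer.start()
    return timer


# Hypothetical usage with the example above (requires Jina):
# e = threading.Event()
# stop_after(e, 60)  # unblock the Flow after roughly one minute
# start_flow(e)
```

The returned timer can be cancelled with timer.cancel() if the Flow should keep serving after all.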
Serve on Google Colab#
Google Colab provides an easy-to-use Jupyter notebook environment with GPU/TPU support. Flow is fully compatible with Google Colab and you can use it in the following ways:
Open the notebook on Google Colab
Follow the walkthrough and enjoy the free GPU/TPU!
Tip
Hosting services on Google Colab is not recommended if your server aims to be long-lived or permanent. It is often used for quick experiments, demonstrations or leveraging its free GPU/TPU. For stable, secure and free hosting of Jina Flow, check out JCloud.
Visualize#
A Flow has a built-in .plot() function which can be used to visualize the Flow:
from jina import Flow
f = Flow().add().add()
f.plot('flow.svg')
from jina import Flow
f = Flow().add(name='e1').add(needs='e1').add(needs='e1')
f.plot('flow-2.svg')
You can also do it in the terminal:
jina export flowchart flow.yml flow.svg
You can also visualize a remote Flow by passing the URL to jina export flowchart.
Export#
A Flow YAML can be exported as a Docker Compose YAML or a Kubernetes YAML bundle.
Docker Compose#
from jina import Flow
f = Flow().add()
f.to_docker_compose_yaml()
You can also do it in the terminal:
jina export docker-compose flow.yml docker-compose.yml
This generates a single docker-compose.yml file containing all the Executors of the Flow.
For advanced utilization of Docker Compose with Jina, refer to How to
Kubernetes#
from jina import Flow
f = Flow().add()
f.to_kubernetes_yaml('flow_k8s_configuration')
You can also do it in the terminal:
jina export kubernetes flow.yml ./my-k8s
This generates the Kubernetes configuration files for all the Executors in the Flow.
The generated folder can be used directly with kubectl to deploy the Flow to an existing Kubernetes cluster.
For advanced utilization of Kubernetes with Jina, refer to How to
Tip
Based on your local Jina version, Executor Hub may rebuild the Docker image during the YAML generation process.
If you do not wish to rebuild the image, set the environment variable JINA_HUB_NO_IMAGE_REBUILD.
Tip
If an Executor requires volumes to be mapped in order to persist data, Jina creates a StatefulSet for that Executor instead of a Deployment.
You can control the access mode, storage class name and capacity of the attached Persistent Volume Claim with the Jina environment variables JINA_K8S_ACCESS_MODES, JINA_K8S_STORAGE_CLASS_NAME and JINA_K8S_STORAGE_CAPACITY. Only the first volume is considered for mounting.
See also
For more in-depth guides on Flow deployment, check our how-tos for Docker Compose and Kubernetes.