Add Executors#
A Flow orchestrates its Executors as a graph and will send requests to all Executors in the order specified by add() or listed in a YAML file.
When you start a Flow, each Executor always runs in its own process, so multiple Executors run in different processes. Multiprocessing is the lowest level of separation when you run a Flow locally. When you run a Flow on Kubernetes, Docker Swarm, or JCloud, different Executors run in different containers, pods, or instances.
Add Executors#
Executors can be added to a Flow via add().
from jina import Flow
f = Flow().add()
This adds a “no-op” Executor called BaseExecutor to the Flow.
To better identify an Executor, you can change its name by passing the name parameter:
from jina import Flow
f = Flow().add(name='myVeryFirstExecutor').add(name='secondIsBest')
The above Flow can also be defined via YAML:
jtype: Flow
executors:
- name: myVeryFirstExecutor
- name: secondIsBest
Save it as flow.yml and run it via:
jina flow --uses flow.yml
More Flow YAML specifications can be found in Flow YAML Specification.
Define Executor types via uses#
The type of Executor is defined by the uses keyword. uses accepts a wide range of Executor types. Note that some usages are not supported on JCloud, both for security reasons and because they exist mainly to facilitate local debugging.
| Local Dev | JCloud | | Description |
|---|---|---|---|
| ✅ | ❌ | | use |
| ✅ | ❌ | | use |
| ✅ | ✅ | | use an Executor from a YAML file defined by Executor YAML interface. |
| ✅ | ❌ | | use an Executor as Python source from Jina Hub. |
| ✅ | ✅ | | use an Executor as a Docker container from Jina Hub. |
| ✅ | ✅ | | use a Sandbox Executor hosted on Jina Hub. The Executor is running remotely on Jina Hub. |
| ✅ | ❌ | | use a pre-built Executor as a Docker container. |
Hint: Load multiple Executors from the same directory
If you want to load multiple Executor YAMLs from the same directory, you don’t need to specify the parent directory for each Executor. Instead, you can configure a common search path for all Executors:
.
├── app
│   └── ▶ main.py
└── executor
    ├── config1.yml
    ├── config2.yml
    └── my_executor.py
f = Flow(extra_search_paths=['../executor']).add(uses='config1.yml').add(uses='config2.yml')
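The idea behind a common search path can be sketched in plain Python. This is only an illustrative model, not Jina's actual lookup code: the helper name resolve_config, its base_dir parameter, and the lookup order (base directory first, then each extra path in order) are all assumptions made for the example.

```python
from pathlib import Path


def resolve_config(filename, extra_search_paths=(), base_dir='.'):
    """Return the first existing file named `filename`, or None.

    Hypothetical helper: looks in `base_dir` first, then in each entry of
    `extra_search_paths`, interpreted relative to `base_dir`.
    """
    base = Path(base_dir)
    # An absolute entry in extra_search_paths replaces `base` when joined.
    directories = [base] + [base / path for path in extra_search_paths]
    for directory in directories:
        candidate = directory / filename
        if candidate.is_file():
            return candidate.resolve()
    return None
```

With the directory layout above, calling resolve_config('config1.yml', ['../executor'], base_dir='app') would find executor/config1.yml without the caller repeating the parent directory for every file.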
External Executors#
Usually a Flow manages all of its Executors. External Executors are not managed by the current Flow object but externally. For example, you may want to share expensive Executors between Flows. Often these Executors are stateless, GPU-based Encoders.
Those Executors are marked with the external keyword when added to a Flow:
from jina import Flow
Flow().add(host='123.45.67.89', port=12345, external=True)
This adds an external Executor to the Flow. The Flow will not start or stop this Executor and assumes that it is externally managed and available at 123.45.67.89:12345.
You can also use external Executors with tls enabled.
from jina import Flow
Flow().add(host='123.45.67.89', port=443, external=True, tls=True)
Hint
Using tls to connect to the external Executor is especially needed if you want to use an external Executor deployed with JCloud. See the JCloud documentation for further details.
Floating Executors#
Some Executors in your Flow may be used for asynchronous background tasks that can take some time and that do not generate a needed output. For instance, logging specific information in external services, storing partial results, etc.
You can unblock your Flow from such tasks by using floating Executors.
Normally, all Executors form a pipeline that handles and transforms a given request until it is finally returned to the Client.
However, floating Executors do not feed their outputs back to the pipeline. Therefore, this output will not form the response for the Client, and the response can be returned without waiting for the floating Executor to complete its task.
Those Executors are marked with the floating keyword when added to a Flow:
import time

from jina import Flow, Executor, requests, DocumentArray


class FastChangingExecutor(Executor):
    @requests()
    def foo(self, docs, **kwargs):
        for doc in docs:
            doc.text = 'Hello World'


class SlowChangingExecutor(Executor):
    @requests()
    def foo(self, docs, **kwargs):
        time.sleep(2)
        print(f' Received {docs.texts}')
        for doc in docs:
            doc.text = 'Change the document but will not affect response'


f = (
    Flow()
    .add(name='executor0', uses=FastChangingExecutor)
    .add(
        name='floating_executor',
        uses=SlowChangingExecutor,
        needs=['gateway'],
        floating=True,
    )
)
with f:
    f.post(on='/endpoint', inputs=DocumentArray.empty(1))  # we need to send a first
    start_time = time.time()
    response = f.post(on='/endpoint', inputs=DocumentArray.empty(2))
    end_time = time.time()
    print(f' Response time took {end_time - start_time}s')
    print(f' {response.texts}')
Response time took 0.011997222900390625s
['Hello World', 'Hello World']
Received ['Hello World', 'Hello World']
In this example you can see how the response is returned without waiting for the floating Executor to complete. However, the Flow is not closed until the floating Executor has also handled the request.
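The same pattern (respond immediately, finish background work before closing) can be illustrated without Jina using a thread. The sketch below is a toy model only: ToyFlow, floating_log and the 0.2 second delay are made-up names and values for illustration, and real Executors run in separate processes rather than threads.

```python
import threading
import time


class ToyFlow:
    """Toy stand-in for a Flow with one regular step and one floating step."""

    def __init__(self):
        self.floating_log = []  # what the floating step has processed so far
        self._background = []   # threads that still run floating work

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Closing waits for all floating work, as described above.
        for thread in self._background:
            thread.join()

    def _floating_step(self, texts):
        time.sleep(0.2)  # simulate a slow background task
        self.floating_log.append(texts)

    def post(self, n_docs):
        response = ['Hello World'] * n_docs  # the regular pipeline step
        # The floating step gets its own copy; its result is never merged back.
        thread = threading.Thread(target=self._floating_step, args=(list(response),))
        thread.start()
        self._background.append(thread)
        return response  # returned without waiting for the thread


with ToyFlow() as flow:
    response = flow.post(2)
    # Snapshot taken right after the response arrives, before the slow step ends.
    pending = list(flow.floating_log)

print(response)           # the response produced by the regular step
print(pending)            # floating work had not finished when the response arrived
print(flow.floating_log)  # after the with-block, the floating step has run
```

The with-block plays the role of the Flow context manager: the response is available right away, but leaving the block joins the background thread first.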
You can plot the Flow and observe how the Executor is floating disconnected from the Gateway.
A floating Executor can never come before a non-floating Executor in the topology of your Flow.
This leads to the following behaviors:
Implicit reordering: When adding a non-floating Executor after a floating Executor without specifying its needs parameter, the non-floating Executor is chained after the previous non-floating one.
from jina import Flow
f = Flow().add().add(name='middle', floating=True).add()
f.plot()
Chaining floating Executors: If you want to chain more than one floating Executor, you need to add all of them with the floating flag, and explicitly specify the needs argument.
from jina import Flow
f = Flow().add().add(name='middle', floating=True).add(needs=['middle'], floating=True)
f.plot()
Overriding of floating flag: If you add a floating Executor as part of the needs parameter of a non-floating Executor, then the floating Executor is not considered floating anymore.
from jina import Flow
f = Flow().add().add(name='middle', floating=True).add(needs=['middle'])
f.plot()
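The three behaviors above can be summarized in a small pure-Python model. This is a simplified sketch derived only from the rules stated in this section, not Jina's implementation: the function resolve_topology, the dict-based Executor specs, and the default names executor0, executor1, ... are assumptions for illustration, and the model does not propagate the override transitively.

```python
def resolve_topology(specs):
    """Resolve `needs` and the effective `floating` flag for each Executor spec.

    Each spec is a dict with optional keys 'name', 'floating' and 'needs'.
    """
    last_non_floating = 'gateway'
    resolved = {}
    for index, spec in enumerate(specs):
        name = spec.get('name', f'executor{index}')
        floating = spec.get('floating', False)
        # Without explicit needs, chain after the previous non-floating Executor.
        needs = list(spec.get('needs') or [last_non_floating])
        resolved[name] = {'needs': needs, 'floating': floating}
        if not floating:
            # A non-floating Executor that needs a floating one overrides its flag.
            for dependency in needs:
                if dependency in resolved:
                    resolved[dependency]['floating'] = False
            last_non_floating = name
    return resolved


# Implicit reordering: the last Executor skips 'middle' and follows executor0.
implicit = resolve_topology([{}, {'name': 'middle', 'floating': True}, {}])

# Chaining: both floating Executors stay floating because needs is explicit.
chained = resolve_topology(
    [{}, {'name': 'middle', 'floating': True}, {'needs': ['middle'], 'floating': True}]
)

# Override: a non-floating Executor needs 'middle', so 'middle' stops floating.
overridden = resolve_topology(
    [{}, {'name': 'middle', 'floating': True}, {'needs': ['middle']}]
)

for label, result in [('implicit', implicit), ('chained', chained), ('overridden', overridden)]:
    print(label, result)
```

Each of the three calls mirrors one of the Flow definitions above, so the printed dictionaries can be compared with the plots produced by f.plot().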
Config Executors#
You can set and override Executor configs when adding them to a Flow.
This example shows how to start a Flow with an Executor via the Python API:
from jina import Flow

with Flow().add(
    uses='MyExecutor',
    uses_with={"parameter_1": "foo", "parameter_2": "bar"},
    py_modules=["executor.py"],
    uses_metas={
        "name": "MyExecutor",
        "description": "MyExecutor does a thing to the stuff in your Documents",
    },
    uses_requests={"/index": "my_index", "/search": "my_search", "/random": "foo"},
    workspace="some_custom_path",
) as f:
    ...
- uses_with is a key-value map that defines the arguments of the Executor's __init__ method.
- uses_requests is a key-value map that defines the mapping from endpoint to class method. This is useful if you need to overwrite the default endpoint-to-method mapping defined in the Executor's Python implementation.
- workspace is a string that defines the workspace.
- py_modules is a list of strings that defines the Python dependencies of the Executor.
- uses_metas is a key-value map that defines some internal attributes of the Executor. It contains the following fields:
  - name is a string that defines the name of the Executor.
  - description is a string that defines the description of this Executor. It is used in the automatic docs UI.
Set with via uses_with#
To set/override the with configs of an Executor, use uses_with. The with configuration refers to user-defined constructor kwargs.
from jina import Executor, requests, Flow


class MyExecutor(Executor):
    def __init__(self, param1=1, param2=2, param3=3, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.param1 = param1
        self.param2 = param2
        self.param3 = param3

    @requests
    def foo(self, docs, **kwargs):
        print('param1:', self.param1)
        print('param2:', self.param2)
        print('param3:', self.param3)


flow = Flow().add(uses=MyExecutor, uses_with={'param1': 10, 'param3': 30})
with flow as f:
    f.post('/')
executor0@219662[L]:ready and listening
gateway@219662[L]:ready and listening
Flow@219662[I]:🎉 Flow is ready to use!
🔗 Protocol: GRPC
🏠 Local access: 0.0.0.0:32825
🔒 Private network: 192.168.1.101:32825
🌐 Public address: 197.28.82.165:32825
param1: 10
param2: 2
param3: 30
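The output above reflects ordinary keyword-argument overriding: values in uses_with replace the constructor defaults, and everything else keeps its default. A minimal sketch of that merge in plain Python follows. The class MyExecutorLike and the helper effective_kwargs are hypothetical stand-ins for illustration, and the check for unknown keys is an assumption of this sketch rather than documented Jina behavior.

```python
import inspect


class MyExecutorLike:
    """Stand-in with the same constructor defaults as MyExecutor above."""

    def __init__(self, param1=1, param2=2, param3=3):
        self.param1 = param1
        self.param2 = param2
        self.param3 = param3


def effective_kwargs(cls, uses_with):
    """Merge constructor defaults of `cls` with the overrides in `uses_with`."""
    parameters = inspect.signature(cls.__init__).parameters
    defaults = {
        name: parameter.default
        for name, parameter in parameters.items()
        if parameter.default is not inspect.Parameter.empty
    }
    unknown = sorted(set(uses_with) - set(defaults))
    if unknown:
        # Assumption of this sketch: reject keys the constructor does not accept.
        raise TypeError(f'unexpected constructor arguments: {unknown}')
    return {**defaults, **uses_with}  # overrides win over defaults


kwargs = effective_kwargs(MyExecutorLike, {'param1': 10, 'param3': 30})
executor = MyExecutorLike(**kwargs)
print(kwargs)
```

Only param1 and param3 are overridden here, so param2 keeps its default, matching the log output of the Flow example.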
Set requests via uses_requests#
You can set/override the requests configuration of an Executor and bind methods to endpoints that you provide.
In the following code, we replace the endpoint /foo bound to the foo() function with both /non_foo and /alias_foo, and add a new endpoint /bar bound to bar(). Note that the all_req() function is bound to all endpoints except those explicitly bound to other functions, i.e. /non_foo, /alias_foo and /bar.
from jina import Executor, requests, Flow


class MyExecutor(Executor):
    @requests
    def all_req(self, parameters, **kwargs):
        print(f'all req {parameters.get("recipient")}')

    @requests(on='/foo')
    def foo(self, parameters, **kwargs):
        print(f'foo {parameters.get("recipient")}')

    def bar(self, parameters, **kwargs):
        print(f'bar {parameters.get("recipient")}')


flow = Flow().add(
    uses=MyExecutor,
    uses_requests={
        '/bar': 'bar',
        '/non_foo': 'foo',
        '/alias_foo': 'foo',
    },
)
with flow as f:
    f.post('/bar', parameters={'recipient': 'bar()'})
    f.post('/non_foo', parameters={'recipient': 'foo()'})
    f.post('/foo', parameters={'recipient': 'all_req()'})
    f.post('/alias_foo', parameters={'recipient': 'foo()'})
executor0@221058[L]:ready and listening
gateway@221058[L]:ready and listening
Flow@221058[I]:🎉 Flow is ready to use!
🔗 Protocol: GRPC
🏠 Local access: 0.0.0.0:36507
🔒 Private network: 192.168.1.101:36507
🌐 Public address: 197.28.82.165:36507
bar bar()
foo foo()
all req all_req()
foo foo()
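The routing seen in this output can be modeled with two small functions. This is a simplified sketch inferred from the example above, not Jina's internal code: the names apply_uses_requests and resolve, the DEFAULT key used for the catch-all binding, and the rule that rebinding a method drops its original endpoint are all assumptions.

```python
DEFAULT = '*'  # hypothetical key for the catch-all binding created by a bare @requests


def apply_uses_requests(decorated, uses_requests):
    """Return the endpoint-to-method table after applying the overrides.

    Methods named in `uses_requests` lose the endpoints they were bound to
    by their decorators; all other decorator bindings are kept.
    """
    rebound = set(uses_requests.values())
    routes = {endpoint: method for endpoint, method in decorated.items() if method not in rebound}
    routes.update(uses_requests)
    return routes


def resolve(routes, endpoint):
    """Pick the method for `endpoint`, falling back to the catch-all binding."""
    return routes.get(endpoint, routes.get(DEFAULT))


# Bindings declared by the decorators in MyExecutor above.
decorated = {DEFAULT: 'all_req', '/foo': 'foo'}
routes = apply_uses_requests(
    decorated,
    {'/bar': 'bar', '/non_foo': 'foo', '/alias_foo': 'foo'},
)

for endpoint in ['/bar', '/non_foo', '/foo', '/alias_foo']:
    print(endpoint, '->', resolve(routes, endpoint))
```

In this model /foo is no longer bound to foo() after the override, so it falls through to the catch-all method, which is the same routing the Flow output shows.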
Set metas via uses_metas#
To set/override the metas configuration of an Executor, use uses_metas:
from jina import Executor, requests, Flow


class MyExecutor(Executor):
    @requests
    def foo(self, docs, **kwargs):
        print(self.metas.name)


flow = Flow().add(
    uses=MyExecutor,
    uses_metas={'name': 'different_name'},
)
with flow as f:
    f.post('/')
executor0@219291[L]:ready and listening
gateway@219291[L]:ready and listening
Flow@219291[I]:🎉 Flow is ready to use!
🔗 Protocol: GRPC
🏠 Local access: 0.0.0.0:58827
🔒 Private network: 192.168.1.101:58827
different_name
Unify NDArray types in output#
Different Executors in a Flow may depend on slightly different types for array-like data such as doc.tensor and doc.embedding, for example because they were written using different machine learning frameworks.
As the builder of a Flow you don’t always have control over this, for example when using Executors from the Jina Hub.
To facilitate integration between different Executors, the Flow lets you convert tensor and embedding by using f.add(..., output_array_type=..):
from jina import Flow
f = Flow().add(uses=MyExecutor, output_array_type='numpy').add(uses=NeedsNumpyExecutor)
This converts the .tensor and .embedding fields of all output Documents of MyExecutor to numpy.ndarray, making the data usable by NeedsNumpyExecutor. This works regardless of whether MyExecutor populates these fields with arrays/tensors from PyTorch, TensorFlow, or any other popular ML framework.
Output types
output_array_type= supports more types than 'numpy'. For a full specification and further details, take a look at the documentation about protobuf serialization.