Jina Python API Reference
- jina.clients
- jina.docker
- jina.drivers
- jina.executors
- jina.executors.classifiers
- jina.executors.crafters
- jina.executors.encoders
- jina.executors.evaluators
- jina.executors.indexers
- jina.executors.rankers
- jina.executors.segmenters
- jina.executors.clients
- jina.executors.compound
- jina.executors.decorators
- jina.executors.devices
- jina.executors.metas
- jina.executors.requests
- jina.flow
- jina.helloworld
- jina.jaml
- jina.logging
- jina.optimizers
- jina.parsers
- jina.peapods
- jina.proto
- jina.types
Top-level module of Jina.
The primary function of this module is to import all of the public Jina interfaces into a single place. The interfaces themselves are located in sub-modules, as described below.
-
class jina.NdArray(proto=None, is_sparse=False, dense_cls=<class 'jina.types.ndarray.dense.numpy.DenseNdArray'>, sparse_cls=<class 'jina.types.ndarray.sparse.scipy.SparseNdArray'>, *args, **kwargs)
Bases: jina.types.ndarray.BaseNdArray
NdArray is one of the primitive data types in Jina. It offers a Pythonic interface that lets users access and manipulate a jina.jina_pb2.NdArrayProto object without working with Protobuf itself.
It is a generic view of the Protobuf NdArray that unifies the views of DenseNdArray and SparseNdArray.
This class should be used in nearly all Jina contexts.
Simple usage:
    import numpy as np
    from jina import NdArray

    # start from an empty proto
    a = NdArray()
    # start from an existing proto
    a = NdArray(doc.embedding)
    # set value
    a.value = np.random.random([10, 5])
    # get value
    print(a.value)
    # set value to a TF sparse tensor
    a.is_sparse = True
    a.value = SparseTensor(...)
    print(a.value)
Advanced usage:
NdArray also takes a dense NdArray constructor and a sparse NdArray constructor as arguments. You can think of them as the backends for dense and sparse NdArray. The combination is your choice; it could be:
    # numpy (dense) + scipy (sparse)
    from .dense.numpy import DenseNdArray
    from .sparse.scipy import SparseNdArray
    NdArray(dense_cls=DenseNdArray, sparse_cls=SparseNdArray)

    # numpy (dense) + pytorch (sparse)
    from .dense.numpy import DenseNdArray
    from .sparse.pytorch import SparseNdArray
    NdArray(dense_cls=DenseNdArray, sparse_cls=SparseNdArray)

    # numpy (dense) + tensorflow (sparse)
    from .dense.numpy import DenseNdArray
    from .sparse.tensorflow import SparseNdArray
    NdArray(dense_cls=DenseNdArray, sparse_cls=SparseNdArray)
Once sparse_cls is set, the NdArray only accepts sparse data of that backend's type. That is, you cannot use an NdArray equipped with the Tensorflow sparse backend to set or get Pytorch or Scipy sparse matrices.
- Parameters
  proto (Optional[NdArrayProto]) – the protobuf message; when not given, a new one is created via get_null_proto()
  is_sparse (bool) – whether the ndarray is sparse; can be changed later
  dense_cls (Type[BaseDenseNdArray]) – the class to use for DenseNdArray when is_sparse=False
  sparse_cls (Type[BaseSparseNdArray]) – the class to use for SparseNdArray when is_sparse=True
  args –
  kwargs –
-
property value
Return the value of the ndarray as a numpy, scipy, tensorflow or pytorch type.
-
class jina.Request(request=None, envelope=None, copy=False)
Bases: object
Request is one of the primitive data types in Jina. It offers a Pythonic interface that lets users access and manipulate a jina.jina_pb2.RequestProto object without working with Protobuf itself.
It is a container for a serialized jina_pb2.RequestProto that triggers deserialization and decompression only on the first read access to one of its members.
It overrides __getattr__() to provide the same get/set interface as a jina_pb2.RequestProto object.
-
is_used = None
Return True when the request has been read or written at least once.
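The lazy-access behaviour of Request can be illustrated without Jina or Protobuf. The following is a minimal sketch, not Jina's implementation: the class `LazyRequest`, the JSON payload and the use of zlib are all assumptions chosen to keep the example self-contained. It shows a container that keeps compressed bytes untouched and only decodes them on the first attribute read, at which point its `is_used` flag flips.

```python
import json
import zlib


class LazyRequest:
    """Hypothetical stand-in for a lazily deserialized request container."""

    def __init__(self, payload: bytes):
        # Keep the raw bytes; nothing is decompressed or parsed yet.
        self._payload = payload
        self._body = None
        self.is_used = False

    def _load(self) -> dict:
        # Decompress and parse only once, on first access.
        if self._body is None:
            self._body = json.loads(zlib.decompress(self._payload))
            self.is_used = True
        return self._body

    def __getattr__(self, name):
        # __getattr__ is only called for names not found the normal way,
        # so _payload, _body and is_used never reach this method.
        body = self._load()
        try:
            return body[name]
        except KeyError:
            raise AttributeError(name) from None


raw = zlib.compress(json.dumps({"request_type": "index"}).encode("utf-8"))
req = LazyRequest(raw)
print(req.is_used)       # nothing has been read yet
print(req.request_type)  # first read triggers decompression and parsing
print(req.is_used)
```

The point of the design is that a component which only forwards the bytes never pays the parsing cost.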
-
property body
-
property request_type
Return the request body type; when not set yet, return None.
- Return type
  Optional[str]
-
property docs
- Return type
-
property groundtruths
- Return type
-
property as_pb_object
Cast self to a jina_pb2.RequestProto. This will trigger is_used. Laziness is broken and the serialization is recomputed when calling SerializeToString().
- Return type
  RequestProto
-
property queryset
- Return type
-
property command
- Return type
  str
-
class jina.Response(request=None, envelope=None, copy=False)
Bases: jina.types.request.Request
Response is the Request object returned from the flow. Right now it shares the same representation as Request. As of 0.8.12, Response is a simple alias, but it gives the client API more consistent semantics: send a Request and receive a Response.
-
class jina.Message(envelope, request, *args, **kwargs)
Bases: object
Message is one of the primitive data types in Jina. It offers a Pythonic interface that lets users access and manipulate a jina.jina_pb2.MessageProto object without working with Protobuf itself.
It is a container class for jina_pb2.MessageProto. Note that the Protobuf version of jina_pb2.MessageProto contains a jina_pb2.EnvelopeProto and a jina_pb2.RequestProto. Here, it contains:
- a jina_pb2.EnvelopeProto object
- and one of:
  - a Request object wrapping jina_pb2.RequestProto
  - a jina_pb2.RequestProto object
It provides a generic view of jina_pb2.MessageProto, allowing one to access its members, request and envelope, as if using a jina_pb2.MessageProto object directly.
This class also collects all helper functions related to jina_pb2.MessageProto in one place.
-
property as_pb_object
- Return type
  MessageProto
-
property is_data_request
Check if the request is not a control request.
Warning
If request changes its type, e.g. by leveraging the oneof feature, this property won't be updated. Changing the type this way is not considered good practice.
- Return type
  bool
-
property colored_route
Get the string representation of the routes in a message.
- Return type
  str
- Returns
-
property response
Get the response of the message in protobuf.
Note
This should only be called at the Gateway.
- Return type
-
add_exception(ex=None, executor=None)
Add an exception to the last route in the envelope.
- Parameters
  ex (Optional[ForwardRef]) – Exception to be added
- Return type
  None
- Returns
-
property is_error
- Return type
  bool
-
property is_ready
- Return type
  bool
-
class jina.QueryLang(querylang=None, copy=False)
Bases: object
QueryLang is one of the primitive data types in Jina. It offers a Pythonic interface that lets users access and manipulate a jina.jina_pb2.QueryLangProto object without working with Protobuf itself.
To create a QueryLang object from a Dict containing the name of a BaseDriver and the parameters to override, simply:
    from jina import QueryLang
    ql = QueryLang({'name': 'SliceQL', 'priority': 1, 'parameters': {'start': 3, 'end': 1}})
Warning
The BaseDriver needs to be a QuerySetReader to be able to read the QueryLang.
One can also build a QueryLang from a JSON string, bytes, dict or directly from a protobuf object.
A QueryLang object (no matter how it is constructed) can be converted to a protobuf object by using:
    # to protobuf object
    ql.as_pb_object
- Parameters
  querylang (Optional[~QueryLangSourceType]) – the query language source to construct from; acceptable types include jina_pb2.QueryLangProto, bytes, str, Dict and Tuple.
  copy (bool) – when querylang is given as a QueryLangProto object, build a view (i.e. weak reference) from it or a deep copy from it.
-
property priority
Get the priority of this query language. The query language only takes effect if it has a higher priority than the internal one with the same name.
- Return type
  int
-
property name
Get the name of the driver that the query language is attached to.
- Return type
  str
-
property as_pb_object
Return a protobuf jina_pb2.QueryLangProto object.
- Return type
  QueryLangProto
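The priority rule for QueryLang (a query language only takes effect if its priority is higher than that of the internal one with the same name) can be pictured as a small merge step. This is an illustrative sketch only: `resolve_parameters` and the plain dicts are hypothetical and not part of Jina, and it assumes that "higher priority" means a strictly larger integer.

```python
from typing import Any, Dict


def resolve_parameters(internal: Dict[str, Any], incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Return the parameters that would win under the priority rule.

    Both arguments are hypothetical dicts with 'name', 'priority' and
    'parameters' keys, mirroring the fields of a QueryLang.
    """
    same_driver = internal["name"] == incoming["name"]
    # Assumption: the incoming query language wins only with strictly higher priority.
    if same_driver and incoming["priority"] > internal["priority"]:
        return {**internal["parameters"], **incoming["parameters"]}
    return dict(internal["parameters"])


internal = {"name": "SliceQL", "priority": 0, "parameters": {"start": 0, "end": 10}}
override = {"name": "SliceQL", "priority": 1, "parameters": {"end": 5}}
ignored = {"name": "SliceQL", "priority": 0, "parameters": {"end": 5}}

print(resolve_parameters(internal, override))
print(resolve_parameters(internal, ignored))
```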
-
class jina.Document(document=None, copy=False, **kwargs)
Bases: object
Document is one of the primitive data types in Jina. It offers a Pythonic interface that lets users access and manipulate a jina.jina_pb2.DocumentProto object without working with Protobuf itself.
To create a Document object, simply:
    from jina import Document
    d = Document()
    d.text = 'abc'
Jina requires each Document to have a string id. You can set a custom one; if none has been set, a random one will be assigned.
Or you can use Document as a context manager:
    with Document() as d:
        d.text = 'hello'
    assert d.id  # now `id` has value
To access and modify the content of the document, you can use text, blob, and buffer. Each property is implemented with a proper setter to improve integrity and user experience. For example, assigning doc.blob or doc.embedding can be done simply via:
    import numpy as np
    # to set as content
    d.content = np.random.random([10, 5])
    # to set as embedding
    d.embedding = np.random.random([10, 5])
The MIME type is automatically set or guessed when setting content and uri.
Document also provides multiple ways to build from an existing Document. You can build a Document from jina_pb2.DocumentProto, bytes, str, and Dict. You can also use it as a view (i.e. a weak reference) when building from an existing jina_pb2.DocumentProto. For example,
    a = DocumentProto()
    b = Document(a, copy=False)
    a.text = 'hello'
    assert b.text == 'hello'
You can leverage the convert_a_to_b() interface to convert between content forms.
- Parameters
  document (Optional[~DocumentSourceType]) – the document to construct from. If bytes is given, then deserialize a DocumentProto; if dict is given, then parse a DocumentProto from it; if str is given, then consider it as a JSON string and parse a DocumentProto from it; finally, one can also give a DocumentProto directly, and then, depending on copy, it builds a view or a copy from it.
  copy (bool) – when document is given as a DocumentProto object, build a view (i.e. weak reference) from it or a deep copy from it.
  kwargs – other parameters to be set
-
property length
- Return type
  int
-
property weight
Return the weight of the document.
- Return type
  float
-
property modality
Get the modality of the document.
- Return type
  str
-
property content_hash
-
update_content_hash(exclude_fields=('id', 'chunks', 'matches', 'content_hash', 'parent_id'), include_fields=None)
Update the document hash according to its content.
- Parameters
  exclude_fields (Optional[Tuple[str]]) – a tuple of field names that are excluded when computing the content hash
  include_fields (Optional[Tuple[str]]) – a tuple of field names that are included when computing the content hash
Note
“exclude_fields” and “include_fields” are mutually exclusive; use only one.
- Return type
  None
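To make the effect of exclude_fields and include_fields concrete, here is a minimal sketch that hashes a plain dict of fields. It is not Jina's hashing algorithm: the helper `content_hash`, the JSON canonicalisation and the choice of BLAKE2b are assumptions, and the sketch lets include_fields take precedence whenever it is given.

```python
import hashlib
import json
from typing import Any, Dict, Optional, Tuple

DEFAULT_EXCLUDE = ("id", "chunks", "matches", "content_hash", "parent_id")


def content_hash(
    fields: Dict[str, Any],
    exclude_fields: Optional[Tuple[str, ...]] = DEFAULT_EXCLUDE,
    include_fields: Optional[Tuple[str, ...]] = None,
) -> str:
    """Hash a dict of document fields (hypothetical helper, not Jina's algorithm)."""
    if include_fields is not None:
        # Assumption: when include_fields is given, only those fields contribute.
        selected = {k: v for k, v in fields.items() if k in include_fields}
    else:
        excluded = exclude_fields or ()
        selected = {k: v for k, v in fields.items() if k not in excluded}
    # sort_keys makes the digest independent of insertion order.
    canonical = json.dumps(selected, sort_keys=True).encode("utf-8")
    return hashlib.blake2b(canonical, digest_size=8).hexdigest()


a = {"id": "1", "text": "hello", "mime_type": "text/plain"}
b = {"id": "2", "text": "hello", "mime_type": "text/plain"}

# Same content, different ids: equal hashes under the default exclusion.
print(content_hash(a) == content_hash(b))
# Hashing only the id makes the two documents differ.
print(content_hash(a, include_fields=("id",)) == content_hash(b, include_fields=("id",)))
```

Excluding the id by default means two documents with identical content hash to the same value, which is what makes a content hash useful for spotting duplicates.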
-
property id
The document id as a hex string, for non-binary environments such as HTTP, CLI and HTML; it is also human-readable. It will be used as the major view.
- Return type
-
property parent_id
The document’s parent id as a hex string, for non-binary environments such as HTTP, CLI and HTML; it is also human-readable. It will be used as the major view.
- Return type
-
property blob
Return blob, one of the content forms of a Document.
Note
Use content to return the content of a Document.
- Return type
  ndarray
-
property embedding
Return the embedding of the content of a Document.
- Return type
  ndarray
-
set_attrs(**kwargs)
Bulk update Document fields with the key-value pairs specified in kwargs.
See also
get_attrs() for bulk get attributes
-
get_attrs(*args)
Bulk fetch Document fields and return a dict of the key-value pairs.
See also
set_attrs() for bulk set/update attributes
- Return type
  Dict[str, Any]
-
property as_pb_object
- Return type
  DocumentProto
-
property buffer
Return buffer, one of the content forms of a Document.
Note
Use content to return the content of a Document.
- Return type
  bytes
-
property text
Return text, one of the content forms of a Document.
Note
Use content to return the content of a Document.
-
property uri
- Return type
  str
-
property mime_type
Get the MIME type of the document.
- Return type
  str
-
property content_type
Return the content type of the document; possible values: text, blob, buffer.
- Return type
  str
-
property content
Return the content of the document. It checks which field among blob, text and buffer has a value and returns it.
- Return type
  ~DocumentContentType
-
property granularity
-
property score
-
convert_buffer_to_blob(**kwargs)
Assuming the buffer is a valid buffer of a Numpy ndarray, set blob accordingly.
- Parameters
  kwargs – reserved for maximum compatibility when using with ConvertDriver
Note
One can only recover the values, not the shape information, from a pure buffer.
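The note about shape information can be demonstrated with the standard library alone. The sketch below uses Python's `array` module as a stand-in for a Numpy ndarray (an assumption made to avoid third-party imports): the raw bytes carry the numbers, while the 2 x 3 layout has to be supplied from elsewhere.

```python
from array import array

rows, cols = 2, 3
values = array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Serialising keeps the numbers but nothing about the 2 x 3 layout.
buffer = values.tobytes()

# Reading the buffer back yields a flat sequence of values.
flat = array("d")
flat.frombytes(buffer)
print(list(flat))

# The shape has to be supplied separately to rebuild the rows.
matrix = [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]
print(matrix)
```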
-
convert_uri_to_buffer(**kwargs)
Convert uri to buffer. Internally it downloads from the URI and sets buffer.
- Parameters
  kwargs – reserved for maximum compatibility when using with ConvertDriver
-
convert_uri_to_data_uri(charset='utf-8', base64=False, **kwargs)
Convert uri to data URI. Internally it reads the uri into buffer and converts it to a data URI.
- Parameters
  charset (str) – charset may be any character set registered with IANA
  base64 (bool) – used to encode arbitrary octet sequences into a form that satisfies the rules of 7bit. Designed to be efficient for non-text 8 bit and binary data. Sometimes used for text data that frequently uses non-US-ASCII characters.
  kwargs – reserved for maximum compatibility when using with ConvertDriver
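As a rough illustration of what the charset and base64 options mean for a data URI, the sketch below builds one from raw bytes with the standard library. The helper `to_data_uri` is hypothetical and not Jina's code; its `use_base64` parameter plays the role of the `base64` flag above and is renamed only to avoid shadowing the `base64` module.

```python
import base64
import urllib.parse


def to_data_uri(buffer: bytes, mime_type: str, charset: str = "utf-8", use_base64: bool = False) -> str:
    """Build a data URI from raw bytes (hypothetical helper, not Jina's code)."""
    prefix = f"data:{mime_type};charset={charset}"
    if use_base64:
        # Base64 is the safe choice for binary payloads.
        return f"{prefix};base64," + base64.b64encode(buffer).decode("ascii")
    # Otherwise percent-encode the bytes so the URI stays valid.
    return f"{prefix}," + urllib.parse.quote_from_bytes(buffer)


print(to_data_uri(b"hello world", "text/plain"))
print(to_data_uri(b"hello world", "text/plain", use_base64=True))
```

Percent-encoding is compact for mostly-ASCII text, whereas base64 has a fixed overhead of roughly one third and is therefore the better fit for binary data.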
-
convert_buffer_to_uri(charset='utf-8', base64=False, **kwargs)
Convert buffer to data URI. Internally it first reads into buffer and then converts it to a data URI.
- Parameters
  charset (str) – charset may be any character set registered with IANA
  base64 (bool) – used to encode arbitrary octet sequences into a form that satisfies the rules of 7bit. Designed to be efficient for non-text 8 bit and binary data. Sometimes used for text data that frequently uses non-US-ASCII characters.
  kwargs – reserved for maximum compatibility when using with ConvertDriver
-
convert_text_to_uri(charset='utf-8', base64=False, **kwargs)
Convert text to data URI.
- Parameters
  charset (str) – charset may be any character set registered with IANA
  base64 (bool) – used to encode arbitrary octet sequences into a form that satisfies the rules of 7bit. Designed to be efficient for non-text 8 bit and binary data. Sometimes used for text data that frequently uses non-US-ASCII characters.
  kwargs – reserved for maximum compatibility when using with ConvertDriver
-
convert_uri_to_text(**kwargs)
Assuming the URI is text, convert it to text.
- Parameters
  kwargs – reserved for maximum compatibility when using with ConvertDriver
-
class jina.MultimodalDocument(document=None, chunks=None, modality_content_map=None, copy=False, **kwargs)
Bases: jina.types.document.Document
MultimodalDocument is a data type built on the Jina primitive data type Document. It shares the same methods and properties as Document, while it focuses on modality at the chunk level.
Warning
- It assumes that every chunk of a document belongs to a different modality.
- It assumes that every MultimodalDocument has at least two chunks.
- Parameters
  document (Optional[~DocumentSourceType]) – the document to construct from. If bytes is given, then deserialize a DocumentProto; if dict is given, then parse a DocumentProto from it; if str is given, then consider it as a JSON string and parse a DocumentProto from it; finally, one can also give a DocumentProto directly, and then, depending on copy, it builds a view or a copy from it.
  chunks (Optional[Sequence[Document]]) – the chunks of the multimodal document to initialize with. Expected to receive a list of Document with different modalities.
  copy (bool) – when document is given as a DocumentProto object, build a view (i.e. weak reference) from it or a deep copy from it.
  kwargs – other parameters to be set
  modality_content_map – a Python dict; the keys are the modalities and the values are the content of the Document
Warning
Building a MultimodalDocument from modality_content_map expects you to assign Document.content as the values of the dictionary.
-
property is_valid
A valid MultimodalDocument should meet the following requirements:
- The document should consist of at least 2 chunks.
- The number of modalities is identical to the number of chunks.
- Return type
  bool
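Reading the two class-level assumptions (at least two chunks, and every chunk in a different modality) as a validity check gives roughly the following. This is a sketch on plain strings rather than on Jina chunks; `is_valid_multimodal` is a hypothetical helper that takes one modality name per chunk.

```python
from typing import List


def is_valid_multimodal(chunk_modalities: List[str]) -> bool:
    """Check the two requirements on a list holding one modality name per chunk."""
    has_enough_chunks = len(chunk_modalities) >= 2
    # Every chunk must carry its own modality, so no name may repeat.
    all_distinct = len(set(chunk_modalities)) == len(chunk_modalities)
    return has_enough_chunks and all_distinct


print(is_valid_multimodal(["image", "text"]))
print(is_valid_multimodal(["text"]))
print(is_valid_multimodal(["text", "text"]))
```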
-
property modality_content_map
Get the mapping of modality and content. The mapping is represented as a dict: the keys are the modalities of the chunks, the values are the corresponding content of the chunks.
- Return type
  Dict
- Returns
  the mapping of modality and content extracted from chunks.
-
property modalities
Get all modalities of the MultimodalDocument.
- Return type
  List[str]
- Returns
  List of modalities extracted from chunks of the document.
-
class jina.DocumentSet(docs_proto)
Bases: collections.abc.MutableSequence
DocumentSet is a mutable sequence of Document that gives an efficient view of a list of Documents. One can iterate over it like a generator, but also modify it, count it, get an item, or union two DocumentSets using the + and += operators.
-
extend(iterable)
S.extend(iterable) – extend sequence by appending elements from the iterable
- Return type
  None
-
build()
Build a doc_id to doc mapping so one can later index a Document using doc_id as a string key.
-
property all_embeddings
Return all embeddings from every document in this set as an ndarray.
- Return type
  Tuple[ndarray, DocumentSet, DocumentSet]
- Returns
  a tuple of the embeddings in an np.ndarray, the corresponding documents in a DocumentSet, and the documents that have no embedding in a DocumentSet.
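The three-part return value of all_embeddings amounts to a partition of the set. The sketch below shows that partition on plain Python data; it is not Jina's implementation. Dicts stand in for Documents, nested lists stand in for the ndarray, and `split_by_embedding` is a hypothetical name.

```python
from typing import List, Tuple


def split_by_embedding(docs: List[dict]) -> Tuple[List[List[float]], List[dict], List[dict]]:
    """Return (embeddings, docs with an embedding, docs without one)."""
    embeddings, with_emb, without_emb = [], [], []
    for doc in docs:
        emb = doc.get("embedding")
        if emb:
            # Row i of `embeddings` belongs to with_emb[i].
            embeddings.append(emb)
            with_emb.append(doc)
        else:
            without_emb.append(doc)
    return embeddings, with_emb, without_emb


docs = [
    {"id": "a", "embedding": [0.1, 0.2]},
    {"id": "b", "embedding": None},
    {"id": "c", "embedding": [0.3, 0.4]},
]
matrix, found, missing = split_by_embedding(docs)
print(matrix)
print([d["id"] for d in found], [d["id"] for d in missing])
```

Returning the documents alongside the stacked values keeps row indices aligned with their owners, so results computed on the matrix can be written back to the right documents.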
-
property all_contents
Return all contents from every document in this set as an ndarray.
- Return type
  Tuple[ndarray, DocumentSet, DocumentSet]
- Returns
  a tuple of the contents in an np.ndarray, the corresponding documents in a DocumentSet, and the documents that have no contents in a DocumentSet.
-
class jina.QueryLangSet(querylang_protos)
Bases: collections.abc.MutableSequence
QueryLangSet is a mutable sequence of QueryLang that gives an efficient view of a list of QueryLang objects. One can iterate over it like a generator, but also modify it, count it, or get an item.
-
class jina.Flow(args=None, env=None, **kwargs)
Bases: jina.flow.base.BaseFlow
Initialize a flow object.
- Parameters
  kwargs – other keyword arguments that will be shared by all pods in this flow
More explanation of optimize_level:
As an example, the following flow will generate 6 Peas:
    f = Flow(optimize_level=FlowOptimizeLevel.NONE).add(uses='forward', parallel=3)
The optimized version, i.e. Flow(optimize_level=FlowOptimizeLevel.FULL), will generate 4 Peas, but it will force the GatewayPea to take the BIND role, as the head and tail routers are removed.
-
train(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)
Do training on the current flow. It will start a CLIClient and call train().
Example:
    with f:
        f.train(input_fn)
        ...
This will call the pre-built reader to read files into an iterator of bytes and feed them to the flow.
One may also build a reader/generator on one's own.
Example:
    def my_reader():
        for _ in range(10):
            yield b'abcdfeg'  # each yield generates a document for training

    with f.build(runtime='thread') as flow:
        flow.train(bytes_gen=my_reader())
- Parameters
  input_fn (Union[Iterator[T], AsyncIterator[T], Callable[…, Union[Iterator[T], AsyncIterator[T]]], None], where T is Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]) – An iterator of bytes. If not given, then you have to specify it in kwargs.
  on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.
  on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.
  on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.
  kwargs – accepts all keyword arguments of jina client CLI
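The roles of the three callbacks can be sketched independently of Jina. In the hedged example below, `dispatch` is a hypothetical helper and each response is a plain dict with an assumed `ok` flag; it only illustrates the intended semantics, namely that on_done runs for resolved requests, on_error for rejected ones, and on_always for both.

```python
from typing import Callable, Iterable, Optional


def dispatch(
    responses: Iterable[dict],
    on_done: Optional[Callable[[dict], None]] = None,
    on_error: Optional[Callable[[dict], None]] = None,
    on_always: Optional[Callable[[dict], None]] = None,
) -> None:
    """Route each response to the matching callbacks (hypothetical helper)."""
    for resp in responses:
        # Assumption: a response dict carries an 'ok' flag marking success.
        handler = on_done if resp.get("ok") else on_error
        if handler is not None:
            handler(resp)
        if on_always is not None:
            on_always(resp)


log = []
dispatch(
    [{"id": 1, "ok": True}, {"id": 2, "ok": False}],
    on_done=lambda r: log.append(("done", r["id"])),
    on_error=lambda r: log.append(("error", r["id"])),
    on_always=lambda r: log.append(("always", r["id"])),
)
print(log)
```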
-
index_ndarray(array, axis=0, size=None, shuffle=False, on_done=None, on_error=None, on_always=None, **kwargs)
Use a numpy ndarray as the index source for the current flow.
- Parameters
  array (ndarray) – the numpy ndarray data source
  axis (int) – iterate over that axis
  size (Optional[int]) – the maximum number of sub-arrays
  shuffle (bool) – shuffle the numpy data source beforehand
  on_done (Optional[Callable[…, None]]) – the callback function to invoke after indexing
  kwargs – accepts all keyword arguments of jina client CLI
-
search_ndarray(array, axis=0, size=None, shuffle=False, on_done=None, on_error=None, on_always=None, **kwargs)
Use a numpy ndarray as the query source for searching on the current flow.
- Parameters
  array (ndarray) – the numpy ndarray data source
  axis (int) – iterate over that axis
  size (Optional[int]) – the maximum number of sub-arrays
  shuffle (bool) – shuffle the numpy data source beforehand
  on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.
  on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.
  on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.
  kwargs – accepts all keyword arguments of jina client CLI
-
index_lines(lines=None, filepath=None, size=None, sampling_rate=None, read_mode='r', on_done=None, on_error=None, on_always=None, **kwargs)
Use a list of lines as the index source for indexing on the current flow.
- Parameters
  lines (Optional[Iterator[str]]) – a list of strings, each considered as a document
  filepath (Optional[str]) – a text file in which each line contains a document
  size (Optional[int]) – the maximum number of documents
  sampling_rate (Optional[float]) – the sampling rate, in [0, 1]
  read_mode – specifies the mode in which the file is opened: 'r' for reading in text mode, 'rb' for reading in binary mode
  on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.
  on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.
  on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.
  kwargs – accepts all keyword arguments of jina client CLI
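How size and sampling_rate might interact when feeding lines can be sketched as a small generator. This is an assumption-laden illustration rather than Jina's reader: `sample_lines` is a hypothetical name, `seed` is added purely for reproducibility, and the sketch assumes that `size` caps the number of lines emitted after sampling.

```python
import random
from typing import Iterable, Iterator, Optional


def sample_lines(
    lines: Iterable[str],
    size: Optional[int] = None,
    sampling_rate: Optional[float] = None,
    seed: Optional[int] = None,
) -> Iterator[str]:
    """Yield at most `size` lines, keeping each with probability `sampling_rate`."""
    rng = random.Random(seed)  # `seed` is an addition for reproducible examples
    emitted = 0
    for line in lines:
        if size is not None and emitted >= size:
            break
        # Skip the line when the random draw exceeds the sampling rate.
        if sampling_rate is not None and rng.random() > sampling_rate:
            continue
        emitted += 1
        yield line


corpus = ["first document", "second document", "third document", "fourth document"]
print(list(sample_lines(corpus, size=2)))
print(list(sample_lines(corpus, sampling_rate=0.5, seed=0)))
```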
-
index_files(patterns, recursive=True, size=None, sampling_rate=None, read_mode=None, on_done=None, on_error=None, on_always=None, **kwargs)
Use a set of files as the index source for indexing on the current flow.
- Parameters
  patterns (Union[str, List[str]]) – the pattern may contain simple shell-style wildcards, e.g. '*.py', '[*.zip, *.gz]'
  recursive (bool) – if recursive is true, the pattern '**' will match any files and zero or more directories and subdirectories
  size (Optional[int]) – the maximum number of files
  sampling_rate (Optional[float]) – the sampling rate, in [0, 1]
  read_mode (Optional[str]) – specifies the mode in which the file is opened: 'r' for reading in text mode, 'rb' for reading in binary mode
  on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.
  on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.
  on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.
  kwargs – accepts all keyword arguments of jina client CLI
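The wording of patterns and recursive matches the behaviour of Python's standard glob module, so a plausible way to picture the file discovery step is the sketch below. `iter_files` is a hypothetical helper, not Jina's reader, and the temporary directory exists only to make the example runnable.

```python
import glob
import os
import tempfile
from typing import Iterator, List, Optional, Union


def iter_files(
    patterns: Union[str, List[str]],
    recursive: bool = True,
    size: Optional[int] = None,
) -> Iterator[str]:
    """Yield paths matching the given shell-style patterns, at most `size` of them."""
    if isinstance(patterns, str):
        patterns = [patterns]
    emitted = 0
    for pattern in patterns:
        # sorted() only makes the order deterministic for the example.
        for path in sorted(glob.iglob(pattern, recursive=recursive)):
            if size is not None and emitted >= size:
                return
            if os.path.isfile(path):
                emitted += 1
                yield path


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "pkg", "sub"))
    for rel in ("a.py", "pkg/b.py", "pkg/sub/c.py", "pkg/notes.txt"):
        with open(os.path.join(root, *rel.split("/")), "w") as fp:
            fp.write("x")
    # '**' matches zero or more directories when recursive=True.
    found = [os.path.relpath(p, root) for p in iter_files(os.path.join(root, "**", "*.py"))]
    print(found)
```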
-
search_files(patterns, recursive=True, size=None, sampling_rate=None, read_mode=None, on_done=None, on_error=None, on_always=None, **kwargs)
Use a set of files as the query source for searching on the current flow.
- Parameters
  patterns (Union[str, List[str]]) – the pattern may contain simple shell-style wildcards, e.g. '*.py', '[*.zip, *.gz]'
  recursive (bool) – if recursive is true, the pattern '**' will match any files and zero or more directories and subdirectories
  size (Optional[int]) – the maximum number of files
  sampling_rate (Optional[float]) – the sampling rate, in [0, 1]
  read_mode (Optional[str]) – specifies the mode in which the file is opened: 'r' for reading in text mode, 'rb' for reading in binary mode
  on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.
  on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.
  on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.
  kwargs – accepts all keyword arguments of jina client CLI
-
search_lines(lines=None, filepath=None, size=None, sampling_rate=None, read_mode='r', on_done=None, on_error=None, on_always=None, **kwargs)
Use a list of lines as the query source for searching on the current flow.
- Parameters
  filepath (Optional[str]) – a text file in which each line contains a document
  lines (Optional[Iterator[str]]) – a list of strings, each considered as a document
  size (Optional[int]) – the maximum number of documents
  sampling_rate (Optional[float]) – the sampling rate, in [0, 1]
  read_mode – specifies the mode in which the file is opened: 'r' for reading in text mode, 'rb' for reading in binary mode
  on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.
  on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.
  on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.
  kwargs – accepts all keyword arguments of jina client CLI
-
index(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)
Do indexing on the current flow. It will start a CLIClient and call index().
Example:
    with f:
        f.index(input_fn)
        ...
This will call the pre-built reader to read files into an iterator of bytes and feed them to the flow.
One may also build a reader/generator on one's own.
Example:
    def my_reader():
        for _ in range(10):
            yield b'abcdfeg'  # each yield generates a document to index

    with f.build(runtime='thread') as flow:
        flow.index(bytes_gen=my_reader())
- Parameters
  input_fn (Union[Iterator[T], AsyncIterator[T], Callable[…, Union[Iterator[T], AsyncIterator[T]]], None], where T is Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]) – An iterator of bytes. If not given, then you have to specify it in kwargs.
  on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.
  on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.
  on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.
  kwargs – accepts all keyword arguments of jina client CLI
-
update(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)
Update documents on the current flow. It will start a CLIClient and call update().
Example:
    with f:
        f.update(input_fn)
        ...
This will call the pre-built reader to read files into an iterator of bytes and feed them to the flow.
One may also build a reader/generator on one's own.
Example:
    def my_reader():
        for _ in range(10):
            yield b'abcdfeg'  # each yield generates a document to update

    with f.build(runtime='thread') as flow:
        flow.update(bytes_gen=my_reader())
- Parameters
  input_fn (Union[Iterator[T], AsyncIterator[T], Callable[…, Union[Iterator[T], AsyncIterator[T]]], None], where T is Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]) – An iterator of bytes. If not given, then you have to specify it in kwargs.
  on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.
  on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.
  on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.
  kwargs – accepts all keyword arguments of jina client CLI
-
delete(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)
Do deletion on the current flow. It will start a CLIClient and call delete().
Example:
    with f:
        f.delete(input_fn)
        ...
This will call the pre-built reader to read files into an iterator of bytes and feed them to the flow.
One may also build a reader/generator on one's own.
Example:
    def my_reader():
        for _ in range(10):
            yield b'abcdfeg'  # each yield generates a document to delete

    with f.build(runtime='thread') as flow:
        flow.delete(bytes_gen=my_reader())
- Parameters
  input_fn (Union[Iterator[T], AsyncIterator[T], Callable[…, Union[Iterator[T], AsyncIterator[T]]], None], where T is Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]) – An iterator of bytes. If not given, then you have to specify it in kwargs.
  on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.
  on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.
  on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.
  kwargs – accepts all keyword arguments of jina client CLI
-
search
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Do searching on the current flow
It will start a CLIClient and call search().
Example,
    with f:
        f.search(input_fn)
        ...
This will call the pre-built reader to read files into an iterator of bytes and feed it to the flow.
You may also build your own reader/generator.
Example,
    def my_reader():
        for _ in range(10):
            yield b'abcdfeg'  # each yield generates a query for searching

    with f.build(runtime='thread') as flow:
        flow.search(bytes_gen=my_reader())
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – An iterator of bytes. If not given, you have to specify it in kwargs.
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
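The type of `input_fn` allows either an iterator or a callable that returns one. The sketch below shows one way such an argument could be normalized; `as_iterator` is a hypothetical helper written for this illustration (it is not part of Jina), and it covers only the synchronous cases, leaving `AsyncIterator` aside.

```python
from typing import Callable, Iterable, Iterator, Union


def my_reader() -> Iterator[bytes]:
    for i in range(3):
        yield b'query-%d' % i  # each yield stands for one query document


def as_iterator(
    input_fn: Union[Iterable[bytes], Callable[[], Iterable[bytes]]]
) -> Iterator[bytes]:
    """Hypothetical helper: accept an iterable or a zero-argument callable."""
    source = input_fn() if callable(input_fn) else input_fn
    return iter(source)


from_callable = list(as_iterator(my_reader))    # pass the function itself
from_iterator = list(as_iterator(my_reader()))  # pass an already-created generator
print(from_callable == from_iterator)
```

Passing the function rather than a generator object has the practical advantage that the input can be regenerated if it needs to be consumed more than once.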
-
class
jina.
AsyncFlow
(args=None, env=None, **kwargs)[source]¶ Bases: jina.flow.base.BaseFlow
AsyncFlow is the asynchronous version of the Flow. They share the same interface, except that in AsyncFlow the train(), index(), and search() methods are coroutines (i.e. declared with the async/await syntax); simply calling them will not schedule them for execution. To actually run a coroutine, the user needs to put it in an event loop, e.g. via asyncio.run() or asyncio.create_task().
AsyncFlow can be very useful in integration settings, where the Jina Flow is NOT the main logic but rather serves as part of another program. In this case, users often do not want to let Jina control the asyncio event loop. In contrast, Flow controls and wraps the event loop internally, making the Flow look synchronous from the outside.
In particular, AsyncFlow makes Jina usage in Jupyter Notebook more natural and reliable. For example, the following code will use the event loop already spawned in Jupyter/IPython to run the Jina Flow (instead of creating a new one).
    from jina import AsyncFlow
    import numpy as np

    with AsyncFlow().add() as f:
        await f.index_ndarray(np.random.random([5, 4]), on_done=print)
Notice that the above code will NOT work in the standard Python REPL, as only Jupyter/IPython implements "autoawait".
See also
Asynchronous in REPL: Autoawait
https://ipython.readthedocs.io/en/stable/interactive/autoawait.html
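The statement that merely calling a coroutine does not execute it can be checked with the standard library alone. In the sketch below, `fake_index` is a stand-in coroutine invented for the illustration; it does not call Jina.

```python
import asyncio
import inspect

log = []


async def fake_index(data):
    """Stand-in coroutine; it is NOT a Jina method."""
    log.append(f'indexed {len(data)} items')
    return len(data)


coro = fake_index([1, 2, 3])  # creates a coroutine object, runs nothing
created_only = (inspect.iscoroutine(coro), list(log))

result = asyncio.run(coro)    # the event loop actually executes the body
print(created_only, result, log)
```

The log stays empty until the coroutine is handed to an event loop, which is the behaviour to expect from the AsyncFlow methods as well.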
Another example is using Jina as an integration. Say you have another IO-bound job heavylifting(); you can use this feature to schedule Jina index() and heavylifting() concurrently. For example,
    import asyncio
    import numpy as np
    from jina import AsyncFlow

    # `validate` is a user-defined callback that receives the response

    async def run_async_flow_5s():  # WaitDriver pause 5s makes total roundtrip ~5s
        with AsyncFlow().add(uses='- !WaitDriver {}') as f:
            await f.index_ndarray(np.random.random([5, 4]), on_done=validate)

    async def heavylifting():  # total roundtrip takes ~5s
        print('heavylifting other io-bound jobs, e.g. download, upload, file io')
        await asyncio.sleep(5)
        print('heavylifting done after 5s')

    async def concurrent_main():  # about 5s; but some dispatch cost, can't be just 5s, usually at <7s
        await asyncio.gather(run_async_flow_5s(), heavylifting())
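The interleaving that asyncio.gather() provides can be observed without a running Flow. The following is a minimal sketch in which `fake_flow_index` is a hypothetical stand-in for the awaited AsyncFlow call, and the sleeps are shortened so the script finishes quickly.

```python
import asyncio

events = []


async def fake_flow_index():
    """Stand-in for an awaited AsyncFlow call (hypothetical)."""
    events.append('index start')
    await asyncio.sleep(0.05)
    events.append('index done')


async def heavylifting():
    events.append('io start')
    await asyncio.sleep(0.15)
    events.append('io done')


async def concurrent_main():
    await asyncio.gather(fake_flow_index(), heavylifting())


asyncio.run(concurrent_main())
print(events)
```

Both jobs start before either finishes, so the total time is close to the longer of the two rather than their sum.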
One can think of Flow as a Jina-managed event loop, whereas AsyncFlow is a self-managed event loop.
Initialize a flow object
- Parameters
kwargs – other keyword arguments that will be shared by all pods in this flow
More explanation on optimize_level:
As an example, the following flow will generate 6 Peas,
    f = Flow(optimize_level=FlowOptimizeLevel.NONE).add(uses='forward', parallel=3)
The optimized version, i.e. Flow(optimize_level=FlowOptimizeLevel.FULL), will generate 4 Peas, but it will force the GatewayPea to take the BIND role, as the head and tail routers are removed.
-
train
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Do training on the current flow
It will start a CLIClient and call train().
Example,
    with f:
        f.train(input_fn)
        ...
This will call the pre-built reader to read files into an iterator of bytes and feed it to the flow.
You may also build your own reader/generator.
Example,
    def my_reader():
        for _ in range(10):
            yield b'abcdfeg'  # each yield generates a document for training

    with f.build(runtime='thread') as flow:
        flow.train(bytes_gen=my_reader())
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – An iterator of bytes. If not given, you have to specify it in kwargs.
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
-
index_ndarray
(array, axis=0, size=None, shuffle=False, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Use a numpy ndarray as the index source for the current flow
- Parameters
array (np.ndarray) – the numpy ndarray data source
axis (int) – iterate over that axis
size (int) – the maximum number of sub-arrays
shuffle (bool) – shuffle the numpy data source beforehand
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
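How `axis`, `size`, and `shuffle` might interact can be sketched in pure Python. The helper below is hypothetical: it uses nested lists in place of a numpy array, adds a `seed` argument for reproducibility, and assumes that shuffling happens before the `size` cut-off, which this reference does not confirm.

```python
import random
from typing import Iterator, List, Optional, Sequence


def iter_sub_arrays(rows: Sequence[Sequence[float]], axis: int = 0,
                    size: Optional[int] = None, shuffle: bool = False,
                    seed: Optional[int] = None) -> Iterator[List[float]]:
    """Hypothetical pure-Python analogue for a 2-D array stored as nested lists."""
    # axis=0 walks over rows; axis=1 walks over columns
    items = [list(r) for r in rows] if axis == 0 else [list(c) for c in zip(*rows)]
    if shuffle:
        random.Random(seed).shuffle(items)  # shuffle before iterating
    for i, item in enumerate(items):
        if size is not None and i >= size:  # stop after `size` sub-arrays
            break
        yield item


data = [[1, 2, 3], [4, 5, 6]]
by_rows = list(iter_sub_arrays(data, axis=0))
by_cols = list(iter_sub_arrays(data, axis=1, size=2))
print(by_rows, by_cols)
```

Each yielded sub-array corresponds to one document sent to the flow, so a 2x3 array yields two documents along axis 0 and up to three along axis 1.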
-
search_ndarray
(array, axis=0, size=None, shuffle=False, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Use a numpy ndarray as the query source for searching on the current flow
- Parameters
array (np.ndarray) – the numpy ndarray data source
axis (int) – iterate over that axis
size (int) – the maximum number of sub-arrays
shuffle (bool) – shuffle the numpy data source beforehand
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
-
index_lines
(lines=None, filepath=None, size=None, sampling_rate=None, read_mode='r', on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Use a list of lines as the index source for indexing on the current flow
- Parameters
lines (Optional[Iterator[str]]) – a list of strings, each considered as a document
filepath (Optional[str]) – a text file in which each line contains a document
size (Optional[int]) – the maximum number of documents
sampling_rate (Optional[float]) – the sampling rate, between [0, 1]
read_mode – specifies the mode in which the file is opened. 'r' for reading in text mode, 'rb' for reading in binary mode
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
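The effect of `size` and `sampling_rate` on a stream of lines can be sketched as follows. `sample_lines` is a hypothetical helper, not Jina's reader; it assumes that `sampling_rate` is the probability of keeping each line and that `size` caps the number of lines emitted.

```python
import random
from typing import Iterable, Iterator, Optional


def sample_lines(lines: Iterable[str], size: Optional[int] = None,
                 sampling_rate: Optional[float] = None,
                 seed: Optional[int] = None) -> Iterator[str]:
    """Hypothetical helper: keep each line with probability `sampling_rate`, emit at most `size`."""
    rng = random.Random(seed)
    emitted = 0
    for line in lines:
        if size is not None and emitted >= size:
            break                         # reached the document limit
        if sampling_rate is not None and rng.random() >= sampling_rate:
            continue                      # drop this line
        yield line.rstrip('\n')
        emitted += 1


corpus = ['first doc\n', 'second doc\n', 'third doc\n']
first_two = list(sample_lines(corpus, size=2))
none_kept = list(sample_lines(corpus, sampling_rate=0.0))
print(first_two, none_kept)
```

The same generator works for a file opened in text mode, since iterating over a file object yields its lines.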
-
index_files
(patterns, recursive=True, size=None, sampling_rate=None, read_mode=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Use a set of files as the index source for indexing on the current flow
- Parameters
patterns (Union[str, List[str]]) – The pattern may contain simple shell-style wildcards, e.g. '*.py', '[*.zip, *.gz]'
recursive (bool) – If recursive is true, the pattern '**' will match any files and zero or more directories and subdirectories.
size (Optional[int]) – the maximum number of files
sampling_rate (Optional[float]) – the sampling rate, between [0, 1]
read_mode (Optional[str]) – specifies the mode in which the file is opened. 'r' for reading in text mode, 'rb' for reading in binary mode
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
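The wording of `patterns` and `recursive` matches the behaviour of Python's standard `glob` module, so pattern expansion can be sketched with it. `match_files` is a hypothetical helper for illustration only; whether Jina expands patterns exactly this way is an assumption.

```python
import glob
import os
import tempfile
from typing import Iterator, List, Union


def match_files(patterns: Union[str, List[str]], recursive: bool = True) -> Iterator[str]:
    """Hypothetical helper: expand one or several shell-style patterns with the stdlib."""
    if isinstance(patterns, str):
        patterns = [patterns]
    for pattern in patterns:
        yield from glob.iglob(pattern, recursive=recursive)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'sub'))
    for rel in ('a.py', 'b.txt', os.path.join('sub', 'c.py')):
        open(os.path.join(root, rel), 'w').close()

    top_level = sorted(os.path.relpath(p, root)
                       for p in match_files(os.path.join(root, '*.py')))
    nested = sorted(os.path.relpath(p, root)
                    for p in match_files(os.path.join(root, '**', '*.py')))
print(top_level, nested)
```

A plain `*.py` pattern stays in one directory, while `**` with `recursive=True` also descends into subdirectories.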
-
search_files
(patterns, recursive=True, size=None, sampling_rate=None, read_mode=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Use a set of files as the query source for searching on the current flow
- Parameters
patterns (Union[str, List[str]]) – The pattern may contain simple shell-style wildcards, e.g. '*.py', '[*.zip, *.gz]'
recursive (bool) – If recursive is true, the pattern '**' will match any files and zero or more directories and subdirectories.
size (Optional[int]) – the maximum number of files
sampling_rate (Optional[float]) – the sampling rate, between [0, 1]
read_mode (Optional[str]) – specifies the mode in which the file is opened. 'r' for reading in text mode, 'rb' for reading in binary mode
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
-
search_lines
(lines=None, filepath=None, size=None, sampling_rate=None, read_mode='r', on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Use a list of lines as the query source for searching on the current flow
- Parameters
filepath (Optional[str]) – a text file in which each line contains a document
lines (Optional[Iterator[str]]) – a list of strings, each considered as a document
size (Optional[int]) – the maximum number of documents
sampling_rate (Optional[float]) – the sampling rate, between [0, 1]
read_mode – specifies the mode in which the file is opened. 'r' for reading in text mode, 'rb' for reading in binary mode
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
-
index
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Do indexing on the current flow
Example,
    with f:
        f.index(input_fn)
        ...
This will call the pre-built reader to read files into an iterator of bytes and feed it to the flow.
You may also build your own reader/generator.
Example,
    def my_reader():
        for _ in range(10):
            yield b'abcdfeg'  # each yield generates a document to index

    with f.build(runtime='thread') as flow:
        flow.index(bytes_gen=my_reader())
It will start a CLIClient and call index().
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – An iterator of bytes. If not given, you have to specify it in kwargs.
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
-
update
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Do updates on the current flow
Example,
    with f:
        f.update(input_fn)
        ...
This will call the pre-built reader to read files into an iterator of bytes and feed it to the flow.
You may also build your own reader/generator.
Example,
    def my_reader():
        for i in range(10):
            with Document() as doc:
                doc.text = '...'
                doc.id = i
            yield doc  # each yield generates a query for updating

    with f.build(runtime='thread') as flow:
        flow.update(bytes_gen=my_reader())
It will start a CLIClient and call update().
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – An iterator of bytes. If not given, you have to specify it in kwargs.
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
-
delete
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Do deletion on the current flow
Example,
    with f:
        f.delete(input_fn)
        ...
This will call the pre-built reader to read files into an iterator of bytes and feed it to the flow.
You may also build your own reader/generator.
Example,
    def my_reader():
        for i in range(10):
            with Document() as doc:
                doc.text = '...'
                doc.id = i
            yield doc  # each yield generates a query for deletion

    with f.build(runtime='thread') as flow:
        flow.delete(bytes_gen=my_reader())
It will start a CLIClient and call delete().
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – An iterator of bytes. If not given, you have to specify it in kwargs.
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
-
search
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶ Do searching on the current flow
It will start a CLIClient and call search().
Example,
    with f:
        f.search(input_fn)
        ...
This will call the pre-built reader to read files into an iterator of bytes and feed it to the flow.
You may also build your own reader/generator.
Example,
    def my_reader():
        for _ in range(10):
            yield b'abcdfeg'  # each yield generates a query for searching

    with f.build(runtime='thread') as flow:
        flow.search(bytes_gen=my_reader())
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – An iterator of bytes. If not given, you have to specify it in kwargs.
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs – accepts all keyword arguments of the jina client CLI
-
class
jina.
Client
(args)[source]¶ Bases: jina.clients.base.BaseClient
A simple Python client for connecting to the gRPC gateway. It manages the asyncio event loop internally, so all interfaces are synchronous from the outside.
- Parameters
args (Namespace) – args provided by the CLI
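What "manages the asyncio event loop internally" can look like is sketched below with two stand-in classes. Both `FakeAsyncClient` and `FakeSyncClient` are hypothetical and unrelated to Jina's real classes; the sketch only shows the general pattern of wrapping a coroutine so that callers see an ordinary blocking method.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical coroutine-based client; not Jina's AsyncClient."""

    async def search(self, queries):
        await asyncio.sleep(0)  # pretend to wait on network IO
        return [q.upper() for q in queries]


class FakeSyncClient:
    """Hypothetical wrapper that hides the event loop from the caller."""

    def __init__(self):
        self._inner = FakeAsyncClient()

    def search(self, queries):
        # asyncio.run creates a loop, runs the coroutine to completion, then closes the loop
        return asyncio.run(self._inner.search(queries))


answer = FakeSyncClient().search(['hello', 'jina'])
print(answer)
```

A wrapper of this kind cannot be called from code that is already running inside an event loop, which is the situation the asynchronous variant is meant for.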
-
train
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – the input function that generates the content
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs –
- Return type
None
-
search
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – the input function that generates the content
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs –
- Return type
None
-
index
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – the input function that generates the content
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs –
- Return type
None
-
update
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – the input function that generates the content
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs –
- Return type
None
-
delete
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – the input function that generates the content
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs –
- Return type
None
-
class
jina.
AsyncClient
(args)[source]¶ Bases: jina.clients.base.BaseClient
AsyncClient is the asynchronous version of the Client. They share the same interface, except that in AsyncClient the train(), index(), and search() methods are coroutines (i.e. declared with the async/await syntax); simply calling them will not schedule them for execution. To actually run a coroutine, the user needs to put it in an event loop, e.g. via asyncio.run() or asyncio.create_task().
AsyncClient can be very useful in integration settings, where Jina/Flow/Client is NOT the main logic but rather serves as part of another program. In this case, users often do not want to let Jina control the asyncio event loop. In contrast, Client controls and wraps the event loop internally, making the Client look synchronous from the outside.
For example, say you have a Flow running remotely. You want to use a Client to connect to it and do some indexing and searching, but meanwhile you have some other IO-bound jobs and want to do them concurrently. You can use AsyncClient,
    import asyncio
    from jina.clients.asyncio import AsyncClient

    ac = AsyncClient(...)

    async def jina_client_query():
        await ac.search(...)

    async def heavylifting():
        await other_library.download_big_files(...)

    async def concurrent_main():
        await asyncio.gather(jina_client_query(), heavylifting())

    if __name__ == '__main__':  # under python
        asyncio.run(concurrent_main())
One can think of Client as a Jina-managed event loop, whereas AsyncClient is a self-managed event loop.
- Parameters
args (Namespace) – args provided by the CLI
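In an integration setting the host program usually owns the event loop already, and asyncio.create_task() is then the natural way to schedule a client call alongside other work. The sketch below uses `fake_client_search`, a hypothetical stand-in for an awaited AsyncClient method, so it runs without a Jina gateway.

```python
import asyncio


async def fake_client_search(query):
    """Stand-in for an awaited client call (hypothetical)."""
    await asyncio.sleep(0.05)
    return f'results for {query}'


async def main():
    # schedule the query on the loop that is already running
    task = asyncio.create_task(fake_client_search('cats'))
    steps = []
    for i in range(3):        # the host program keeps doing its own work
        steps.append(i)
        await asyncio.sleep(0.01)
    return steps, await task  # collect the result when it is needed


steps, answer = asyncio.run(main())
print(steps, answer)
```

Here the query runs in the background while the loop in `main` progresses, and the result is awaited only at the point where it is required.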
-
train
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – the input function that generates the content
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs –
- Return type
None
-
search
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – the input function that generates the content
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs –
- Return type
None
-
index
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – the input function that generates the content
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs –
- Return type
None
-
delete
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – the input function that generates the content
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs –
- Return type
None
-
update
(input_fn=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]¶
- Parameters
input_fn (Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[..., Union[Iterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterator[Union[~DocumentContentType, ~DocumentSourceType, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]], None]) – the input function that generates the content
on_done (Optional[Callable[..., None]]) – the function to be called when the Request object is resolved.
on_error (Optional[Callable[..., None]]) – the function to be called when the Request object is rejected.
on_always (Optional[Callable[..., None]]) – the function to be called when the Request object is either resolved or rejected.
kwargs –
- Return type
None
-
jina.
Executor
¶ alias of
jina.executors.BaseExecutor
-
jina.
Classifier
¶
-
jina.
Crafter
¶ alias of
jina.executors.crafters.BaseCrafter
-
jina.
Encoder
¶ alias of
jina.executors.encoders.BaseEncoder
-
jina.
Evaluator
¶
-
jina.
Indexer
¶ alias of
jina.executors.indexers.BaseIndexer
-
jina.
Ranker
¶ alias of
jina.executors.rankers.BaseRanker
-
jina.
Segmenter
¶