jina.flow.mixin.crud

class jina.flow.mixin.crud.CRUDFlowMixin[source]

Bases: object

The synchronous version of the Mixin for CRUD in Flow

train(inputs, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Do training on the current Flow

Parameters
  • inputs (Union[Document, Iterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[…, Union[Document, Iterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]]]) – An iterator of bytes. If not given, then you have to specify it in kwargs.

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results
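
The three callbacks recur on every method of this class. The sketch below illustrates how their descriptions read together. It is not Jina's implementation: `dispatch` is a hypothetical helper, the `failed` flag stands in for a rejected Request, and calling `on_always` last is an assumption.

```python
from typing import Any, Callable, List, Optional

Callback = Optional[Callable[..., None]]


def dispatch(  # hypothetical helper, not part of Jina
    request: Any,
    failed: bool,
    on_done: Callback = None,
    on_error: Callback = None,
    on_always: Callback = None,
) -> None:
    """Call the callbacks as their parameter descriptions suggest."""
    if failed:
        # Rejected request: only on_error applies.
        if on_error:
            on_error(request)
    elif on_done:
        # Resolved request: only on_done applies.
        on_done(request)
    # Assumption: on_always runs last, in both cases.
    if on_always:
        on_always(request)


log: List[str] = []
callbacks = dict(
    on_done=lambda r: log.append(f'done:{r}'),
    on_error=lambda r: log.append(f'error:{r}'),
    on_always=lambda r: log.append(f'always:{r}'),
)
dispatch('req-1', failed=False, **callbacks)
dispatch('req-2', failed=True, **callbacks)
```

Under these assumptions, a resolved request triggers `on_done` and `on_always`, and a rejected one triggers `on_error` and `on_always`.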

index_ndarray(array, axis=0, size=None, shuffle=False, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Using numpy ndarray as the index source for the current Flow

Parameters
  • array (ndarray) – the numpy ndarray data source

  • axis (int) – iterate over that axis

  • size (Optional[int]) – the maximum number of the sub arrays

  • shuffle (bool) – shuffle the numpy data source beforehand

  • on_done (Optional[Callable[…, None]]) – the callback function to invoke after indexing

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results
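
A rough sketch of what axis, size and shuffle could mean for a 2-D input. A nested list stands in for the ndarray so the example needs no third-party package. `iter_sub_arrays` and its `seed` argument are hypothetical, and shuffling before truncating to size is an assumption.

```python
import random
from typing import Iterator, List, Optional, Sequence


def iter_sub_arrays(  # hypothetical helper, not Jina's implementation
    array: Sequence[Sequence[float]],
    axis: int = 0,
    size: Optional[int] = None,
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> Iterator[List[float]]:
    """Yield the sub-arrays of a 2-D nested list along ``axis``."""
    if axis == 0:
        subs = [list(row) for row in array]        # one sub-array per row
    elif axis == 1:
        subs = [list(col) for col in zip(*array)]  # one sub-array per column
    else:
        raise ValueError('this sketch only handles 2-D input')
    if shuffle:
        # Assumption: shuffling happens before the size limit is applied.
        random.Random(seed).shuffle(subs)
    if size is not None:
        subs = subs[:size]
    yield from subs


data = [[1, 2, 3], [4, 5, 6]]
rows = list(iter_sub_arrays(data, axis=0))
cols = list(iter_sub_arrays(data, axis=1, size=2))
```

Here `rows` holds the two rows and `cols` holds the first two columns, so each yielded sub-array would correspond to one document.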

search_ndarray(array, axis=0, size=None, shuffle=False, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Use a numpy ndarray as the query source for searching on the current Flow

Parameters
  • array (ndarray) – the numpy ndarray data source

  • axis (int) – iterate over that axis

  • size (Optional[int]) – the maximum number of the sub arrays

  • shuffle (bool) – shuffle the numpy data source beforehand

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

index_lines(lines=None, filepath=None, size=None, sampling_rate=None, read_mode='r', line_format='json', field_resolver=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Use a list of lines as the index source for indexing on the current Flow

Parameters
  • lines (Union[Iterable[str], TextIO, None]) – a list of strings, each of which is considered a document

  • filepath (Optional[str]) – a text file in which each line contains a document

  • size (Optional[int]) – the maximum number of documents

  • sampling_rate (Optional[float]) – the sampling rate between [0, 1]

  • read_mode (str) – specifies the mode in which the file is opened. ‘r’ for reading in text mode, ‘rb’ for reading in binary mode

  • line_format (str) – the format of each line: json or csv

  • field_resolver (Optional[Dict[str, str]]) – a map from field names defined in document (JSON, dict) to the field names defined in Protobuf. This is only used when the given document is a JSON string or a Python dict.

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results
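
The interaction of size and sampling_rate can be sketched as follows. This is an illustration under stated assumptions, not Jina's code: `sample_lines` and `seed` are hypothetical, each line is assumed to be kept independently with probability sampling_rate, and size is assumed to count the lines that survive sampling.

```python
import random
from typing import Iterable, Iterator, Optional


def sample_lines(  # hypothetical helper, not part of Jina
    lines: Iterable[str],
    size: Optional[int] = None,
    sampling_rate: Optional[float] = None,
    seed: Optional[int] = None,
) -> Iterator[str]:
    """Yield at most ``size`` lines, keeping each with probability ``sampling_rate``."""
    rng = random.Random(seed)
    emitted = 0
    for line in lines:
        if size is not None and emitted >= size:
            break
        # rng.random() lies in [0, 1): a rate of 1.0 keeps every line, 0.0 keeps none.
        if sampling_rate is not None and rng.random() >= sampling_rate:
            continue
        emitted += 1
        yield line


corpus = ['doc one', 'doc two', 'doc three', 'doc four']
first_two = list(sample_lines(corpus, size=2))
everything = list(sample_lines(corpus, sampling_rate=1.0))
nothing = list(sample_lines(corpus, sampling_rate=0.0))
```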

index_ndjson(lines, field_resolver=None, size=None, sampling_rate=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Use a list of lines as the index source for indexing on the current Flow

Parameters
  • lines (Union[Iterable[str], TextIO]) – a list of strings, each of which is considered a document

  • size (Optional[int]) – the maximum number of documents

  • sampling_rate (Optional[float]) – the sampling rate between [0, 1]

  • field_resolver (Optional[Dict[str, str]]) – a map from field names defined in document (JSON, dict) to the field names defined in Protobuf. This is only used when the given document is a JSON string or a Python dict.

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results
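
A minimal sketch of how a field_resolver might be applied to NDJSON input, where each line is one JSON object. `resolve_fields` is a hypothetical helper, and the field names `title` and `text` are illustrative only.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional


def resolve_fields(  # hypothetical helper, not part of Jina
    lines: Iterable[str],
    field_resolver: Optional[Dict[str, str]] = None,
) -> Iterator[Dict[str, Any]]:
    """Parse each line as JSON and rename keys according to ``field_resolver``."""
    resolver = field_resolver or {}
    for line in lines:
        record = json.loads(line)
        # Keys missing from the resolver are passed through unchanged.
        yield {resolver.get(key, key): value for key, value in record.items()}


ndjson_lines = [
    '{"title": "hello world", "id": "1"}',
    '{"title": "goodbye", "id": "2"}',
]
resolved = list(resolve_fields(ndjson_lines, field_resolver={'title': 'text'}))
```

Each resulting dict carries `text` in place of `title`, while `id` is left as it was.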

index_csv(lines, field_resolver=None, size=None, sampling_rate=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Use a list of lines as the index source for indexing on the current Flow

Parameters
  • lines (Union[Iterable[str], TextIO]) – a list of strings, each of which is considered a document

  • size (Optional[int]) – the maximum number of documents

  • sampling_rate (Optional[float]) – the sampling rate between [0, 1]

  • field_resolver (Optional[Dict[str, str]]) – a map from field names defined in document (JSON, dict) to the field names defined in Protobuf. This is only used when the given document is a JSON string or a Python dict.

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results
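
For CSV input, one plausible reading is that the first line is a header and every following line becomes one document keyed by the header names. The sketch below assumes exactly that and uses the standard-library `csv.DictReader`. `csv_rows` is a hypothetical helper and the column names are illustrative.

```python
import csv
from typing import Dict, Iterable, Iterator, Optional


def csv_rows(  # hypothetical helper, not part of Jina
    lines: Iterable[str],
    field_resolver: Optional[Dict[str, str]] = None,
) -> Iterator[Dict[str, str]]:
    """Turn CSV lines into dicts, renaming columns via ``field_resolver``."""
    resolver = field_resolver or {}
    # Assumption: the first line is the header row.
    for row in csv.DictReader(lines):
        yield {resolver.get(key, key): value for key, value in row.items()}


csv_lines = ['title,lang', 'hello world,en', 'guten tag,de']
csv_docs = list(csv_rows(csv_lines, field_resolver={'title': 'text'}))
```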

search_csv(lines, field_resolver=None, size=None, sampling_rate=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Use a list of lines as the query source for searching on the current Flow

Parameters
  • lines (Union[Iterable[str], TextIO]) – a list of strings, each of which is considered a document

  • size (Optional[int]) – the maximum number of documents

  • sampling_rate (Optional[float]) – the sampling rate between [0, 1]

  • field_resolver (Optional[Dict[str, str]]) – a map from field names defined in document (JSON, dict) to the field names defined in Protobuf. This is only used when the given document is a JSON string or a Python dict.

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results

index_files(patterns, recursive=True, size=None, sampling_rate=None, read_mode=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Use a set of files as the index source for indexing on the current Flow

Parameters
  • patterns (Union[str, Iterable[str]]) – the pattern may contain simple shell-style wildcards, e.g. ‘*.py’, ‘[*.zip, *.gz]’

  • recursive (bool) – if recursive is true, the pattern ‘**’ will match any files and zero or more directories and subdirectories.

  • size (Optional[int]) – the maximum number of the files

  • sampling_rate (Optional[float]) – the sampling rate between [0, 1]

  • read_mode (Optional[str]) – specifies the mode in which the file is opened. ‘r’ for reading in text mode, ‘rb’ for reading in binary mode

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results
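
The description of patterns and recursive mirrors the wording of Python's standard `glob` module. Assuming similar semantics, the matching can be sketched as below. `match_files` is a hypothetical helper, and the temporary directory layout exists only for the demonstration.

```python
import glob
import os
import tempfile
from typing import Iterable, List, Union


def match_files(  # hypothetical helper, not part of Jina
    patterns: Union[str, Iterable[str]],
    recursive: bool = True,
) -> List[str]:
    """Expand one or more shell-style patterns into a sorted list of paths."""
    if isinstance(patterns, str):
        patterns = [patterns]
    matched: List[str] = []
    for pattern in patterns:
        matched.extend(glob.glob(pattern, recursive=recursive))
    return sorted(matched)


# Build a small throwaway tree: a.py, c.txt and sub/b.py.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'sub'))
for name in ('a.py', 'c.txt', os.path.join('sub', 'b.py')):
    with open(os.path.join(root, name), 'w') as fp:
        fp.write('# sample\n')

# '**' with recursive=True also descends into sub/.
deep = [os.path.basename(p) for p in match_files(os.path.join(root, '**', '*.py'))]
# A plain '*.py' only sees the top-level directory.
flat = [os.path.basename(p) for p in match_files(os.path.join(root, '*.py'), recursive=False)]
```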

search_files(patterns, recursive=True, size=None, sampling_rate=None, read_mode=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Use a set of files as the query source for searching on the current Flow

Parameters
  • patterns (Union[str, Iterable[str]]) – the pattern may contain simple shell-style wildcards, e.g. ‘*.py’, ‘[*.zip, *.gz]’

  • recursive (bool) – if recursive is true, the pattern ‘**’ will match any files and zero or more directories and subdirectories.

  • size (Optional[int]) – the maximum number of the files

  • sampling_rate (Optional[float]) – the sampling rate between [0, 1]

  • read_mode (Optional[str]) – specifies the mode in which the file is opened. ‘r’ for reading in text mode, ‘rb’ for reading in binary mode

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results

search_lines(lines=None, filepath=None, field_resolver=None, size=None, sampling_rate=None, read_mode='r', line_format='json', on_done=None, on_error=None, on_always=None, **kwargs)[source]

Use a list of lines as the query source for searching on the current Flow

Parameters
  • filepath (Optional[str]) – a text file in which each line contains a document

  • lines (Union[Iterable[str], TextIO, None]) – a list of strings, each of which is considered a document

  • size (Optional[int]) – the maximum number of documents

  • sampling_rate (Optional[float]) – the sampling rate between [0, 1]

  • read_mode (str) – specifies the mode in which the file is opened. ‘r’ for reading in text mode, ‘rb’ for reading in binary mode

  • line_format (str) – the format of each line: json or csv

  • field_resolver (Optional[Dict[str, str]]) – a map from field names defined in document (JSON, dict) to the field names defined in Protobuf. This is only used when the given document is a JSON string or a Python dict.

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results

search_ndjson(lines, field_resolver=None, size=None, sampling_rate=None, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Use a list of lines as the query source for searching on the current Flow

Parameters
  • lines (Union[Iterable[str], TextIO]) – a list of strings, each of which is considered a document

  • size (Optional[int]) – the maximum number of documents

  • sampling_rate (Optional[float]) – the sampling rate between [0, 1]

  • field_resolver (Optional[Dict[str, str]]) – a map from field names defined in document (JSON, dict) to the field names defined in Protobuf. This is only used when the given document is a JSON string or a Python dict.

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results

index(inputs, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Do indexing on the current Flow

Parameters
  • inputs (Union[Document, Iterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[…, Union[Document, Iterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]]]) – An iterator of bytes. If not given, then you have to specify it in kwargs.

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results

update(inputs, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Updates Documents on the current Flow

Parameters
  • inputs (Union[Document, Iterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[…, Union[Document, Iterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]]]) – An iterator of bytes. If not given, then you have to specify it in kwargs.

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

delete(ids, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Do deletion on the current Flow

Parameters
  • ids (Union[str, Iterable[str], Callable[…, Iterable[str]]]) – a Document id, an iterable of Document ids, or a callable that returns such an iterable. If not given, then you have to specify it in kwargs.

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI
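
The type of ids admits three shapes: a single string, an iterable of strings, or a callable returning an iterable of strings. A small sketch of how these could be normalized into one list follows. `normalize_ids` is a hypothetical helper, and treating a bare string as a single id is an assumption drawn from the type annotation alone.

```python
from typing import Callable, Iterable, List, Union

IdsType = Union[str, Iterable[str], Callable[..., Iterable[str]]]


def normalize_ids(ids: IdsType) -> List[str]:  # hypothetical helper, not part of Jina
    """Reduce the three accepted shapes of ``ids`` to a plain list of strings."""
    if callable(ids):
        ids = ids()  # a callable is asked to produce the ids
    if isinstance(ids, str):
        # Assumption: a bare string is one id, not an iterable of characters.
        return [ids]
    return list(ids)


single = normalize_ids('doc-1')
several = normalize_ids(['doc-1', 'doc-2'])
lazy = normalize_ids(lambda: (f'doc-{i}' for i in range(3)))
```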

search(inputs, on_done=None, on_error=None, on_always=None, **kwargs)[source]

Do searching on the current Flow. It will start a CLIClient and call search().

Parameters
  • inputs (Union[Document, Iterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], Callable[…, Union[Document, Iterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]], AsyncIterable[Union[~DocumentContentType, ~DocumentSourceType, Document, Tuple[~DocumentContentType, ~DocumentContentType], Tuple[~DocumentSourceType, ~DocumentSourceType]]]]]]) – An iterator of bytes. If not given, then you have to specify it in kwargs.

  • on_done (Optional[Callable[…, None]]) – the function to be called when the Request object is resolved.

  • on_error (Optional[Callable[…, None]]) – the function to be called when the Request object is rejected.

  • on_always (Optional[Callable[…, None]]) – the function to be called when the Request object is either resolved or rejected.

  • kwargs – accepts all keyword arguments of jina client CLI

Returns

results