jina.types.arrays.document module

class jina.types.arrays.document.DocumentArray(docs=None)[source]

Bases: jina.types.arrays.traversable.TraversableSequence, collections.abc.MutableSequence, jina.types.arrays.document.DocumentArrayGetAttrMixin, jina.types.arrays.neural_ops.DocumentArrayNeuralOpsMixin, jina.types.arrays.search_ops.DocumentArraySearchOpsMixin, collections.abc.Iterable

DocumentArray is a mutable sequence of Document objects. It gives an efficient view of a list of Documents. One can iterate over it like a generator and also modify it, count its items, get an item by index, or combine two DocumentArrays using the + and += operators.

It is meant to act as a view holding a pointer to a RepeatedContainer of DocumentProto, while returning Jina-native Document objects when items are accessed or iterated over.

Parameters

docs (Optional[~DocumentArraySourceType]) – the document array to construct from. One can also pass a DocumentArrayProto directly; then, depending on whether a copy is requested, it builds a view or a copy from it. It can also accept a List.
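The view behaviour described above can be sketched with the standard library alone. This is not Jina's implementation: RawDoc, Doc and DocView are hypothetical stand-ins for DocumentProto, Document and DocumentArray, and a plain list plays the role of the RepeatedContainer. The sketch only shows the idea of holding a reference to shared storage and wrapping each item on access.

```python
from collections.abc import MutableSequence


class RawDoc(dict):
    """Hypothetical stand-in for the low-level DocumentProto message."""


class Doc:
    """Hypothetical stand-in for Document: a thin wrapper around one RawDoc."""

    def __init__(self, raw):
        self.raw = raw

    @property
    def text(self):
        return self.raw.get("text", "")


class DocView(MutableSequence):
    """Hypothetical view: keeps a reference to the storage, wraps on access."""

    def __init__(self, raw_docs):
        self._raw = raw_docs  # shared reference, not a copy

    def __getitem__(self, index):
        return Doc(self._raw[index])

    def __setitem__(self, index, doc):
        self._raw[index] = doc.raw

    def __delitem__(self, index):
        del self._raw[index]

    def __len__(self):
        return len(self._raw)

    def insert(self, index, doc):
        # Only the raw message is stored, mirroring "insert doc.proto".
        self._raw.insert(index, doc.raw)


storage = [RawDoc(text="hello"), RawDoc(text="world")]
view = DocView(storage)
view.append(Doc(RawDoc(text="!")))  # append() is derived from insert()

texts = [d.text for d in view]
```

Because the view shares the list it was built from, the append is visible in storage as well; a copying constructor would instead leave storage untouched.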

insert(index, doc)[source]

Insert doc.proto at index into the DocumentArray.

Parameters
  • index (int) – Position of the insertion.

  • doc (Document) – The Document to insert.

Return type

None

append(doc)[source]

Append doc to the DocumentArray.

Parameters

doc (Document) – The Document to append.

extend(iterable)[source]

Extend the DocumentArray by appending all the items from the iterable.

Parameters

iterable (Iterable[Document]) – the iterable of Documents to extend this array with

Return type

None

clear()[source]

Remove all items from the DocumentArray.

reverse()[source]

Reverse the sequence in place.
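The mutating methods documented so far (insert, append, extend, clear, reverse) follow the usual mutable-sequence semantics. A minimal sketch of those semantics, using a hypothetical list-backed TinyArray rather than DocumentArray itself, and plain strings in place of Documents:

```python
from collections.abc import MutableSequence


class TinyArray(MutableSequence):
    """Hypothetical list-backed sequence; not Jina's implementation."""

    def __init__(self, items=None):
        self._items = list(items or [])

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = value

    def __delitem__(self, index):
        del self._items[index]

    def __len__(self):
        return len(self._items)

    def insert(self, index, value):
        self._items.insert(index, value)


arr = TinyArray(["b"])
arr.insert(0, "a")       # a, b
arr.append("c")          # a, b, c
arr.extend(["d", "e"])   # a, b, c, d, e
arr.reverse()            # e, d, c, b, a
after_reverse = list(arr)
arr.clear()              # empty again
```

All of these calls return None and change the sequence in place, which matches the None return types listed for insert and extend.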

sort(key=None, *args, **kwargs)[source]

Sort the items of the DocumentArray in place.

Parameters
  • key – key callable to sort based upon

  • args – positional arguments to pass to the underlying sorting function

  • kwargs – keyword arguments to pass to the underlying sorting function
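A sketch of how key and the forwarded keyword arguments interact, assuming the underlying sorting function behaves like Python's list.sort (the text above does not say which function is used). Doc is a hypothetical dataclass standing in for Document, and a plain list stands in for the DocumentArray.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a Document carrying a relevance score."""
    id: str
    score: float


docs = [Doc("a", 0.2), Doc("b", 0.9), Doc("c", 0.5)]

# key picks the value to compare; reverse is forwarded as a keyword argument.
docs.sort(key=lambda d: d.score, reverse=True)

ranked_ids = [d.id for d in docs]
```

The sort happens in place, so docs itself is reordered and nothing is returned.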

save(file, file_format='json')[source]

Save array elements into a JSON or a binary file.

Parameters
  • file (Union[str, TextIO, BinaryIO]) – File or filename to which the data is saved.

  • file_format (str) – either json or binary. A JSON file is human-readable, but the binary format gives a much smaller file and faster save/load.

Return type

None

classmethod load(file, file_format='json')[source]

Load array elements from a JSON or a binary file.

Parameters
  • file (Union[str, TextIO, BinaryIO]) – File or filename from which the data is loaded.

  • file_format (str) – either json or binary. A JSON file is human-readable, but the binary format gives a much smaller file and faster save/load.

Return type

DocumentArray

Returns

the loaded DocumentArray object
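The save/load pair can be pictured as a round trip that dispatches on file_format and accepts either a filename or an open file object. The sketch below is not Jina's code: save and load are hypothetical module-level helpers, plain dicts stand in for Documents, and pickle stands in for the real binary encoding purely for illustration.

```python
import json
import os
import pickle
import tempfile

FORMATS = ("json", "binary")


def save(docs, file, file_format="json"):
    """Hypothetical helper mirroring the save(file, file_format) signature."""
    if file_format not in FORMATS:
        raise ValueError(f"unsupported file_format: {file_format!r}")
    binary = file_format == "binary"
    if isinstance(file, str):
        # A filename was given: open it in the matching mode and recurse.
        with open(file, "wb" if binary else "w") as fp:
            save(docs, fp, file_format)
        return
    if binary:
        pickle.dump(docs, file)  # stand-in for the real binary format
    else:
        json.dump(docs, file)


def load(file, file_format="json"):
    """Hypothetical helper mirroring the load(file, file_format) signature."""
    if file_format not in FORMATS:
        raise ValueError(f"unsupported file_format: {file_format!r}")
    binary = file_format == "binary"
    if isinstance(file, str):
        with open(file, "rb" if binary else "r") as fp:
            return load(fp, file_format)
    return pickle.load(file) if binary else json.load(file)


docs = [{"id": "1", "text": "hello"}, {"id": "2", "text": "world"}]
with tempfile.TemporaryDirectory() as tmp:
    json_path = os.path.join(tmp, "docs.json")
    bin_path = os.path.join(tmp, "docs.bin")
    save(docs, json_path)
    save(docs, bin_path, file_format="binary")
    from_json = load(json_path)
    from_binary = load(bin_path, file_format="binary")
```

The same file_format has to be used on both sides of the round trip; a file written as binary cannot be read back as json.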

save_binary(file)[source]

Save array elements into a binary file.

Compared to save_json(), it is faster and the file is smaller, but the file is not human-readable.

Parameters

file (Union[str, BinaryIO]) – File or filename to which the data is saved.

Return type

None

save_json(file)[source]

Save array elements into a JSON file.

Compared to save_binary(), the file is human-readable, but it is slower to save and load and the file is larger.

Parameters

file (Union[str, TextIO]) – File or filename to which the data is saved.

Return type

None

classmethod load_json(file)[source]

Load array elements from a JSON file.

Parameters

file (Union[str, TextIO]) – File or filename from which the data is loaded.

Return type

DocumentArray

Returns

a DocumentArray object

classmethod load_binary(file)[source]

Load array elements from a binary file.

Parameters

file (Union[str, BinaryIO]) – File or filename from which the data is loaded.

Return type

DocumentArray

Returns

a DocumentArray object

class jina.types.arrays.document.DocumentArrayGetAttrMixin[source]

Bases: object

A mixin that provides bulk attribute getters.

get_attributes(*fields)[source]

Return all nonempty values of the fields from all docs this array contains

Parameters

fields (str) – Variable-length argument with the names of the fields to extract

Return type

Union[List, List[List]]

Returns

A list of the values for these fields. When multiple fields are given, it returns a list of lists.

get_attributes_with_docs(*fields)[source]

Return all nonempty values of the fields together with their nonempty docs

Parameters

fields (str) – Variable-length argument with the names of the fields to extract

Return type

Tuple[Union[List, List[List]], DocumentArray]

Returns

A tuple. The first element is a list of the values for these fields; when multiple fields are given, it is a list of lists. The second element contains the non-empty docs.
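A rough sketch of the two return shapes, written as free functions over plain dicts rather than as the mixin itself. Two points are assumptions not stated above: a doc is treated as empty when any requested field is missing or empty, and with several fields the result holds one inner list per field.

```python
EMPTY = (None, "", [], {})


def get_attributes_with_docs(docs, *fields):
    """Hypothetical stand-in operating on dicts, not on Jina Documents."""
    rows, kept = [], []
    for doc in docs:
        values = [doc.get(name) for name in fields]
        # Assumption: skip the doc if any requested field is empty.
        if all(v not in EMPTY for v in values):
            rows.append(values)
            kept.append(doc)
    if len(fields) == 1:
        contents = [row[0] for row in rows]  # flat list
    else:
        # Assumption: one inner list per field.
        contents = [list(column) for column in zip(*rows)]
    return contents, kept


def get_attributes(docs, *fields):
    """Same values as above, without the matching docs."""
    return get_attributes_with_docs(docs, *fields)[0]


docs = [
    {"id": "1", "text": "hello"},
    {"id": "2", "text": ""},
    {"id": "3", "text": "world"},
]

texts = get_attributes(docs, "text")
(ids, paired_texts), kept = get_attributes_with_docs(docs, "id", "text")
```

Returning the kept docs alongside the values keeps the two aligned by position, which matters once empty docs have been filtered out.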