jina.types.sets.document

class jina.types.sets.document.DocumentSet(docs_proto)[source]

Bases: collections.abc.MutableSequence

DocumentSet is a mutable sequence of Document objects that gives an efficient view of a list of Document. It can be iterated over like a generator, and it can also be modified, counted, indexed, or unioned with another DocumentSet using the + and += operators.
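To illustrate what the MutableSequence base provides, here is a minimal sketch built on the standard library only. `MiniDocSet` is a hypothetical stand-in, not the jina class, and plain dicts stand in for Document. It shows how the five required methods plus `__add__` are enough to get iteration, `len()`, indexing, `+`, and `+=`.

```python
from collections.abc import MutableSequence


class MiniDocSet(MutableSequence):
    """Hypothetical stand-in for DocumentSet; items are plain dicts."""

    def __init__(self, docs=None):
        self._docs = list(docs or [])

    # The five methods MutableSequence requires of a subclass.
    def __getitem__(self, index):
        return self._docs[index]

    def __setitem__(self, index, doc):
        self._docs[index] = doc

    def __delitem__(self, index):
        del self._docs[index]

    def __len__(self):
        return len(self._docs)

    def insert(self, index, doc):
        self._docs.insert(index, doc)

    # '+' returns a new set. '+=' is inherited from
    # MutableSequence.__iadd__, which extends the left operand in place.
    def __add__(self, other):
        return MiniDocSet(list(self) + list(other))


a = MiniDocSet([{"id": "a"}])
b = MiniDocSet([{"id": "b"}, {"id": "c"}])

merged = a + b  # new set; a and b are unchanged at this point
a += b          # a is extended in place
ids = [d["id"] for d in merged]
```

In this sketch `append`, `extend`, `clear`, and `reverse` also come for free from the abstract base class; the real class may override them.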

insert(index, doc)[source]

S.insert(index, value) – insert value before index

Return type

None

append(doc)[source]

S.append(value) – append value to the end of the sequence

Return type

Document

add(doc)[source]

Shortcut to append(). Do not override this method.

Return type

Document

extend(iterable)[source]

S.extend(iterable) – extend sequence by appending elements from the iterable

Return type

None

clear()[source]

S.clear() – remove all items from S

Return type

None

reverse()[source]

Reverse the sequence in place.

build()[source]

Build a doc_id to doc mapping so that a Document can later be looked up using its doc_id as a string key.
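The idea of building an id lookup once and then indexing by string key could be sketched as follows. `KeyedDocs`, its `_id_to_pos` attribute, and the dict-based documents are all hypothetical; the actual internals of DocumentSet are not described here.

```python
class KeyedDocs:
    """Hypothetical container: list storage plus an id -> position map."""

    def __init__(self, docs):
        self._docs = list(docs)
        self._id_to_pos = {}

    def build(self):
        # Rebuild the lookup table from scratch from the current contents.
        self._id_to_pos = {d["id"]: i for i, d in enumerate(self._docs)}

    def __getitem__(self, key):
        # String keys go through the id map; anything else is a position.
        if isinstance(key, str):
            return self._docs[self._id_to_pos[key]]
        return self._docs[key]


docs = KeyedDocs([{"id": "x1", "text": "hello"}, {"id": "x2", "text": "world"}])
docs.build()
found = docs["x2"]
```

A map built this way is a snapshot: in this sketch, any later insertion or removal would require calling `build()` again before looking up by id.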

sort(*args, **kwargs)[source]

traverse(traversal_paths, callback_fn, *args, **kwargs)[source]

property all_embeddings

Return all embeddings from every document in this set as an ndarray.

Returns

a tuple of the embeddings as an np.ndarray, the corresponding documents in a DocumentSet, and the documents that have no embedding in a DocumentSet.

Return type

Tuple[ndarray, DocumentSet, DocumentSet]
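The three-part return value can be pictured with a small dependency-free sketch. It is not the jina implementation: `split_by_embedding` is a hypothetical helper, documents are plain dicts, and a list of rows stands in for the ndarray. The point is that row i of the first element belongs to item i of the second, while documents without an embedding are collected separately.

```python
def split_by_embedding(docs):
    """Return (embeddings, docs_with_embedding, docs_without_embedding)."""
    embeddings, with_emb, without_emb = [], [], []
    for doc in docs:
        emb = doc.get("embedding")
        if emb is None:
            without_emb.append(doc)
        else:
            # Keep rows and documents aligned by appending them together.
            embeddings.append(emb)
            with_emb.append(doc)
    return embeddings, with_emb, without_emb


docs = [
    {"id": "a", "embedding": [0.1, 0.2]},
    {"id": "b"},
    {"id": "c", "embedding": [0.3, 0.4]},
]
matrix, kept, skipped = split_by_embedding(docs)
```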

property all_contents

Return all contents from every document in this set as an ndarray.

Return type

Tuple[ndarray, DocumentSet, DocumentSet]

Returns

a tuple of the contents as an np.ndarray, the corresponding documents in a DocumentSet, and the documents that have no contents in a DocumentSet.

new()[source]

Create a new empty Document, append it to the end of the set, and return it.

Return type

Document
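Because append(), add(), and new() are all documented as returning a Document, a caller can create an item and fill it in afterwards. The sketch below shows that pattern under the assumption that the returned object is the one stored in the set. `GrowableDocs` is hypothetical and uses plain dicts in place of Document.

```python
class GrowableDocs:
    """Hypothetical container whose append-style methods return the item."""

    def __init__(self):
        self._docs = []

    def __len__(self):
        return len(self._docs)

    def __getitem__(self, index):
        return self._docs[index]

    def append(self, doc):
        self._docs.append(doc)
        return doc  # hand the stored item back to the caller

    def add(self, doc):
        # Thin alias for append().
        return self.append(doc)

    def new(self):
        # Create an empty item, store it, and return it for filling in.
        return self.append({})


docs = GrowableDocs()
d = docs.new()
d["text"] = "hi"  # mutates the item already stored at the end of the set
```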