jina.types.document.multimodal
-
class jina.types.document.multimodal.MultimodalDocument(document=None, chunks=None, modality_content_map=None, copy=False, **kwargs)
Bases: jina.types.document.Document
MultimodalDocument is a data type built on the Jina primitive data type Document. It shares the same methods and properties as Document, while focusing on modality at the chunk level.
Warning
It assumes that every chunk of a document belongs to a different modality.
It assumes that every MultimodalDocument has at least two chunks.
- Parameters
document (Optional[DocumentSourceType]) – the document to construct from. If bytes is given, deserialize a DocumentProto from it; if dict is given, parse a DocumentProto from it; if str is given, treat it as a JSON string and parse a DocumentProto from it; finally, a DocumentProto can be given directly, in which case copy determines whether a view or a copy is built from it.
chunks (Optional[Sequence[Document]]) – the chunks to initialize the multimodal document with. Expected to receive a list of Document objects with different modalities.
copy (bool) – when document is given as a DocumentProto object, build a view (i.e. weak reference) from it or a deep copy from it.
kwargs – other parameters to be set
modality_content_map – a Python dict whose keys are the modalities and whose values are the content of the Document.
Warning
Building a MultimodalDocument from modality_content_map expects each value of the dictionary to be a Document.content.
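As a rough illustration of the modality-to-content dict described above, the sketch below expands such a dict into one chunk per modality. ChunkSketch and chunks_from_map are hypothetical stand-ins written in plain Python for this example; they are not part of Jina, and the real constructor may do more than this.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ChunkSketch:
    """Hypothetical stand-in for a chunk Document; not a Jina class."""
    modality: str
    content: Any


def chunks_from_map(modality_content_map: Dict[str, Any]) -> List[ChunkSketch]:
    # One chunk per (modality, content) pair, in dict insertion order.
    return [
        ChunkSketch(modality=modality, content=content)
        for modality, content in modality_content_map.items()
    ]


# The dict values are the content itself, not wrapped documents.
chunks = chunks_from_map({"text": "a red sneaker", "image": "sneaker.png"})
```

Because dict keys are unique, building chunks this way cannot produce two chunks with the same modality.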
-
property is_valid
A valid MultimodalDocument should meet the following requirements:
The document consists of at least 2 chunks.
The number of modalities is identical to the number of chunks.
- Return type
bool
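The validity rule can be pictured with a small plain-Python check. This is a sketch under one assumption: it reads the requirements together with the class-level warning that every chunk belongs to a different modality, so a document is treated as valid when it has at least two chunks and no modality repeats. is_valid_sketch is a hypothetical helper, not Jina's implementation.

```python
from typing import Sequence


def is_valid_sketch(chunk_modalities: Sequence[str]) -> bool:
    # At least two chunks, and every chunk has its own modality.
    distinct = set(chunk_modalities)
    return len(chunk_modalities) >= 2 and len(distinct) == len(chunk_modalities)


two_modalities = is_valid_sketch(["text", "image"])
single_chunk = is_valid_sketch(["text"])
repeated_modality = is_valid_sketch(["text", "text"])
```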
-
property modality_content_map
Get the mapping of modality to content. The mapping is represented as a dict: the keys are the modalities of the chunks and the values are the corresponding content of the chunks.
- Return type
Dict
- Returns
the mapping of modality and content extracted from chunks.
-
property modalities
Get all modalities of the MultimodalDocument.
- Return type
List[str]
- Returns
List of modalities extracted from chunks of the document.
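To make the relationship between the modality_content_map and modalities properties concrete, here is a minimal plain-Python sketch. It assumes each chunk can be modelled as a (modality, content) tuple; the tuple model and both function names are hypothetical and only illustrate the documented return shapes, not how Jina computes them.

```python
from typing import Any, Dict, List, Tuple

# Each chunk is modelled as a (modality, content) tuple. This is an
# assumption for illustration, not Jina's Document type.
ChunkPair = Tuple[str, Any]


def modality_content_map_sketch(chunks: List[ChunkPair]) -> Dict[str, Any]:
    # Keys are the chunk modalities, values are the matching chunk content.
    return {modality: content for modality, content in chunks}


def modalities_sketch(chunks: List[ChunkPair]) -> List[str]:
    # With one modality per chunk, the keys of the map are the modalities.
    return list(modality_content_map_sketch(chunks))


example_chunks = [("text", "a red sneaker"), ("image", "sneaker.png")]
mapping = modality_content_map_sketch(example_chunks)
names = modalities_sketch(example_chunks)
```

If two chunks shared a modality, the dict would keep only the later content, which is one way to see why the class expects a distinct modality per chunk.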