We have a Youtube video explaining this demo in detail.
This demo requires extra dependencies. Please install them via:
pip install "jina[demo]"
A multimodal-document contains multiple data types, e.g. a PDF document often contains figures and text. Jina lets you build a multimodal search solution in just minutes. To run our minimum multimodal document search demo:
jina hello multimodal
This downloads the people image dataset and tells Jina to index 2,000 image-caption pairs with MobileNet and MPNet. The indexing process takes about three minutes on CPU. Then it opens a web page where you can query multimodal documents. We have prepared a YouTube tutorial to walk you through this demo.