jina.executors.indexers.cache

Indexer for caching.

class jina.executors.indexers.cache.BaseCache(*args, **kwargs)[source]

Bases: jina.executors.indexers.BaseKVIndexer

Base class of the cache, inherited from BaseKVIndexer.

The difference between a cache and a BaseKVIndexer is that the cache releases the handler_mutex, which allows querying while indexing.

Parameters
  • args – additional positional arguments which are just used for the parent initialization

  • kwargs – additional key value arguments which are just used for the parent initialization

post_init()[source]

A cache needs to release the handler mutex so that reading and writing can happen at the same time.
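The effect of releasing the mutex can be pictured with a toy model. This is a sketch and not Jina's implementation: the classes ToyKVIndexer and ToyCache, the write_handler and query_handler properties, and the exact meaning given to the handler_mutex flag are all assumptions made for illustration.

```python
class ToyKVIndexer:
    """Toy model (assumption): while handler_mutex is on, only one kind of handler is served."""

    def __init__(self):
        self.handler_mutex = True
        self._store = {}      # shared key-value storage
        self._active = None   # kind of handler that was handed out first

    def _handler(self, kind):
        # With the mutex on, refuse a handler of a different kind than the active one.
        if self.handler_mutex and self._active not in (None, kind):
            return None
        self._active = kind
        return self._store

    @property
    def write_handler(self):
        return self._handler("write")

    @property
    def query_handler(self):
        return self._handler("query")


class ToyCache(ToyKVIndexer):
    def post_init(self):
        # Releasing the mutex lets write and query handlers coexist.
        self.handler_mutex = False


indexer = ToyKVIndexer()
indexer.write_handler["doc-1"] = b"value"
blocked = indexer.query_handler           # refused while indexing

cache = ToyCache()
cache.post_init()
cache.write_handler["doc-1"] = b"value"
readable = cache.query_handler["doc-1"]   # query-while-indexing works
```

In this model the plain indexer refuses the query handler once writing has started, while the cache serves both.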

class jina.executors.indexers.cache.DocCache(index_filename=None, fields=None, *args, **kwargs)[source]

Bases: jina.executors.indexers.cache.BaseCache

A key-value indexer that specializes in caching.

Serializes the cache to two files: one for the ids and one for the cached field values. If fields=["id"], the second file is redundant. The class optimizes for this case so that no duplicate data is stored.

Order of fields does NOT affect the caching.
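One way to make a cache key independent of field order is to sort the field names before combining their values. The sketch below illustrates that idea only. The helper cache_key, the use of SHA-256, and the dict-based document are assumptions and do not describe how DocCache computes its values.

```python
import hashlib
from typing import Iterable, Mapping


def cache_key(doc: Mapping[str, str], fields: Iterable[str]) -> str:
    """Hypothetical helper: build one cache key from the selected fields."""
    hasher = hashlib.sha256()
    # Sorting the field names makes the key independent of the order in `fields`.
    for name in sorted(fields):
        hasher.update(name.encode("utf-8"))
        hasher.update(b"\x00")  # separator so adjacent parts cannot run together
        hasher.update(str(doc[name]).encode("utf-8"))
        hasher.update(b"\x00")
    return hasher.hexdigest()


doc = {"id": "doc-1", "content": "hello"}
key_a = cache_key(doc, ("id", "content"))
key_b = cache_key(doc, ("content", "id"))
key_id_only = cache_key(doc, ("id",))
print(key_a == key_b)
```

Both orderings of the same fields give the same key, while a different set of fields gives a different one.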

Parameters
  • index_filename (Optional[str]) – file name for storing the cache data

  • fields (Union[str, Tuple[str], None]) – fields to cache on (of Document)

  • args – additional positional arguments which are just used for the parent initialization

  • kwargs – additional key value arguments which are just used for the parent initialization

class CacheHandler(path, logger)[source]

Bases: object

A handler for loading and serializing the in-memory cache of the DocCache.

Parameters
  • path – Path to the file from which to build the actual paths.

  • logger – Instance of logger.

close()[source]

Flushes the in-memory cache to pickle files.
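The load-and-flush cycle described above can be sketched with the standard pickle module. ToyCacheHandler is a stand-in written for illustration, not Jina's CacheHandler. The ".ids" and ".cache" suffixes and the two attribute names are assumptions.

```python
import logging
import pickle
import tempfile
from pathlib import Path


class ToyCacheHandler:
    """Illustrative stand-in for a cache handler (assumed layout and names)."""

    def __init__(self, path: str, logger: logging.Logger):
        self.logger = logger
        # Assumed naming scheme: two files derived from the base path.
        self.ids_path = Path(path + ".ids")
        self.cache_path = Path(path + ".cache")
        self.id_to_cache_val = self._load(self.ids_path)
        self.cache_val_to_id = self._load(self.cache_path)

    def _load(self, file: Path) -> dict:
        # Restore a mapping from disk, or start empty if nothing was saved yet.
        if file.exists():
            with file.open("rb") as fp:
                return pickle.load(fp)
        self.logger.debug("no cache file at %s, starting empty", file)
        return {}

    def close(self) -> None:
        """Flush both in-memory mappings to their pickle files."""
        for file, data in ((self.ids_path, self.id_to_cache_val),
                           (self.cache_path, self.cache_val_to_id)):
            with file.open("wb") as fp:
                pickle.dump(data, fp)


base = str(Path(tempfile.mkdtemp()) / "cache")
handler = ToyCacheHandler(base, logging.getLogger("toy"))
handler.id_to_cache_val["doc-1"] = b"hash-1"
handler.cache_val_to_id[b"hash-1"] = "doc-1"
handler.close()

# A new handler built on the same path sees the flushed data.
reloaded = ToyCacheHandler(base, logging.getLogger("toy"))
```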

default_fields = ('id',)

add(keys, values, *args, **kwargs)[source]

Add documents to the cache.

Parameters
  • keys (Iterable[str]) – document ids to be added

  • values (Iterable[bytes]) – document cache values to be added

  • args – not used

  • kwargs – not used

Return type

None

query(key, *args, **kwargs)[source]

Check whether the data exists in the cache.

Parameters
  • key (str) – the value that we cached by (combination of the Document fields)

  • args – not used

  • kwargs – not used

Return type

bool

Returns

status (whether the data exists in the cache)

update(keys, values, *args, **kwargs)[source]

Update cached documents.

Parameters
  • keys (Iterable[str]) – list of Document.id

  • values (Iterable[bytes]) – list of values (combination of the Document fields)

  • args – not used

  • kwargs – not used

Return type

None

delete(keys, *args, **kwargs)[source]

Delete documents from the cache.

Parameters
  • keys (Iterable[str]) – list of Document.id

  • args – not used

  • kwargs – not used

Return type

None
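The four operations above (add, query, update, delete) can be illustrated with a small in-memory model. ToyDocCache is hypothetical and is not Jina's DocCache. It keeps two dictionaries, skips unknown ids on update, and takes the queried value as bytes to match the values passed to add. All of these choices are assumptions.

```python
from typing import Dict, Iterable


class ToyDocCache:
    """Illustrative in-memory cache; not Jina's DocCache."""

    def __init__(self) -> None:
        self.id_to_value: Dict[str, bytes] = {}
        self.value_to_id: Dict[bytes, str] = {}

    def add(self, keys: Iterable[str], values: Iterable[bytes]) -> None:
        # Record each document id together with its cached value.
        for key, value in zip(keys, values):
            self.id_to_value[key] = value
            self.value_to_id[value] = key

    def query(self, key: bytes) -> bool:
        # The lookup is by cached value, not by document id.
        return key in self.value_to_id

    def update(self, keys: Iterable[str], values: Iterable[bytes]) -> None:
        for key, value in zip(keys, values):
            if key not in self.id_to_value:
                continue  # assumption: unknown ids are skipped
            old_value = self.id_to_value[key]
            self.value_to_id.pop(old_value, None)
            self.id_to_value[key] = value
            self.value_to_id[value] = key

    def delete(self, keys: Iterable[str]) -> None:
        for key in keys:
            value = self.id_to_value.pop(key, None)
            if value is not None:
                self.value_to_id.pop(value, None)


cache = ToyDocCache()
cache.add(["doc-1", "doc-2"], [b"v1", b"v2"])
seen_before = cache.query(b"v1")

cache.update(["doc-1"], [b"v1-new"])
stale = cache.query(b"v1")

cache.delete(["doc-2"])
gone = cache.query(b"v2")
```

After the update, the old value no longer counts as cached, and after the delete, the removed document's value is gone as well.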

get_add_handler()[source]

Get the CacheHandler.

get_query_handler()[source]

Get the CacheHandler.

Return type

CacheHandler

get_create_handler()[source]

Get the CacheHandler.