Instrumentation#

Instrumentation consists of OpenTelemetry Tracing and Metrics. Each feature can be enabled independently, and they allow you to collect request-level and application-level metrics for analyzing an Executor’s real-time behavior.
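For example, tracing and metrics can be switched on independently when starting a Flow. The sketch below is an assumption-laden illustration: the flag names follow the Flow Instrumentation section, and http://localhost:4317 stands in for whatever OpenTelemetry collector you run.

from jina import Flow

# Hedged sketch: enable tracing and metrics independently on a Flow.
# MyExecutor stands for any Executor, such as the ones defined later in this section.
f = Flow(
    tracing=True,
    traces_exporter_host='http://localhost',  # assumption: local OpenTelemetry collector
    traces_exporter_port=4317,
    metrics=True,
    metrics_exporter_host='http://localhost',
    metrics_exporter_port=4317,
).add(uses=MyExecutor)

with f:
    f.block()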

Full details on Instrumentation

This section describes custom tracing spans. To use the Executor’s default tracing, refer to Flow Instrumentation.

Hint

Read more on setting up an OpenTelemetry collector backend in the OpenTelemetry Setup section.

Caution

Prometheus-only metrics collection will soon be deprecated. Refer to the Monitoring Executor section for this deprecated setup.

Tracing#

Any method that uses the requests decorator adds a default tracing span for the defined operation. In addition, the operation's span context is propagated to the method so that you can create further user-defined child spans within it.

You can create custom spans to observe the operation's individual steps or record details and attributes with finer granularity. When tracing is enabled, Jina provides the OpenTelemetry Tracer implementation as an Executor class attribute that you can use to create new child spans. The tracing_context method argument holds the parent span context, from which you can create new spans to trace the desired operations inside the method.

If tracing is enabled, each Executor exports its traces to the configured exporter host via the Span Exporter. The backend combines these traces for visualization and alerting.
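As a sketch (assuming an OTLP-compatible collector listening locally on port 4317, and flag names matching the Flow Instrumentation section), the exporter target can also be configured on a single Deployment:

from jina import Deployment

# Hedged sketch: export this Executor's spans via the configured Span Exporter.
dep = Deployment(
    uses=MyExecutor,
    tracing=True,
    traces_exporter_host='http://localhost',  # assumption: local OpenTelemetry collector
    traces_exporter_port=4317,
)

with dep:
    dep.block()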

Create custom traces#

A request method is the public method that exposes an operation as an API. Depending on complexity, the method can be composed of different sub-operations that are required to build the final response.

You can record/observe each internal step (along with its global or request-specific attributes) to give a finer-grained view of the operation at the request level. This helps identify bottlenecks and isolate request patterns that cause service degradation or errors.

You can use the self.tracer class attribute to create a new child span using the tracing_context method argument:

from opentelemetry.trace import Status, StatusCode

from jina import Executor, requests
from docarray import DocList
from docarray.documents import TextDoc


class MyExecutor(Executor):
    @requests
    def foo(self, docs: DocList[TextDoc], tracing_context, **kwargs) -> DocList[TextDoc]:
        with self.tracer.start_as_current_span(
            'process_docs', context=tracing_context
        ) as process_span:
            process_span.set_attribute('sampling_rate', 0.01)
            docs = process(docs)
            with self.tracer.start_as_current_span('update_docs') as update_span:
                try:
                    update_span.set_attribute('len_updated_docs', len(docs))
                    docs = update(docs)
                except Exception as ex:
                    update_span.set_status(Status(StatusCode.ERROR))
                    update_span.record_exception(ex)
        return docs

The above pieces of instrumentation generate three spans:

  1. Default span with name foo for the overall method.

  2. process_span that measures the process and update sub-operations along with a sampling_rate attribute that is either a constant or specific to the request/operation.

  3. update_span that measures the update operation along with any exceptions that might arise during it. The exception is recorded and marked on the update_span span. Since the exception is swallowed, the request succeeds and the parent spans report success.

Respect OpenTelemetry Tracing semantic conventions

You should respect OpenTelemetry Tracing semantic conventions.
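As a rough illustration (rpc.method is an attribute defined by the conventions; the myexec.* key is a hypothetical custom attribute), prefer lowercase, dot-namespaced attribute names over ad-hoc keys:

with self.tracer.start_as_current_span('process_docs', context=tracing_context) as span:
    # Attribute name taken from the OpenTelemetry semantic conventions
    span.set_attribute('rpc.method', 'foo')
    # Custom attributes namespaced under your own prefix (hypothetical key)
    span.set_attribute('myexec.batch_size', len(docs))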

Hint

If you are not sure whether tracing is enabled in your environment, check that self.tracer exists before using it. If tracing is disabled, self.tracer is None.
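A minimal sketch of such a guard, reusing the hypothetical process helper from the example above:

from jina import Executor, requests
from docarray import DocList
from docarray.documents import TextDoc


class MyExecutor(Executor):
    @requests
    def foo(self, docs: DocList[TextDoc], tracing_context, **kwargs) -> DocList[TextDoc]:
        if self.tracer:
            # Tracing is enabled: wrap the work in a custom child span.
            with self.tracer.start_as_current_span('process_docs', context=tracing_context):
                docs = process(docs)
        else:
            # Tracing is disabled: self.tracer is None, so run the work directly.
            docs = process(docs)
        return docs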

Metrics#

Hint

Prometheus-only metrics collection will soon be deprecated. Refer to the Monitoring Executor section for the deprecated setup.

Any method that uses the requests decorator is monitored and creates a histogram which tracks the method’s execution time.

This section documents adding custom monitoring to the Executor with the OpenTelemetry Metrics API.

Custom metrics are useful to monitor each sub-part of your Executor(s). Jina lets you leverage the Meter to define useful metrics for each of your Executors. We also provide a convenient wrapper, monitor(), which lets you monitor your Executor's sub-methods.

When metrics are enabled, each Executor exposes its own metrics via the Metric Exporter.

Define custom metrics#

Sometimes monitoring the encoding method alone is not enough: you need to break it into multiple parts and monitor them one by one.

This is useful if your encoding phase is composed of two tasks, like image processing and image embedding. By using custom metrics on these two tasks you can identify potential bottlenecks.

Overall, adding custom metrics gives you full flexibility when monitoring your Executor.

Use context manager#

Use self.monitor to monitor your function’s internal blocks:

from jina import Executor, requests
from docarray import DocList
from docarray.documents import TextDoc


class MyExecutor(Executor):
    @requests
    def foo(self, docs: DocList[TextDoc], **kwargs) -> DocList[TextDoc]:
        with self.monitor('processing_seconds', 'Time processing my document'):
            docs = process(docs)
        print(docs.texts)
        with self.monitor('update_seconds', 'Time updating my document'):
            docs = update(docs)
        return docs

Use the @monitor decorator#

Add custom monitoring to a method with the monitor() decorator:

from jina import Executor, monitor


class MyExecutor(Executor):
    @monitor()
    def my_method(self):
        ...

This creates a Histogram jina_my_method_seconds which tracks the execution time of my_method.

By default, the name and documentation of the metric created by monitor() are auto-generated based on the function’s name. To set a custom name:

@monitor(
    name='my_custom_metrics_seconds', documentation='This is my custom documentation'
)
def method(self):
    ...

Respect OpenTelemetry Metrics semantic conventions

You should respect OpenTelemetry Metrics semantic conventions.
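For example (a hedged sketch using the self.meter attribute introduced in the next section; the instrument name and the elapsed_seconds value are hypothetical), record durations in seconds and declare the unit on the instrument:

# Record a duration in seconds and declare the unit on the instrument.
duration_histogram = self.meter.create_histogram(
    name='my_preprocessing_duration',
    unit='s',
    description='Time spent preprocessing one request',
)
duration_histogram.record(elapsed_seconds)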

Use OpenTelemetry Meter#

Under the hood, the Python OpenTelemetry Metrics API powers the Executor's metrics feature. The monitor() decorator is convenient for monitoring an Executor's sub-methods, but if you need more flexibility, use the self.meter Executor class attribute to create any supported instrument:

from jina import requests, Executor
from docarray import DocList
from docarray.documents import TextDoc


class MyExecutor(Executor):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.counter = self.meter.create_counter(name='my_count', description='my count')

    @requests
    def encode(self, docs: DocList[TextDoc], **kwargs) -> DocList[TextDoc]:
        self.counter.add(len(docs))

This creates a Counter that you can use to incrementally track the number of Documents received in each request.

Hint

If you are not sure whether metrics are enabled in your environment, check that self.meter and self.counter exist before using them. If metrics are disabled, self.meter is None.
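A minimal sketch of such a guard:

from jina import Executor, requests
from docarray import DocList
from docarray.documents import TextDoc


class MyExecutor(Executor):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # self.meter is None when metrics are disabled, so guard the instrument creation.
        self.counter = (
            self.meter.create_counter(name='my_count', description='my count')
            if self.meter
            else None
        )

    @requests
    def encode(self, docs: DocList[TextDoc], **kwargs) -> DocList[TextDoc]:
        if self.counter:
            self.counter.add(len(docs))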

Example#

from jina import requests, Executor
from docarray import DocList
from docarray.documents.legacy import LegacyDocument


class MyExecutor(Executor):
    def preprocessing(self, docs: DocList[LegacyDocument]):
        ...

    def model_inference(self, tensor):
        ...

    @requests
    def encode(self, docs: DocList[LegacyDocument], **kwargs) -> DocList[LegacyDocument]:
        docs.tensors = self.preprocessing(docs)
        docs.embedding = self.model_inference(docs.tensors)

The encode function is composed of two sub-functions.

  • preprocessing takes raw bytes from a DocList and puts them into a PyTorch tensor.

  • model_inference calls the forward function of a deep learning model.

By default, only the encode function is monitored. To monitor the preprocessing and model_inference sub-functions as well, add the monitor() decorator to each of them:

from jina import Executor, requests, monitor
from docarray import DocList
from docarray.documents.legacy import LegacyDocument

class MyExecutor(Executor):

    @monitor()
    def preprocessing(self, docs: DocList[LegacyDocument]):
        ...

    @monitor()
    def model_inference(self, tensor):
        ...

    @requests
    def encode(self, docs: DocList[LegacyDocument], **kwargs) -> DocList[LegacyDocument]:
        docs.tensors = self.preprocessing(docs)
        docs.embedding = self.model_inference(docs.tensors)
Alternatively, achieve the same monitoring with the self.monitor context manager inside the request method:

from jina import Executor, requests
from docarray import DocList
from docarray.documents.legacy import LegacyDocument

class MyExecutor(Executor):
    def preprocessing(self, docs: DocList[LegacyDocument]):
        ...

    def model_inference(self, tensor):
        ...

    @requests
    def encode(self, docs: DocList[LegacyDocument], **kwargs) -> DocList[LegacyDocument]:
        with self.monitor('preprocessing_seconds', 'Time preprocessing the requests'):
            docs.tensors = self.preprocessing(docs)
        with self.monitor('model_inference_seconds', 'Time doing inference on the requests'):
            docs.embedding = self.model_inference(docs.tensors)

See also#