Instrumentation consists of OpenTelemetry Tracing and Metrics. Each feature can be enabled independently, and they allow you to collect request-level and application-level metrics for analyzing an Executor’s real-time behavior.
Full details on Instrumentation
This section describes custom tracing spans. To use the Executor’s default tracing, refer to Flow Instrumentation.
Read more on setting up an OpenTelemetry collector backend in the OpenTelemetry Setup section.
Prometheus-only based metrics collection will soon be deprecated. Refer to Monitoring Executor for this deprecated setup.
Any method that uses the
requests decorator adds a
default tracing span for the defined operation. In addition, the operation span context
is propagated to the method for creating further user-defined child spans within the
You can create custom spans to observe the operation’s individual steps or record details and attributes with finer granularity. When tracing is enabled, Jina provides the OpenTelemetry Tracer implementation as an Executor class attribute that you can use to create new child spans. The
tracing_context method argument contains the parent span context using which a new span can be created to trace the desired operation in the method.
If tracing is enabled, each Executor exports its traces to the configured exporter host via the Span Exporter. The backend combines these traces for visualization and alerting.
Create custom traces#
request method is the public method that exposes an operation as an API. Depending on complexity, the method can be composed of different sub-operations that are required to build the final response.
You can record/observe each internal step (along with its global or request-specific attributes) to give a finer-grained view of the operation at the request level. This helps identify bottlenecks and isolate request patterns that cause service degradation or errors.
You can use the
self.tracer class attribute to create a new child span using the
tracing_context method argument:
from jina import Executor, requests class MyExecutor(Executor): @requests def foo(self, docs, tracing_context, **kwargs): with self.tracer.start_as_current_span( 'process_docs', context=tracing_context ) as process_span: process_span.set_attribute('sampling_rate', 0.01) docs = process(docs) with self.tracer.start_as_current_span('update_docs') as update_span: try: update_span.set_attribute('len_updated_docs', len(docs)) docs = update(docs) except Exception as ex: update_span.set_status(Status(StatusCode.ERROR)) update_span.record_exception(ex)
The above pieces of instrumentation generate three spans:
Default span with name
foofor the overall method.
process_spanthat measures the
updatesub-operations along with a
sampling_rateattribute that is either a constant or specific to the request/operation.
update_spanthat measures the
updatedoperation along with any exceptions that might arise during the operation. The exception is recorded and marked on the
update_spanspan. Since the exception is swallowed, the request succeeds with successful parent spans.
Respect OpenTelemetry Tracing semantic conventions
You should respect OpenTelemetry Tracing semantic conventions.
If tracing is not enabled by default or enabled in your environment, check
self.tracer exists before usage. If metrics are disabled then
self.tracer will be
Prometheus-only based metrics collection will be deprecated soon. Refer to Monitoring Executor section for the deprecated setup.
Any method that uses the
requests decorator is monitored and creates a
histogram which tracks the method’s execution time.
This section documents adding custom monitoring to the
Executor with the OpenTelemetry Metrics API.
Custom metrics are useful to monitor each sub-part of your Executor(s). Jina lets you leverage
the Meter to define useful metrics
for each of your Executors. We also provide a convenient wrapper, (
monitor()), which lets you monitor
your Executor’s sub-methods.
When metrics are enabled, each Executor exposes its own metrics via the Metric Exporter.
Define custom metrics#
Sometimes monitoring the
encoding method is not enough - you need to break it up into multiple parts to monitor one by one.
This is useful if your encoding phase is composed of two tasks, like image processing and image embedding. By using custom metrics on these two tasks you can identify potential bottlenecks.
Overall, adding custom metrics gives you full flexibility when monitoring your Executor.
Use context manager#
self.monitor to monitor your function’s internal blocks:
from jina import Executor, requests class MyExecutor(Executor): @requests def foo(self, docs, **kwargs): with self.monitor('processing_seconds', 'Time processing my document'): docs = process(docs) print(docs.texts) with self.monitor('update_seconds', 'Time updates my document'): docs = update(docs)
Add custom monitoring to a method with the
from jina import Executor, monitor class MyExecutor(Executor): @monitor() def my_method(self): ...
This creates a Histogram
jina_my_method_seconds which tracks the execution time of
By default, the name and documentation of the metric created by
monitor() are auto-generated based on the function’s name.
To set a custom name:
@monitor( name='my_custom_metrics_seconds', documentation='This is my custom documentation' ) def method(self): ...
respect OpenTelemetry Metrics semantic conventions
You should respect OpenTelemetry Metrics semantic conventions.
Use OpenTelemetry Meter#
Under the hood, Python OpenTelemetry Metrics API handles the Executor’s metrics feature. The
monitor() decorator is convenient for monitoring an Executor’s sub-methods, but if you need more flexibility, use the
self.meter Executor class attribute to create supported instruments:
from jina import requests, Executor, DocumentArray from prometheus_client import Counter class MyExecutor(Executor): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self.counter = self.meter.create_counter('my_count', 'my count') @requests def encode(self, docs: DocumentArray, **kwargs): self.counter.inc(len(docs))
This creates a Counter that you can use to incrementally track the number of Documents received in each request.
If metrics are not enabled by default or enabled in your environment, you should check
self.counter exists before usage. If metrics are disabled then
self.meter will be
from jina import requests, Executor, DocumentArray class MyExecutor(Executor): def preprocessing(self, docs: DocumentArray): ... def model_inference(self, tensor): ... @requests def encode(self, docs: DocumentArray, **kwargs): docs.tensors = self.preprocessing(docs) docs.embedding = self.model_inference(docs.tensors)
encode function is composed of two sub-functions.
preprocessingtakes raw bytes from a DocumentArray and puts them into a PyTorch tensor.
model inferencecalls the forward function of a deep learning model.
By default, only the
encode function is monitored:
from jina import requests, monitor, Executor, DocumentArray class MyExecutor(Executor): @monitor() def preprocessing(self, docs: DocumentArray): ... @monitor() def model_inference(self, tensor): ... @requests def encode(self, docs: DocumentArray, **kwargs): docs.tensors = self.preprocessing(docs) docs.embedding = self.model_inference(docs.tensors)
from jina import requests, Executor, DocumentArray def preprocessing(self, docs: DocumentArray): ... def model_inference(self, tensor): ... class MyExecutor(Executor): @requests def encode(self, docs: DocumentArray, **kwargs): with self.monitor('preprocessing_seconds', 'Time preprocessing the requests'): docs.tensors = preprocessing(docs) with self.monitor('model_inference_seconds', 'Time doing inference the requests'): docs.embedding = model_inference(docs.tensors)