Monitor
A Jina Flow exposes several core metrics that let you take a deeper look at what is happening inside it. Metrics allow you to, for example, monitor the overall performance of your Flow, detect bottlenecks, or alert your team when a component of your Flow goes down.
Jina Flows expose metrics in the Prometheus format. This is a plain text format that is understandable by both humans and machines. These metrics are intended to be scraped by Prometheus, an industry-standard tool for collecting and monitoring metrics.
To visualize your metrics through a dashboard, we recommend Grafana.
Hint
Depending on your deployment type (local, Kubernetes or JCloud), you need to ensure a running Prometheus/Grafana stack. Check the Flow and monitoring stack deployment section to find out how to provision your monitoring stack.
Enable monitoring
A Flow is composed of several Pods, namely the GatewayRuntime, the Executors, and potentially a HeadRuntime (see the architecture overview for more details). Each of these Pods is its own microservice, and each exposes its own metrics using the Prometheus client. This means that there are as many metrics endpoints as there are Pods in your Flow.
Let's illustrate this with an example. This is how to start a Flow with monitoring enabled via the Python API:
from jina import Flow

with Flow(monitoring=True, port_monitoring=9090).add(
    uses='jinahub://SimpleIndexer', port_monitoring=9091
) as f:
    f.block()
This example shows how to start a Flow with monitoring enabled via YAML, in a flow.yaml file:
jtype: Flow
with:
  monitoring: true
  port_monitoring: 9090
executors:
  - uses: jinahub://SimpleIndexer
    port_monitoring: 9091
jina flow --uses flow.yaml
This Flow will create two Pods, one for the Gateway and one for the SimpleIndexer Executor; therefore it will create two metrics endpoints:

- http://localhost:9090 for the Gateway
- http://localhost:9091 for the SimpleIndexer
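To check that these endpoints are live, you can fetch one directly; Prometheus scrapes the same plain-text payload on a schedule. A minimal sketch, assuming the Flow above is running locally:

```python
import requests  # the third-party HTTP client, not jina's `requests` decorator

# Fetch the Gateway's metrics endpoint; Prometheus performs the same HTTP GET
# once the endpoint is listed as a scrape target.
response = requests.get('http://localhost:9090/')
print(response.text[:500])  # Prometheus plain-text exposition format
```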
Change the default monitoring port
When Jina is used locally, every port_monitoring will be assigned randomly by default (within the range [49152, 65535]). However, we strongly encourage you to set these ports explicitly for the Gateway and for all Executors. Otherwise they will change on every restart and you will have to update your Prometheus configuration file accordingly.
Because each Pod in a Flow exposes its own metrics, the monitoring feature can be used independently on each Pod. This means you are not forced to monitor every Pod of your Flow. For example, you might only be interested in metrics coming from the Gateway, in which case you activate the monitoring only on it. Alternatively, you might only want to monitor a single Executor. Note that monitoring is disabled everywhere by default.
To enable monitoring, you need to pass monitoring=True when creating the Flow.
Flow(monitoring=True).add(...)
Passing monitoring=True when creating the Flow will enable monitoring on all Pods of your Flow.
If you want to enable monitoring only on the Gateway, first enable the feature for the entire Flow, and then disable it for the Executors you are not interested in.
Flow(monitoring=True).add(monitoring=False, ...).add(monitoring=False, ...)
On the other hand, if you want to enable monitoring only on a given Executor, you should do:
Flow().add(...).add(uses=MyExecutor, monitoring=True)
Enable monitoring with replicas and shards
Tip
This section is only relevant if you deploy your Flow natively. When deploying your Flow with Kubernetes or Docker Compose, every port_monitoring will be set to the default: 9090.
To enable monitoring with replicas and shards when deploying natively, you need to pass a comma-separated list of ports via port_monitoring to your Flow.
Example:
from jina import Flow

with Flow(monitoring=True).add(
    uses='jinahub://SimpleIndexer', replicas=2, port_monitoring='9091,9092'
) as f:
    f.block()
This example shows how to start a Flow with monitoring enabled via YAML, in a flow.yaml file:
jtype: Flow
with:
  monitoring: true
executors:
  - uses: jinahub://SimpleIndexer
    replicas: 2
    port_monitoring: '9091,9092'
jina flow --uses flow.yaml
Tip
Monitoring with shards
When using shards, an extra head Pod will be created, and you will need to pass a list of N+1 ports to port_monitoring, N being the number of shards you desire.
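For instance, with two shards you would pass three ports: one for the extra head plus one per shard. A minimal sketch of this setup (which port ends up on which Pod is not specified here):

```python
from jina import Flow

# Two shards create one extra head Pod, so we pass 2 + 1 = 3 monitoring ports.
with Flow(monitoring=True).add(
    uses='jinahub://SimpleIndexer', shards=2, port_monitoring='9091,9092,9093'
) as f:
    f.block()
```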
If you specify fewer port_monitoring values than you have replicas of your Executor (or none at all), the remaining ports will be assigned randomly. It is better practice to specify a port for every replica; otherwise you will have to update your Prometheus configuration each time you restart your application.
Available metrics
A Flow supports several metrics out of the box, in addition to allowing you to define your own custom metrics. Because not all Pods have the same role, they expose different kinds of metrics:
Gateway Pods
| Metrics name | Metrics type | Description |
| --- | --- | --- |
| jina_receiving_request_seconds | Summary | Measures the time elapsed between receiving a request from the client and sending back the response. |
| jina_sending_request_seconds | Summary | Measures the time elapsed between sending a downstream request to an Executor/Head and receiving the response back. |
| jina_number_of_pending_requests | Gauge | Counts the number of pending requests. |
| jina_successful_requests | Counter | Counts the number of successful requests returned by the gateway. |
| jina_failed_requests | Counter | Counts the number of failed requests returned by the gateway. |
See also
You can find more information on the different types of metrics in Prometheus here.
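If you are unfamiliar with these metric types, the following sketch illustrates their semantics with the prometheus_client library directly (the metric names here are made up for the demo):

```python
from prometheus_client import Counter, Gauge, Summary

# Counter: a value that only ever goes up, e.g. total successful requests.
requests_total = Counter('demo_requests', 'Total number of requests')
requests_total.inc()

# Gauge: a value that can go up and down, e.g. currently pending requests.
pending = Gauge('demo_pending_requests', 'Number of in-flight requests')
pending.inc()
pending.dec()

# Summary: tracks the count and sum of observations, e.g. request latency.
latency = Summary('demo_request_latency_seconds', 'Request latency in seconds')
latency.observe(0.42)
```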
Head Pods
| Metrics name | Metrics type | Description |
| --- | --- | --- |
| jina_receiving_request_seconds | Summary | Measures the time elapsed between receiving a request from the gateway and sending back the response. |
| jina_sending_request_seconds | Summary | Measures the time elapsed between sending a downstream request to an Executor and receiving the response back. |
Executor Pods
| Metrics name | Metrics type | Description |
| --- | --- | --- |
| jina_receiving_request_seconds | Summary | Measures the time elapsed between receiving a request from the gateway (or the head) and sending back the response. |
| jina_process_request_seconds | Summary | Measures the time spent calling the requested method. |
| jina_document_processed | Counter | Counts the number of Documents processed by an Executor. |
| jina_successful_requests | Counter | Total count of successful requests returned by the Executor across all endpoints. |
| jina_failed_requests | Counter | Total count of failed requests returned by the Executor across all endpoints. |
| jina_request_size_bytes | Summary | Measures the size of the requests in bytes. |
See also
Beyond monitoring every endpoint of an Executor, you can define custom metrics for your Executor.
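As a taste of what this looks like, here is a minimal sketch that registers a custom Summary alongside the built-in metrics. It assumes your Jina version exposes the Executor's Prometheus registry as self.runtime_args.metrics_registry; check the custom metrics documentation for the exact API in your version:

```python
from jina import Executor, requests, DocumentArray
from prometheus_client import Summary


class MyExecutor(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Assumption: when monitoring is enabled, the Executor's Prometheus
        # registry is available as self.runtime_args.metrics_registry.
        registry = getattr(self.runtime_args, 'metrics_registry', None)
        self.text_length = Summary(
            'text_length_chars',  # hypothetical custom metric name
            'Length of the text of the processed Documents',
            registry=registry,
        ) if registry else None

    @requests
    def process(self, docs: DocumentArray, **kwargs):
        for doc in docs:
            if self.text_length is not None:
                self.text_length.observe(len(doc.text))
```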
Hint
jina_receiving_request_seconds is different from jina_process_request_seconds: the former includes the gRPC communication overhead, whereas the latter only measures the time spent calling the function.