Metrics
Every backup pod exposes metrics about the current state of the backup.
The metrics are served in the Prometheus format on the /metrics endpoint of each pod, on port 9000.
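
As a minimal sketch of reading this endpoint, the Python snippet below fetches the Prometheus text output from a pod and parses the sample lines into a dictionary. The `host` argument is a placeholder for however the pod is reached (for example a port-forward), and the parser is a deliberate simplification: it skips comment lines and does not handle optional trailing timestamps. A real Prometheus client library would be the more robust choice.

```python
from urllib.request import urlopen


def parse_prometheus_text(text: str) -> dict:
    """Parse Prometheus text exposition into {sample_name_with_labels: value}.

    Simplified sketch: '# HELP' / '# TYPE' comments and blank lines are
    skipped, and optional trailing timestamps are not supported.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line.
        name, _, value = line.rpartition(" ")
        samples[name.strip()] = float(value)
    return samples


def fetch_metrics(host: str, port: int = 9000) -> dict:
    """Fetch and parse /metrics from a backup pod (host is a placeholder)."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=5) as response:
        return parse_prometheus_text(response.read().decode("utf-8"))
```

With a port-forward to a backup pod in place, `fetch_metrics("localhost")` would return the current samples keyed by metric name (including any labels).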
Available metrics
| Name | Type | Description |
|---|---|---|
| backup_sink_records_size | Counter | Total size of backed up records (key + headers + payload) |
| backup_sink_bytes_count | Counter | Total number of bytes written to the storage sink (affected by chosen compression algorithm) |
| backup_sink_records_count | Counter | Total number of records written to the storage sink |
| backup_sink_records_filtered_count | Counter | Total number of records filtered out (user-defined filters, plugins, …) |
| backup_topic_progress | Gauge | Progress of the backup per topic |
| backup_topic_partition_progress | Gauge | Progress of the backup per topic partition |
| backup_sink_jobs_count | Counter | Number of backup jobs that are running |
| backup_sink_jobs_idleratio | Gauge | Indicator of time spent idle (0=fully busy, 1=fully idle) |
| jobs_state | Gauge | Number of jobs in a certain status (Created, Paused, Failed, Running, Done, Backoff) |
| backup_sink_partitions_count | Counter | Number of kannika partitions (.kan files) written to the storage |
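
Because backup_sink_bytes_count reflects the chosen compression algorithm while backup_sink_records_size does not, the two counters can be combined into a rough compression ratio. The sketch below is an illustration only: it assumes both counters are expressed in bytes and cover the same set of records, and it takes a plain dictionary of unlabelled sample values as input (real samples may carry labels).

```python
from typing import Optional


def compression_ratio(samples: dict) -> Optional[float]:
    """Return bytes written to the sink per byte of record data.

    Values below 1.0 suggest compression is saving space. Returns None
    when no record data has been counted yet, to avoid dividing by zero.
    Assumption: both counters are in bytes and cover the same records.
    """
    records_size = samples.get("backup_sink_records_size", 0.0)
    bytes_written = samples.get("backup_sink_bytes_count", 0.0)
    if records_size <= 0:
        return None
    return bytes_written / records_size
```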
Update interval
The metrics are updated every second.
Push metrics
Backup jobs push their metrics whenever a topic backup is started or stopped, and also periodically.
This is mainly intended for the API, so that it can store historical data for when the backup is paused.
Metrics are pushed to the event-gateway Kubernetes service, which listens on port 8082.
The backup jobs will post their metrics to that service for each topic using the following paths:

    /namespaces/{namespace}/backups/{backupName}/{backupUuid}/topics/{topicName}/started
    /namespaces/{namespace}/backups/{backupName}/{backupUuid}/topics/{topicName}/stopped
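
To make the path templates above concrete, the following Python sketch fills them in for a given topic. The function name and the example values are hypothetical, and percent-encoding each segment is an assumption made here for safety, not documented behavior of the backup jobs. The request payload is not described in this section, so it is left out.

```python
from urllib.parse import quote


def topic_event_path(namespace: str, backup_name: str, backup_uuid: str,
                     topic_name: str, event: str) -> str:
    """Build the event-gateway path for a topic 'started' or 'stopped' event."""
    if event not in ("started", "stopped"):
        raise ValueError(f"unsupported event: {event!r}")
    # Assumption: path segments are percent-encoded before being joined.
    ns, name, uuid, topic = (
        quote(segment, safe="")
        for segment in (namespace, backup_name, backup_uuid, topic_name)
    )
    return f"/namespaces/{ns}/backups/{name}/{uuid}/topics/{topic}/{event}"
```

For example, `topic_event_path("my-namespace", "my-backup", "1234", "orders", "started")` yields `/namespaces/my-namespace/backups/my-backup/1234/topics/orders/started`, which would then be appended to the address of the event-gateway service on port 8082.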
Configuration
By default, pushing metrics is configured and enabled when using the main Helm chart.
It can be disabled by setting the operator.config.eventGateway.enabled flag to false.
The following properties can be used to change the configuration of the event gateway service, which points to the API:

    api:
      eventGateway:
        enabled: true
        service:
          name: "event-gateway"
          type: ClusterIP
          port: 8082 # port that the service will expose
          targetPort: 8082 # container port
          annotations: { }
          labels: { }

To change where backups push their metrics,
update the operator.config.eventGateway.service properties:

    operator:
      config:
        eventGateway:
          enabled: true
          service:
            name: event-gateway
            namespace: ""
            port: 8082