Kafka EventHub

apiVersion: kannika.io/v1alpha
kind: EventHub
metadata:
  name: production-kafka-cluster
spec:
  kafka:
    properties:
      bootstrap.servers: "broker1:9092,broker2:9092" # Defines URIs for the bootstrap servers
      compression.codec: "zstd" # Enables compression when used in a Restore
  description: "Production EventHub" # Add an optional description here

Under the hood, Armory uses the rdkafka library to connect to Kafka-like clusters. Therefore, many additional properties can be used to configure the connection to the cluster.

Here are some properties that might be of interest:

  • acks: affects the reliability of a Restore.
  • queued.max.messages.kbytes: influences the memory usage and performance of a Backup job.
  • queue.buffering.max.kbytes: influences the memory usage and performance of a Restore job.

For the full list of supported properties, please refer to rdkafka’s documentation.
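As an illustration, the tuning properties mentioned above can be set directly in the EventHub’s spec alongside the connection settings. The values below are placeholders for illustration only, not recommendations:

```yaml
apiVersion: kannika.io/v1alpha
kind: EventHub
metadata:
  name: tuned-kafka-cluster
spec:
  kafka:
    properties:
      bootstrap.servers: "broker1:9092,broker2:9092"
      acks: "all"                            # Wait for all in-sync replicas during a Restore
      queued.max.messages.kbytes: "65536"    # Caps consumer queue memory in a Backup job
      queue.buffering.max.kbytes: "1048576"  # Caps producer buffer memory in a Restore job
```

Since these are passed through to rdkafka, any value must be valid for the corresponding rdkafka property.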

Although you could set authentication-related properties (such as sasl.username, ssl.ca.location, etc.) in the EventHub’s spec, an EventHub resource is not intended to contain any credentials and those properties will be overridden when creating Backup or Restore resources.

Authentication should instead be defined using the Credentials resource. This provides more flexibility and allows re-using credentials across multiple Storages and EventHubs.

The supported authentication methods are:

Kannika Armory does not commit its offsets to the Kafka cluster. Instead, it uses the offsets stored in the Backup to determine where to start consuming messages from. However, in some cases, it might still be necessary to specify a consumer group (group.id) when connecting to a Kafka cluster.

By default, the kannika consumer group is used when no consumer group is specified.

If a consumer group is required, it can be specified in the following locations:

Restores do not require a consumer group and will ignore any consumer group properties.

Adding the consumer group ID to a Backup’s additional source properties is the preferred way to specify a consumer group for a Backup.

apiVersion: kannika.io/v1alpha
kind: Backup
metadata:
  name: backup-with-consumer-group
spec:
  source: production-kafka-cluster
  sourceAdditionalProps:
    group.id: "my-consumer-group"
  sink: my-storage

In this example, the consumer group ID is set to my-consumer-group for this specific Backup.

Defining the consumer group on an EventHub

Adding the consumer group ID to the Kafka EventHub’s properties is also possible. This is the least flexible way to specify a consumer group, as it will apply to all backups that use this EventHub.

apiVersion: kannika.io/v1alpha
kind: EventHub
metadata:
  name: production-kafka-cluster
spec:
  kafka:
    properties:
      bootstrap.servers: "broker1:9092"
      group.id: "kannika"

In this example, the consumer group ID is set to kannika for all backups that use this EventHub (restores ignore consumer group properties).

This section describes the permissions required for the different components of the system to work properly.

The following permissions are required for the Backup processes to work properly. This list is not exhaustive and may change in the future.

| Permission | Reason |
| --- | --- |
| Topic:Read | Required to fetch data from the topic. |
| Topic:Describe | Required to find the topic, and to retrieve topic metadata, such as offsets and watermarks. |
| Cluster:Describe | Required to list topics (e.g. for automatically including topics). |

Although not strictly required, the following permissions are recommended to avoid potential issues:

| Permission | Reason |
| --- | --- |
| ConsumerGroup:Describe (optional) | While the group.id property is not actually used by the Backup, one is required due to a bug in the librdkafka library that causes the library to try to fetch the group information even if it is not used. By default, the consumer group is set to kannika. If this permission is missing, a GroupAuthorizationFailed error will be logged, but the backup will continue to work. |
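If the cluster’s ACLs are managed declaratively, for example with Strimzi, the Backup permissions above could be expressed on a KafkaUser resource. This is a sketch assuming a Strimzi-managed cluster; the user name, cluster label, and topic name are placeholders:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: kannika-backup              # Placeholder user name
  labels:
    strimzi.io/cluster: my-cluster  # Placeholder Strimzi cluster name
spec:
  authentication:
    type: scram-sha-512
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: orders              # Placeholder topic to back up
        operations: [Read, Describe]
      - resource:
          type: cluster
        operations: [Describe]      # Allows listing topics
      - resource:
          type: group
          name: kannika             # Default consumer group
        operations: [Describe]      # Avoids GroupAuthorizationFailed noise
```

The same permissions can of course be granted with whatever ACL tooling your platform provides.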

The following permissions are required for the Restore processes to work properly. This list is not exhaustive and may change in the future.

| Permission | Reason |
| --- | --- |
| Topic:Write | Required to produce data to the topic. |
| Topic:Describe | Required to find the topic, and to retrieve topic metadata, such as offsets and watermarks. |
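Analogously to the Backup permissions, a Restore user on a Strimzi-managed cluster could be sketched as follows (user name, cluster label, and topic name are placeholders):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: kannika-restore             # Placeholder user name
  labels:
    strimzi.io/cluster: my-cluster  # Placeholder Strimzi cluster name
spec:
  authentication:
    type: scram-sha-512
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: orders              # Placeholder topic to restore into
        operations: [Write, Describe]
```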

Anything that is compatible with the Kafka protocol should work.

The following platforms are known to work: