Kafka EventHub

apiVersion: kannika.io/v1alpha
kind: EventHub
metadata:
  name: production-kafka-cluster
spec:
  kafka:
    properties:
      bootstrap.servers: "broker1:9092,broker2:9092" # Defines URIs for the bootstrap servers
      compression.codec: "zstd" # Enables compression when used in a Restore
  description: "Production EventHub" # Add an optional description here

Under the hood, Armory uses the rdkafka library to connect to Kafka-like clusters. Therefore, many additional properties can be used to configure the connection to the cluster.

Here are some properties that might be of interest:

  • acks: affects the reliability of a Restore.
  • queued.max.messages.kbytes: influences the memory usage and performance of a Backup job.
  • queue.buffering.max.kbytes: influences the memory usage and performance of a Restore job.

For the full list of supported properties, please refer to rdkafka’s documentation.
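As an illustration, the tuning properties mentioned above can be set directly in the EventHub’s spec alongside the connection settings. The values below are placeholders for illustration only, not recommendations:

```yaml
apiVersion: kannika.io/v1alpha
kind: EventHub
metadata:
  name: tuned-kafka-cluster
spec:
  kafka:
    properties:
      bootstrap.servers: "broker1:9092,broker2:9092"
      acks: "all"                            # Wait for all in-sync replicas during a Restore
      queued.max.messages.kbytes: "65536"    # Caps consumer queue memory in a Backup job
      queue.buffering.max.kbytes: "1048576"  # Caps producer buffer memory in a Restore job
```

Since these are passed through to rdkafka, any value must be valid for the corresponding rdkafka property.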

Although you could set authentication-related properties (such as sasl.username, ssl.ca.location, etc.) in the EventHub’s spec, an EventHub resource is not intended to contain any credentials and those properties will be overridden when creating Backup or Restore resources.

Authentication should instead be defined using the Credentials resource. This provides more flexibility and allows re-using credentials across multiple Storages and EventHubs.

The supported authentication methods are:

Kannika Armory does not commit its offsets to the Kafka cluster. Instead, it uses the offsets stored in the Backup to determine where to start consuming messages from. However, in some cases, it might still be necessary to specify a consumer group (group.id) when connecting to a Kafka cluster.

By default, the kannika consumer group is used when no consumer group is specified.

If a consumer group is required, it can be specified in the following locations:

Restores do not require a consumer group and will ignore any consumer group properties.

Adding the consumer group ID to a Backup’s additional source properties is the preferred way to specify a consumer group for a Backup.

apiVersion: kannika.io/v1alpha
kind: Backup
metadata:
  name: backup-with-consumer-group
spec:
  source: production-kafka-cluster
  sourceAdditionalProps:
    group.id: "my-consumer-group"
  sink: my-storage

In this example, the consumer group ID is set to my-consumer-group for this specific Backup.

Defining the consumer group on an EventHub

Adding the consumer group ID to the Kafka EventHub’s properties is also possible. This is the least flexible way to specify a consumer group, as it will apply to all backups that use this EventHub.

apiVersion: kannika.io/v1alpha
kind: EventHub
metadata:
  name: production-kafka-cluster
spec:
  kafka:
    properties:
      bootstrap.servers: "broker1:9092"
      group.id: "kannika"

In this example, the consumer group ID is set to kannika for all backups that use this EventHub (restores ignore consumer group properties).

This section describes the permissions required for the different components of the system to work properly.

The following permissions are required for the Backup processes to work properly. This list is not exhaustive and may change in the future.

| Permission | Reason |
| --- | --- |
| Topic:Read | Required to fetch data from the topic. |
| Topic:Describe | Required to find the topic, and to retrieve topic metadata, such as offsets and watermarks. |
| Cluster:Describe | Required to list topics (e.g. for automatically including topics). |

Although not strictly required, the following permissions are recommended to avoid potential issues:

| Permission | Reason |
| --- | --- |
| ConsumerGroup:Describe (optional) | While the group.id property is not actually used by the Backup, one is required due to a bug in the librdkafka library that causes the library to try to fetch the group information even if it is not used. By default, the consumer group is set to kannika. If this permission is missing, a GroupAuthorizationFailed error will be logged, but the backup will continue to work. |
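If the cluster’s ACLs are managed declaratively, for example with Strimzi, the Backup permissions above could be expressed on a KafkaUser resource. This is a sketch assuming a Strimzi-managed cluster; the user name, cluster label, and topic name are placeholders:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: kannika-backup              # Placeholder user name
  labels:
    strimzi.io/cluster: my-cluster  # Placeholder Strimzi cluster name
spec:
  authentication:
    type: scram-sha-512
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: orders              # Placeholder topic to back up
        operations: [Read, Describe]
      - resource:
          type: cluster
        operations: [Describe]      # Allows listing topics
      - resource:
          type: group
          name: kannika             # Default consumer group
        operations: [Describe]      # Avoids GroupAuthorizationFailed noise
```

The same permissions can of course be granted with whatever ACL tooling your platform provides.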

The following permissions are required for the Restore processes to work properly. This list is not exhaustive and may change in the future.

| Permission | Reason |
| --- | --- |
| Topic:Write | Required to produce data to the topic. |
| Topic:Describe | Required to find the topic, and to retrieve topic metadata, such as offsets and watermarks. |
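Analogously to the Backup permissions, a Restore user on a Strimzi-managed cluster could be sketched as follows (user name, cluster label, and topic name are placeholders):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: kannika-restore             # Placeholder user name
  labels:
    strimzi.io/cluster: my-cluster  # Placeholder Strimzi cluster name
spec:
  authentication:
    type: scram-sha-512
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: orders              # Placeholder topic to restore into
        operations: [Write, Describe]
```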

Anything that is compatible with the Kafka protocol should work.

The following platforms are known to work: