Kafka EventHub
Synopsis
    apiVersion: kannika.io/v1alpha
    kind: EventHub
    metadata:
      name: production-kafka-cluster
    spec:
      kafka:
        properties:
          bootstrap.servers: "broker1:9092,broker2:9092" # Defines URIs for the bootstrap servers
          compression.codec: "zstd" # Enables compression when used in a Restore
      description: "Production EventHub" # Add an optional description here
Advanced configuration
Under the hood, Armory uses the rdkafka library to connect to Kafka-like clusters.
Therefore, many additional properties can be used to configure the connection to the cluster.
Here are some properties that might be of interest:

- acks: affects the reliability of a Restore.
- queued.max.messages.kbytes: influences the memory usage and performance of a Backup job.
- queue.buffering.max.kbytes: influences the memory usage and performance of a Restore job.

Please refer to rdkafka’s documentation.
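
As a minimal sketch, the three properties listed above could be added next to bootstrap.servers, following the layout of the Synopsis. The values shown are illustrative assumptions rather than recommendations; check rdkafka's documentation for the accepted ranges and defaults before using them.

```yaml
apiVersion: kannika.io/v1alpha
kind: EventHub
metadata:
  name: production-kafka-cluster
spec:
  kafka:
    properties:
      bootstrap.servers: "broker1:9092,broker2:9092"
      # Illustrative values only (assumptions, not recommendations):
      acks: "all"                            # Restore: wait for all in-sync replicas
      queued.max.messages.kbytes: "65536"    # Backup: cap on the consumer's pre-fetch queue
      queue.buffering.max.kbytes: "1048576"  # Restore: cap on the producer's buffer
```

Lower queue sizes should reduce memory usage at the cost of throughput, so they are worth tuning per job rather than copying as-is.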
Authentication
Although you could set authentication-related properties (such as sasl.username, ssl.ca.location, etc.) in the EventHub’s spec, an EventHub resource is not intended to contain any credentials, and those properties will be overridden when creating Backup or Restore resources.
Authentication should instead be defined using the Credentials resource. This provides more flexibility and allows re-using credentials across multiple Storages and EventHubs.
The supported authentication methods are:
Consumer Group
Kannika Armory does not commit its offsets to the Kafka cluster.
Instead, it uses the offsets stored in the Backup to determine where to start consuming messages from.
However, in some cases, it might still be necessary to specify a consumer group (group.id) when connecting to a Kafka cluster.
By default, the kannika consumer group is used when no consumer group is specified.
If a consumer group is required, it can be defined either on the Backup or on the EventHub, as described in the sections that follow.
Restores do not require a consumer group and will ignore any consumer group properties.
Defining the consumer group on a Backup
Adding the consumer group ID to a Backup’s additional source properties is the preferred way to specify a consumer group for a Backup.
    apiVersion: kannika.io/v1alpha
    kind: Backup
    metadata:
      name: backup-with-consumer-group
    spec:
      source: production-kafka-cluster
      sourceAdditionalProps:
        group.id: "my-consumer-group"
      sink: my-storage
In this example, the consumer group ID is set to my-consumer-group for this specific Backup.
Defining the consumer group on an EventHub
Adding the consumer group ID to the Kafka EventHub’s properties is also possible. This is the least flexible way to specify a consumer group, as it will apply to all backups that use this EventHub.
    apiVersion: kannika.io/v1alpha
    kind: EventHub
    metadata:
      name: production-kafka-cluster
    spec:
      kafka:
        properties:
          bootstrap.servers: "broker1:9092"
          group.id: "kannika"
In this example, the consumer group ID is set to kannika for every Backup that uses this EventHub. Restores that use this EventHub ignore the property.
Required Permissions
This section describes the permissions required for the different components of the system to work properly.
Backup
The following permissions are required for the Backup processes to work properly. This list is not exhaustive and may change in the future.
Permission | Reason |
---|---|
Topic:Read | Required to fetch data from the topic. |
Topic:Describe | Required to find the topic, and to retrieve topic metadata, such as offsets and watermarks. |
Cluster:Describe | Required to list topics (e.g. for automatically including topics). |
Although not strictly required, the following permissions are recommended to avoid potential issues:
Permission | Reason |
---|---|
ConsumerGroup:Describe (optional) | The Backup does not really use the group.id property, but one must still be set because of a bug in the librdkafka library that makes it try to fetch the group information even when it is not used. By default, the consumer group is set to kannika. If this permission is missing, a GroupAuthorizationFailed error is logged, but the backup continues to work. |
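
How these permissions are granted depends on the platform. As one hedged illustration, assuming a cluster managed by the Strimzi operator with simple authorization enabled, the Backup permissions above could be expressed as ACLs on a KafkaUser resource. Strimzi is not part of Kannika Armory, and the user name kannika-backup, the cluster label my-cluster, the authentication type, and the topic prefix orders. are all hypothetical.

```yaml
# Sketch only: assumes a Strimzi-managed cluster; all names are hypothetical.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: kannika-backup
  labels:
    strimzi.io/cluster: my-cluster
spec:
  authentication:
    type: scram-sha-512       # assumption; use whatever the cluster requires
  authorization:
    type: simple
    acls:
      - resource:             # Topic:Read and Topic:Describe
          type: topic
          name: orders.
          patternType: prefix
        operations:
          - Read
          - Describe
      - resource:             # Cluster:Describe
          type: cluster
        operations:
          - Describe
      - resource:             # ConsumerGroup:Describe (optional)
          type: group
          name: kannika
          patternType: literal
        operations:
          - Describe
```

If a custom group.id is configured on the Backup or the EventHub, the group ACL would need to name that group instead of the default kannika.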
Restore
The following permissions are required for the Restore processes to work properly. This list is not exhaustive and may change in the future.
Permission | Reason |
---|---|
Topic:Write | Required to produce data to the topic. |
Topic:Describe | Required to find the topic, and to retrieve topic metadata, such as offsets and watermarks. |
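
As with Backups, the way these permissions are granted is platform-specific. The sketch below assumes a Strimzi-managed cluster with simple authorization and maps the two Restore permissions to ACLs on a KafkaUser. The user name kannika-restore, the cluster label my-cluster, the authentication type, and the topic prefix orders. are hypothetical.

```yaml
# Sketch only: assumes a Strimzi-managed cluster; all names are hypothetical.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: kannika-restore
  labels:
    strimzi.io/cluster: my-cluster
spec:
  authentication:
    type: scram-sha-512       # assumption; use whatever the cluster requires
  authorization:
    type: simple
    acls:
      - resource:             # Topic:Write and Topic:Describe
          type: topic
          name: orders.
          patternType: prefix
        operations:
          - Write
          - Describe
```

No group ACL is included, since Restores ignore consumer group properties.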
Supported Platforms
Anything that is compatible with the Kafka protocol should work.
The following platforms are known to work: