Migrate consumer groups

A consumer group is a set of consumers that work together to consume messages from one or more topics. Each consumer in the group is assigned a subset of partitions, and Kafka tracks the offset (position) of each consumer group per partition.

This offset tells Kafka which messages have already been processed, so consumers can resume from where they left off after a restart or failure.
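For example, Kafka's stock kafka-consumer-groups tool shows the committed position per partition for a group (the broker address and group name here are placeholders):

Terminal window
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --group my-group \
  --describe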

When cloning data from one Kafka environment to another (e.g., production to QA), consumer group offsets cannot be directly transferred. Kafka assigns new offsets to restored messages, which means your consumers would either re-process all messages from the beginning, or skip messages entirely.

Kannika Armory and Kannika Bridge work together to solve this problem:

  1. Armory restores messages with a special header containing the original offset.
  2. Bridge reads these headers, calculates the equivalent target offsets, and applies them to the consumer group in the target cluster.

This allows your consumers in the target environment to continue exactly where they left off.
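Conceptually, the translation looks like this for the data used in this tutorial (an illustrative sketch of what Bridge computes, not its internals):

# Source: orders-prod                 Target: orders-qa (after restore)
# offset 100  ->  restored at offset 0   (__original_offset=100)
# offset 101  ->  restored at offset 1   (__original_offset=101)
# offset 102  ->  restored at offset 2   (__original_offset=102)
# offset 103  ->  restored at offset 3   (__original_offset=103)
# offset 104  ->  restored at offset 4   (__original_offset=104)
#
# A committed offset of 103 means "the next message to consume was at
# offset 103", so the equivalent committed offset on the target is 3.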

Prerequisites

  • A Kannika Armory instance available, running in a Kubernetes environment.
  • Local installation of the kbridge binary.
  • Local installation of the kubectl binary.
  • Local installation of sed (included by default on macOS and Linux).

Refer to the Setup section to set up the lab environment.

In this scenario, you will simulate migrating data from a production environment to a QA environment:

  • Source topic: orders-prod (simulating production) - contains 5 order messages
  • Target topic: orders-qa (simulating QA environment) - empty, ready for restore
  • Consumer group: order-processor - has processed 3 of 5 messages in production

The goal is to restore messages to the QA topic and migrate the consumer group offsets, so that consumers can continue processing from message 4 (not re-processing messages 1-3).

Run the setup script:

Terminal window
curl -fsSL https://raw.githubusercontent.com/kannika-io/armory-examples/main/install.sh | bash -s -- migrate-consumer-groups

Or clone the armory-examples repository:

Terminal window
git clone https://github.com/kannika-io/armory-examples.git
cd armory-examples
./setup migrate-consumer-groups

This sets up:

Kubernetes cluster: kannika-kind
├── Namespace: kannika-system
│   └── Kannika Armory
└── Namespace: kannika-data
    ├── EventHub: prod-kafka → kafka-source:9092
    ├── EventHub: qa-kafka → kafka-target:9093
    ├── Storage: prod-storage
    └── Backup: prod-backup

Kafka: kafka-source:9092 (localhost:9092)
├── Topic: orders-prod (5 messages starting at offset 100)
└── ConsumerGroup: order-processor (offset: 103)

Kafka: kafka-target:9093 (localhost:9093)
└── Topic: orders-qa (empty)
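To sanity-check the lab, you can list the topics on the source broker (container and tool names as used elsewhere in this tutorial; some Kafka images ship the tool with a .sh suffix):

Terminal window
docker exec kafka-source kafka-topics \
  --bootstrap-server localhost:9092 \
  --list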

Step 1: Pause the backup

Before restoring, pause the backup to ensure the latest data is flushed to storage:

Terminal window
kubectl patch backup prod-backup -n kannika-data \
  --type merge \
  -p '{"spec":{"enabled":false}}'

Verify the backup is paused:

Terminal window
kubectl get backup prod-backup -n kannika-data
NAME          STATUS
prod-backup   Paused
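Once the migration is complete, you can resume the backup with the same patch, setting enabled back to true:

Terminal window
kubectl patch backup prod-backup -n kannika-data \
  --type merge \
  -p '{"spec":{"enabled":true}}'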

Step 2: Restore messages with Kannika Armory

Create a Restore with the legacyOffsetHeader option to preserve original offsets in message headers:

Terminal window
kubectl apply -f tutorials/migrate-consumer-groups/restore-prod-to-qa.yaml

tutorials/migrate-consumer-groups/restore-prod-to-qa.yaml
# Restore job that restores orders-prod from backup to orders-qa
# The legacyOffsetHeader preserves original offsets for consumer group migration
apiVersion: kannika.io/v1alpha
kind: Restore
metadata:
  name: restore-prod-to-qa
  namespace: kannika-data
spec:
  source: prod-storage # Storage to read backup data from
  sink: qa-kafka # EventHub to restore to
  enabled: true # Start restoring immediately
  config:
    legacyOffsetHeader: "__original_offset" # Header for original offset (for consumer group translation)
    topics: # Topics to restore
      - source: orders-prod # Topic name in backup
        target: orders-qa # Topic name to restore to in EventHub

Monitor progress:

Terminal window
kubectl get restore restore-prod-to-qa -n kannika-data -w

Wait for the status to show Done.

Each restored message now contains an __original_offset header with its original offset from the source topic.
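If you have kcat installed, you can spot-check a few restored messages and their headers (broker address as exposed by this lab on localhost:9093):

Terminal window
kcat -b localhost:9093 -t orders-qa -C -e \
  -f 'offset=%o  headers=%h  value=%s\n'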

Step 3: Migrate consumer offsets with Kannika Bridge

The next step is to migrate the consumer group offsets from the source topic to the target topic. Kannika Bridge provides a three-step workflow for this (the complete command sequence is shown after the list):

  • Fetch the current offsets from the source topic
  • Calculate the equivalent offsets on the target topic
  • Apply the calculated offsets to the target topic
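Taken together, the migration is four commands (renaming the topics in the CSV sits between fetch and calculate); each is explained in the steps below. Broker addresses are resolved via Docker Compose as in this lab, and the sed shown is the BSD/macOS form:

Terminal window
kbridge fetch -b $(docker compose port kafka-source 9092) -t orders-prod > source_offsets.csv
sed -i '' 's/orders-prod/orders-qa/g' source_offsets.csv
kbridge calculate -b $(docker compose port kafka-target 9093) -H __original_offset -i source_offsets.csv > target_offsets.csv
kbridge apply -b $(docker compose port kafka-target 9093) -i target_offsets.csv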

Fetch consumer group offsets from the source

Fetch the current consumer group offsets from the source topic:

Terminal window
kbridge fetch -b $(docker compose port kafka-source 9092) \
  -t orders-prod \
  > source_offsets.csv

View the fetched offsets:

Terminal window
cat source_offsets.csv
order-processor,orders-prod,0,103

Each row lists the consumer group, topic, partition, and committed offset: the order-processor group has committed offset 103 on partition 0 of orders-prod.

The source offsets reference the source topic (orders-prod), but we need to calculate offsets for the target topic (orders-qa). Update the topic names in the CSV:

Terminal window
sed -i '' 's/orders-prod/orders-qa/g' source_offsets.csv
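The -i '' form is BSD sed syntax, as shipped with macOS. On Linux with GNU sed, omit the empty argument:

Terminal window
sed -i 's/orders-prod/orders-qa/g' source_offsets.csv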

Alternatively, use yq to extract the topic mappings from the restore config:

Terminal window
yq '.spec.config.topics[] | "s/\(.source)/\(.target)/g"' tutorials/migrate-consumer-groups/restore-prod-to-qa.yaml \
  | xargs -I{} sed -i '' '{}' source_offsets.csv
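The yq expression emits one sed substitution per topic mapping in the Restore spec; for this tutorial it produces:

s/orders-prod/orders-qa/g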

Verify that the topics have been renamed:

Terminal window
cat source_offsets.csv
order-processor,orders-qa,0,103

Calculate the equivalent offsets on the target topic

Use the calculate command to do this. It reads the __original_offset header from the restored messages:

Terminal window
kbridge calculate -b $(docker compose port kafka-target 9093) \
  -H __original_offset \
  -i source_offsets.csv \
  > target_offsets.csv

The -H flag specifies the header name that contains the original offset. This must match the legacyOffsetHeader value from the Restore configuration.

View the calculated offsets (note the committed offset has been translated from 103 to 3):

Terminal window
cat target_offsets.csv
order-processor,orders-qa,0,3

Apply the calculated offsets to the target topic

Finally, apply the offsets to create the consumer group on the target topic:

Terminal window
kbridge apply -b $(docker compose port kafka-target 9093) -i target_offsets.csv

After applying offsets, verify they were set correctly:

Terminal window
docker exec kafka-target kafka-consumer-groups \
  --bootstrap-server localhost:9093 \
  --group order-processor \
  --describe

The consumer group should now show offsets for orders-qa:

GROUP            TOPIC      PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
order-processor  orders-qa  0          3               5               2

The current offset (3) corresponds to the original offset 103 in production. The LAG of 2 shows there are 2 remaining messages to process.

When you start consuming from orders-qa with the order-processor consumer group, it will continue from where it left off in production. The first 3 messages that were already processed are skipped.
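To see this in action, you can run a console consumer in the same group; it should print only the 2 remaining messages (tool and container names as used earlier in this tutorial; add a .sh suffix if your image requires it):

Terminal window
docker exec kafka-target kafka-console-consumer \
  --bootstrap-server localhost:9093 \
  --topic orders-qa \
  --group order-processor \
  --timeout-ms 10000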

To clean up all resources, run the teardown script:

Terminal window
./teardown

In this tutorial, you learned how to:

  1. Restore data with the legacyOffsetHeader option to preserve original offsets in message headers
  2. Use the kbridge three-step workflow (fetch, calculate, apply) to migrate consumer group offsets
  3. Verify that consumers can continue from the correct position

This workflow ensures seamless environment cloning with consumer state preservation.