
    Overview

A Restore is used to restore messages from cold storage back to a messaging system such as Kafka.

It has many configuration options to cover a wide range of use cases. You can use it to restore data from multiple topics in parallel, restore data until a certain point in time, or until a certain offset.

Another use case is restoring or migrating data from one cluster to another. It supports mapping schemas between topics, renaming topics, and much more.

    Usage

    Restore resources can be managed using the kubectl command line tool, and are available by the name restore or restores.

    $ kubectl get restores
NAME         STATUS         AGE
my-restore   🚀 Restoring   1s
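
A Restore can be inspected like any other Kubernetes resource. For example, assuming a Restore named my-restore exists, the standard kubectl commands below watch its status and show its details:

$ kubectl get restores --watch
$ kubectl describe restore my-restore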

    Creating a Restore

    The following is an example of a Restore. It will restore 2 topics from the my-volume-storage Storage to the my-kafka-cluster Endpoint.

apiVersion: kannika.io/v1alpha
kind: Restore
metadata:
  name: restore-example
spec:
  source: "my-volume-storage"
  sink: "my-kafka-cluster"
  enabled: true
  config:
    mapping:
      source-topicA:
        target: "target-topicA"
      source-topicB:
        target: "target-topicB"

    In this example:

    • A Restore named restore-example is created, indicated by the .metadata.name field. This name will become the basis for the Kubernetes Job (and other resources) which are created later.

• The Restore will connect to the my-volume-storage Storage to fetch data, indicated by the .spec.source field. The Restore will restore data to the my-kafka-cluster Event Hub Endpoint defined in the .spec.sink field.

    • The .spec.config.mapping field contains the mapping between the source and target topics. In this example, the topics source-topicA and source-topicB will be restored to target-topicA and target-topicB respectively.

• The .spec.enabled field indicates whether the Restore is enabled. This field is optional and defaults to false, so the Restore will not be started unless it is explicitly set to true. See the Drafting and starting a Restore section for more information.
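
Assuming the manifest above is saved to a file (hypothetically named restore-example.yaml), it can be created with the standard kubectl workflow:

$ kubectl apply -f restore-example.yaml
$ kubectl get restore restore-example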

    Drafting and starting a Restore

    The following is an example of a Draft Restore.

apiVersion: kannika.io/v1alpha
kind: Restore
metadata:
  name: restore-example
spec:
  source: "my-volume-storage"
  sink: "my-kafka-cluster"
  enabled: false
  config: {}

    This is a minimal Restore definition. In this state, the Restore will not be started (the Job will not be created yet) because the .spec.enabled field is not set to true. The Restore is in a Draft state.

    At this stage of the Restore lifecycle, the Restore can act as a working document, and be edited and modified as needed. More importantly, topics can be imported and mapped to target topics, or removed.

    To start the Restore, the .spec.enabled field must be set to true.
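
One way to do this is with a merge patch via kubectl; editing the resource and re-applying the manifest works just as well. A sketch, assuming the draft above was created as restore-example:

$ kubectl patch restore restore-example --type merge -p '{"spec":{"enabled":true}}'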

    Restore to multiple topics in parallel

    The following is an example of a draft Restore that can restore to multiple topics in parallel.

apiVersion: kannika.io/v1alpha
kind: Restore
metadata:
  name: parallel-example
spec:
  source: "my-volume-storage"
  sink: "my-sink"
  enabled: true
  config:
    parallelism: 3

    In this example, the .spec.config.parallelism field is set to 3. This means that the Restore will restore 3 topics in parallel.

    Use cases where this is useful:

    • when restoring a large number of topics
• when restoring large, single-partition topics

    Adding the legacy offset to the headers

    You can add the original offsets to the headers of the restored messages.

    The following is an example of a Restore that adds the original offset to the headers of the restored message.

apiVersion: kannika.io/v1alpha
kind: Restore
metadata:
  name: legacy-offset-header-example
spec:
  source: "my-volume-storage"
  sink: "sink"
  enabled: true
  config:
    legacyOffsetHeader: "original-offset"
    mapping:
      source-topic:
        target: "target-topic"

    In this example, the .spec.config.legacyOffsetHeader field is set to original-offset. This means that the original offset will be added to the headers of the restored messages with the key original-offset.

    If you wish to add the header only to a specific topic and not to all topics, you must split the Restore into multiple Restores.
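
As a sketch, the two Restores below (with hypothetical topic and resource names) add the header for source-topicA only, while source-topicB is restored without it:

apiVersion: kannika.io/v1alpha
kind: Restore
metadata:
  name: restore-with-offset-header
spec:
  source: "my-volume-storage"
  sink: "sink"
  enabled: true
  config:
    legacyOffsetHeader: "original-offset"
    mapping:
      source-topicA:
        target: "target-topicA"
---
apiVersion: kannika.io/v1alpha
kind: Restore
metadata:
  name: restore-without-offset-header
spec:
  source: "my-volume-storage"
  sink: "sink"
  enabled: true
  config:
    mapping:
      source-topicB:
        target: "target-topicB"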

    Restoring until a certain point in time

    You can restore data until a certain point in time.

    The following is an example of a Restore that restores data until a certain point in time.

apiVersion: kannika.io/v1alpha
kind: Restore
metadata:
  name: restore-until-example
spec:
  source: "my-volume-storage"
  sink: "sink"
  enabled: true
  config:
    restoreUntilDateTime: "2021-01-01T00:00:00Z"
    mapping:
      source-topic:
        target: "target-topic"

In this example, the .spec.config.restoreUntilDateTime field is set to 2021-01-01T00:00:00Z. This means that the Restore will restore data up to the 1st of January 2021 at midnight (UTC). The timestamp is exclusive: all messages with a timestamp before this point in time are restored, while messages with a timestamp at or after it are not.

    Restoring a partition until a certain offset

    You can restore data of specific partitions until a certain offset.

    The following is an example of a Restore that restores data until a certain offset for a specific partition.

apiVersion: kannika.io/v1alpha
kind: Restore
metadata:
  name: restore-until-offset-example
spec:
  source: "source"
  sink: "sink"
  enabled: true
  config:
    mapping:
      source-topic:
        target: "target-topic"
        partitions:
          0:
            restoreUntilOffset: 100

In this example, the .spec.config.mapping.source-topic.partitions.0.restoreUntilOffset field is set to 100. This means that the Restore will restore partition 0 up to, but not including, offset 100. The offset is exclusive: messages with an offset of 100 or higher will not be restored.

Note that when the partitions field is used, partitions that are not explicitly defined will not be restored.
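
The sketch below (with hypothetical offsets) therefore defines both partitions of a two-partition topic explicitly, each with its own limit, so that both are restored:

apiVersion: kannika.io/v1alpha
kind: Restore
metadata:
  name: restore-until-offsets-per-partition
spec:
  source: "source"
  sink: "sink"
  enabled: true
  config:
    mapping:
      source-topic:
        target: "target-topic"
        partitions:
          0:
            restoreUntilOffset: 100
          1:
            restoreUntilOffset: 250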

    Pre-flight checks

A Restore performs pre-flight checks before starting the restore process for each topic. These checks ensure that the topic can be restored without issues.

    The current checks are:

• Check if the partition count of the source and target topics is the same

    Future checks that are planned are:

    • Check if the target topic exists
    • Check if the target topic is empty
    • Check if the target topic has the same configuration as the source topic

    Check the roadmap for more information.

    Disabling pre-flight checks

In some cases, it can be necessary to disable the pre-flight checks, for example when using the Topic Repartitioning plugin to change the number of partitions of a topic.

    Fine-grained control over the pre-flight checks is not available yet, but you can disable all pre-flight checks for a specific topic by setting the disablePreflightChecks field to true in the mapping configuration.

    Example:

apiVersion: kannika.io/v1alpha
kind: Restore
metadata:
  name: restore-without-preflight-checks
spec:
  sink: "sink"
  source: "source"
  enabled: true
  config:
    mapping:
      source-topic-name:
        disablePreflightChecks: true # Disable pre-flight checks for this topic
        target: target-topic-name

    Mapping schemas

You can map schemas between the source and target topics when restoring data. Check the Schema Mapping page for more details.