Schema Mapping
This page describes how to configure schema mapping in a Restore.
Schema mapping lets you migrate data between environments that each use their own schema registry.
For example, you can copy data from a production environment to a QA environment that has a different schema registry.
The restore process will map the schema IDs from the source environment to the schema IDs of the target environment. It will do so by looking up the schema mapping in a lookup table and replacing the magic byte that represents the schema ID in the message when restoring the data.
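The lookup-and-replace step can be illustrated with a minimal sketch. It assumes the messages use Confluent-style framing (a magic byte of 0 followed by a 4-byte big-endian schema ID), which this page does not confirm; the function name remap_schema_id and the sample IDs are hypothetical and are not part of the product.

```python
import struct

# Assumption: Confluent-style framing, i.e. one magic byte (0x00)
# followed by a 4-byte big-endian schema ID, then the encoded payload.
MAGIC_BYTE = 0
HEADER = struct.Struct(">BI")  # magic byte + unsigned 32-bit schema ID


def remap_schema_id(message: bytes, mapping: dict[int, int]) -> bytes:
    """Return the message with its schema ID translated via `mapping`."""
    if len(message) < HEADER.size:
        return message  # too short to carry a schema ID header
    magic, source_id = HEADER.unpack_from(message)
    if magic != MAGIC_BYTE:
        return message  # not framed the way this sketch assumes
    target_id = mapping.get(source_id)
    if target_id is None:
        return message  # no mapping for this ID: keep the data as is
    return HEADER.pack(MAGIC_BYTE, target_id) + message[HEADER.size:]


# Hypothetical mapping: source schema ID 12 corresponds to target ID 47.
mapping = {12: 47}
original = HEADER.pack(MAGIC_BYTE, 12) + b"avro-bytes"
restored = remap_schema_id(original, mapping)
unmapped = remap_schema_id(HEADER.pack(MAGIC_BYTE, 99) + b"x", mapping)
```

Only the header bytes change in this sketch; the encoded payload is copied through untouched, and a message whose schema ID is not in the table is returned unchanged.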
To use schema mapping, you need to:
- Define or generate a schema mapping
- Create a ConfigMap with the schema mapping
- Reference the ConfigMap in the Restore for schema mapping
The operator automatically configures both the Payload Schema Mapping plugin and the Key Schema Mapping plugin with the schema mapping.
Generating a schema mapping using SAME
To generate a schema mapping, you can use our Schema Automated Mapping Engine (SAME) tool. SAME is a tool that can generate schema mappings between two schema registries. It does this by downloading the schemas from the source and target schema registries, indexing them using fingerprinting (hashing), and then comparing the fingerprints to generate a mapping. The tool is still in its early stages, and currently only supports Avro.
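The fingerprint-and-compare idea can be sketched roughly as follows. This is not SAME's implementation: the page does not say which fingerprinting algorithm SAME uses, so the sketch substitutes SHA-256 over a normalized JSON form of the schema, and the function names and sample schemas are hypothetical.

```python
import hashlib
import json


def fingerprint(schema_json: str) -> str:
    """Hash a schema after normalizing object key order and whitespace."""
    normalized = json.dumps(
        json.loads(schema_json), sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_mapping(source: dict[int, str], target: dict[int, str]) -> dict[int, int]:
    """Map source schema IDs to target IDs whose schemas share a fingerprint."""
    # Index the target registry by fingerprint for constant-time lookups.
    target_index = {fingerprint(schema): schema_id for schema_id, schema in target.items()}
    mapping = {}
    for source_id, schema in source.items():
        target_id = target_index.get(fingerprint(schema))
        if target_id is not None:
            mapping[source_id] = target_id
    return mapping


# Hypothetical registries: the same record schema is registered under
# ID 1 in the source and ID 7 in the target, with different formatting.
source = {1: '{"type": "record", "name": "User", "fields": [{"name": "id", "type": "long"}]}'}
target = {
    7: '{"name":"User","type":"record","fields":[{"type":"long","name":"id"}]}',
    8: '"string"',
}
result = build_mapping(source, target)
```

Source schemas with no matching fingerprint in the target are simply left out of the resulting mapping in this sketch.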
First, create a YAML file with the schema registries you want to map:
Then, to generate a schema mapping, run the following command using the quay.io/kannika/same Docker image:
In this example:
- -v .:/usr/var/same mounts the current directory to the /usr/var/same directory in the container
- quay.io/kannika/same:0.2.1 is the Docker image
- -v enables verbose mode
- map is the command that maps the schemas
- --from specifies the source schema registry
- --to specifies the target schema registry
- --ignore-indexing-errors ignores indexing errors
- -o /usr/var/same/mapping.yaml specifies the output file for the schema mapping
- --registries /usr/var/same/registries.yaml specifies the input file with the schema registries
After running the command, you will have a mapping.yaml file with the schema mapping.
Example:
Configuring a Restore to use schema mapping
First, create a ConfigMap with the schema mapping:
This will create a ConfigMap with the following content:
Finally, reference the schema mapping in the Restore by setting the .spec.config.schemaMappingFrom field, which uses a ConfigMapKeySelector to load the mapping from a key in the ConfigMap.
The operator will validate the schema mapping.
To check if the schema mapping is valid, check the SchemaMappingValidated condition in the status of the Restore resource.
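As a rough illustration, the check could be scripted against a Restore resource that has already been loaded as a dictionary (for example from its JSON representation). The sketch assumes the conditions live under status.conditions and follow the usual Kubernetes layout with type and status fields; that layout is an assumption, and the helper name schema_mapping_validated is hypothetical.

```python
def schema_mapping_validated(restore: dict) -> bool:
    """Return True if the SchemaMappingValidated condition has status "True"."""
    # Assumption: conditions are stored as a list under status.conditions.
    conditions = restore.get("status", {}).get("conditions", [])
    for condition in conditions:
        if condition.get("type") == "SchemaMappingValidated":
            return condition.get("status") == "True"
    return False  # condition not reported (yet)


# Hypothetical resource snippet, shaped like the assumed layout.
example_restore = {
    "status": {
        "conditions": [
            {"type": "SchemaMappingValidated", "status": "True"},
        ]
    }
}
is_valid = schema_mapping_validated(example_restore)
```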
What if no schema mapping is found for a schema ID?
If the Restore encounters a schema ID that is not in the schema mapping, it skips the mapping for that schema ID and restores the data as is.