In this blog post will use a Debezium Connector to capture changes from a MySQL database, send them to Kafka and then move the events into Snowflake.
We will configure, deploy and monitor this flow using a single DataOps tool.
Lenses is lightweight and delivers DataOps for your data infrastructure on Apache Kafka and Kubernetes. It provides APIs and afor streaming data discovery and exploration. It can also monitor your infrastructure, data integration and micro-services, and help you troubleshoot your real-time data.
We will be using low-code methods to deploy data integration using Kafka Connect and will manage them via GitOps.
Debezium offers an open source collection of Kafka Connectors that can turn data stores into event streams. That means that you can easily move to event-based architectures and build logic that responds to row-level changes in a data store.
You can read more about the Debezium MySQL connector here.
CDC means Change Data Capture and is a design pattern that monitors and captures any changes in a data store by reading the bin-logs, avoiding the performance impact that a typical JDBC connector applies.
In our example, we will capture all the changes on a MySQL database using CDC and pass the change events to Kafka.
Snowflake is a Cloud data warehouse that can store both structured and semi-structured data.
You are going to need to have Lenses for this tutorial.
Lenses is just a small 4GB JVM that you point to your Kafka and Kubernetes (if you have one) clusters. To make things easier for this guide, we are going to use the Lenses+Kafka all-in-one free “Lenses Box” container. Download it for free here.
Additionally, you will need to have a valid account to Snowflake.
The Debezium Connector for MySQL is already installed on Lenses Box so you will just need to configure it in order to communicate with your MySQL database.
The scenario that is going to be implemented has the following steps:
Setup a MySQL database with sample data.
Import data from the MySQL to a Lenses Box with Kafka using Debezium.
Export the data into Snowflake.
You do not need to do something special in the MySQL database that you are using. What is important is that MySQL is running and that you know the TCP port, username and password required to connect. For the purposes of this tutorial, MySQL will run on a Docker image. You can get the MySQL Docker image as follows:
docker pull debezium/example-mysql:0.10
Notice that this Docker image contains sample data on a database named
which is the one that is going to be monitored.
After that you can execute that Docker image as follows:
The username that we will be using is
root and the password is
debezium – both will
be used when setting up the Debezium Connector. Also notice that the name of the image
as defined using
--name mysql is
If you do not want to keep the Debezium Docker image on your computer, you can bypass the
docker pull step and execute the
docker run command with the
--rm parameter, which tells Docker to remove the container when it stops.
If you want to connect to that Docker image using the
sqlclient utility, you can execute
the following command, which runs the
mysql utility inside a Docker image named
From there you can inspect the contents of the
inventory MySQL database and make
changes to the contents of its tables.
Setting up Lenses Box container is as simple as executing the command:
IMPORTANT: When you registered for the “Lenses Box”, you would have received the above docker command with a valid free license (instead of the “xxxxxxxxx” in the id parameter). Check your emails for your unique command.
You need the
--link mysql:mysql parameter so that the Lenses Box image can find
the Docker image with MySQL, which is going to be called
mysql. Notice that a
MySQL Docker image named
mysql should already be running for the above
command to work.
You might need to wait a little for the Lenses Docker image to load all necessary
components. After that you will be able to connect to Lenses Box from your favorite
web browser on port 3030 – the username and the password are both
After connecting, you will see the image that follows on your web browser.
You will need a Snowflake account in order to be able to use it. The easiest way is to visit https://trial.snowflake.com and create a trial account.
For the purposes of this tutorial the Snowflake username is
mihalis. Notice that
the password will not be needed for creating the connection between Lenses and
Snowflake. However, you will still need your Snowflake password for connecting to
the Snowflake UI.
The name of the Snowflake server is
re91701.eu-west-1.snowflakecomputing.com. Yours will be different, That means
if you try to follow this tutorial, you will have to replace
re91701.eu-west-1.snowflakecomputing.com with your Snowflake server address.
We will need to install the Snowflake Kafka Connector to a Lenses Box. We will be using version 0.4.0 of the Snowflake Kafka Connector. Please visit this page for the full list of available Snowflake Kafka Connectors.
You will need to connect to the Docker image of the Lenses Box with the following command:
docker exec -it lenses-dev bash
Notice that the name of the Lenses Box being used is
lenses-dev and was specified
The previous step is required because you will need to install the Snowflake Connector inside the running Lenses Box.
You will need to download the Snowflake Kafka Connector with the command:
Put the downloaded file in the right directory with:
mv snowflake-kafka-connector-0.4.0.jar /run/connect/connectors/third-party/
Congratulations, you have just installed the Snowflake Kafka Connector jar file by
putting it in the
/run/connect/connectors/third-party/ directory of Lenses Box.
However, you will need to edit
/run/lenses/lenses.conf and add the following text
at the end of the Lenses configuration file in order for Lenses to understand the
Execute the following commands, which are needed for the Snowflake Connector to work:
The last commands that you need to execute is the following:
The former command tells Lenses to search for new Connectors whereas the latter
command restarts Lenses in order to reread the new configuration from
In order to make sure that the Snowflake Kafka Connector is property installed you will need to visit the Connectors link inside the Lenses Box and press the + New Connector button.
The following image proves that the Snowflake Connector is recognized by the Lenses Box and therefore is properly installed.
You can find more information about the Snowflake Kafka Connector here.
As stated before, we are going to use the Debezium Connector to get row-level changes from MySQL to Kafka.
We will now create a new Connector using the Debezium MySQL Connector available in Lenses Box. The Debezium configuration will be the following:
The name of the MySQL database that we are going to watch is
inventory and exists
debezium/example-mysql:0.10 Docker image.
Now, connect to the MySQL database using the command we talked about earlier:
Then, create a new table named
profits to the
inventory database as follows:
Within Lenses, go to the Topics >> schema-changes.inventory
There will be a new record to the
schema-changes.inventory Kafka topic, which
in JSON format will look similar to the following:
This proves that everything works well with Debezium and Lenses.
So, imagine that you want to transfer data from
In this case, you will need to rename that topic into something simpler because it looks like Snowflake hates dot characters in topic names. The easiest way to do that is using a SQL processor (SQL Processor >> New Processor) that will continuously process events and transform them via an SQL statement.
For the purposes of this tutorial, the code for the SQL processor will be the following:
The new name for
dbserver1.inventory.products will be
mytopic – however the
dbserver1.inventory.products still exists. The following figure shows that in
a graphical way.
After the installation of the Snowflake Kafka Connector, the process of connecting Lenses and Snowflake is straightforward.
Create a new Sink using the Snowflake Connector in Connectors >> New Connector >> Snowflake. The configuration for the Snowflake Sink in the Lenses Box will be the following:
In order to get the correct value for
snowflake.private.key you will need to visit
and follow the instructions.
Visiting Snowflake will verify that the data from Lenses Box is sent to Snowflake
successfully. Data is read from
mytopic in Lenses and written to
Snowflake. The Snowflake database and schema used are
Lenses provides a DataOps layer to configure, manage, debug and deploy flows. You can manage these flows via GitOps and leverage Lenses to bake in monitoring, alerting and data compliance. Check out our other blogs for more info.
Get familar with everything you can do with Lenses Box in this 3-min tour