Kafka Connect is a great framework for moving data in and out of Kafka. But doing so requires connections to some of your most important data, your crown jewels: customer information held in MongoDB, audits in an S3 bucket, payroll information in an Oracle database.
Connecting to these data stores requires authentication. And you don't exactly need to be Matthew Broderick in WarGames to know that authentication information, especially passwords, should not be stored in clear text, whether in config, in code, or on disk for that matter.
By default, most Kafka Connect connectors require you to enter username and password information in clear text. This is unlikely to please your risk team, and if your project is remotely strategic, chances are it won't be signed off.
Keeping all your secrets locked up in a key store such as Azure Key Vault is the way to go. We have developed a small plugin that allows you to manage your Kafka Connect secrets with Azure Key Vault. Here is a quick guide to setting up a MongoDB connector, which can also be applied to any other Kafka connector that requires username and password information.
Pre-requisites
To follow the steps in this tutorial, you will need the following:
Lenses up and running. You can use our free Lenses Box instance if necessary. Box has an instance of Kafka and Kafka Connect in a single Docker container.
A Kafka Connect instance running - every Lenses Box comes with Kafka Connect up and running.
A properly configured and accessible Azure Key Vault with the desired keys.
A properly configured and running MongoDB server that will be accessible from the Lenses machine.
The Gory Details
A Secret Provider is a place where you can keep sensitive information. External secret providers allow indirect references to be placed in an application's configuration, so that secrets are not publicly exposed. In order for a Secret Provider plugin to work, it must be made available to the Connect workers in the cluster. This requires some extra configuration, which is what this tutorial is all about.
The recommended way to add a plugin to Connect is to use the `plugin.path` configuration and place the `jar` files in a separate folder under this path, which provides classloader isolation. However, for Azure, we need to add the relevant `jar` to the `CLASSPATH` variable instead, since the Azure SDK makes use of a service loader.
The Scenario
The Kafka Connector will need to authenticate to the MongoDB instance without exposing the username and password information in the connector configuration in clear text.
Both MongoDB username and password are already stored in Azure Key Vault.
The Implementation
Note that most of the configuration will be done from a terminal inside the Lenses Box container, unless you are running Lenses on its own Linux machine.
Provided that the name of the Docker container that runs Lenses Box is `lenses-dev`, execute the next command to connect to that Docker container and run the `bash` shell:
docker exec -it lenses-dev bash
At this point it would be useful to execute the following command to install the `vim` editor:
apk add vim
About Azure Key Vault
You will need to customise some settings in your Kafka Connect config for it to connect to Azure Key Vault, so you will need to get certain information from your Azure Key Vault administrator before moving forward in this guide. The following is the data that will be put in the worker properties file, which we'll describe in more detail later:
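As a sketch, the relevant worker properties look like this, assuming the Lenses Azure Secret Provider plugin; the class and parameter names are taken from the lensesio/secret-provider project and may vary between versions, and all values shown are placeholders:

```properties
# Register a config provider named "azure" (credential values are placeholders).
config.providers=azure
config.providers.azure.class=io.lenses.connect.secrets.providers.AzureSecretProvider
config.providers.azure.param.azure.auth.method=credentials
config.providers.azure.param.azure.client.id=your-client-id
config.providers.azure.param.azure.secret.id=your-secret-id
config.providers.azure.param.azure.tenant.id=your-tenant-id
# Directory used when secrets are written to disk (see the section on
# handling base64 values below).
config.providers.azure.param.file.dir=/tmp/connector-files/azure
```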
Note that for reasons of security we are not displaying the actual values of the `client.id`, `secret.id` and `tenant.id` keys. Ask your Azure administrator for the appropriate values of these keys for your own Azure Key Vault installation and substitute them.
Each Azure Key Vault secret reference has the form `${[provider]:[keyvault]:[secret-name]}`, where `[provider]` is the name we set for the Azure Secret Provider in the worker properties (`azure`), `[keyvault]` is the URL of the Key Vault in Azure without the `https://` protocol, because Kafka Connect uses `:` as its own separator, and `[secret-name]` is the name of the secret/key that holds the value we want in the Key Vault.
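As a toy sketch of why the `https://` prefix must be dropped, the reference is simply split on `:`. This is an illustration with a placeholder vault hostname, not Connect's actual resolver:

```shell
# Split a ${azure:[keyvault]:[secret-name]} reference on ':' -- with an
# embedded "https://" the extra colons would corrupt this parse.
ref='azure:my-vault.vault.azure.net:mongodb-password'
provider="${ref%%:*}"                               # name of the secret provider
secret_name="${ref##*:}"                            # key inside the vault
keyvault="${ref#*:}"; keyvault="${keyvault%:*}"     # vault host, no scheme
echo "$provider $keyvault $secret_name"
```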
Connecting Kafka Connect from Lenses Box with Azure Key Vault
All the keys from the previous section should be put at the end of the `/run/connect/connect-avro-distributed.properties` file (the worker properties file) in the Lenses Box container.
What this does is tell Kafka Connect that we want to use a configuration provider, and that the provider to install is the Azure Secret Provider.
The Azure Secret Provider can be configured to either use Azure Managed Service Identity to connect to a Key Vault or a set of credentials for an Azure Service Principal.
Getting the Secret Provider
In this step, we are going to download a `jar` file from an existing GitHub project and put it in the right place. Doing that requires executing the following commands:
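A sketch of those commands, assuming the plugin jar comes from the lensesio/secret-provider GitHub releases; the version number and target directory here are assumptions, so check the project's releases page and your worker's startup script for the right values:

```shell
# Version and paths below are illustrative assumptions.
VERSION=2.1.6
wget "https://github.com/lensesio/secret-provider/releases/download/${VERSION}/secret-provider-${VERSION}-all.jar"
# The Azure SDK uses a service loader, so the jar goes on the
# CLASSPATH rather than under plugin.path:
mkdir -p /opt/secret-provider
mv "secret-provider-${VERSION}-all.jar" /opt/secret-provider/
export CLASSPATH="/opt/secret-provider/secret-provider-${VERSION}-all.jar"
```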
You should now execute the next command for changes to take effect:
supervisorctl update connect-distributed
The Azure Key Vault keys and values that will be used
The following information is stored in Azure Key Vault:
A key named `mongodb-username` with a value of `username`.
A key named `mongodb-password` with a value of `password`.
The interesting thing is that you do not really need to know the values of `mongodb-username` and `mongodb-password` that are stored in Azure Key Vault. You only need to know the names of the keys, because these will be used in the last step, which is the configuration of the MongoDB connector.
Kafka Connect Configuration
We are now ready to create and use a Kafka connector that will write data to a MongoDB database that resides on the `mongo` machine.
Go to the Lenses UI and select `Connectors`, then click on `+ New Connector`. On the next screen select the Mongo Sink Connector. In the `Configure Connector` text box put the following information (yours might vary depending on the keys stored in Azure Key Vault, the hostname of the Key Vault, and so on):
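A sketch of what that configuration might look like, assuming the Stream Reactor Mongo sink connector; the topic name, database name, and Key Vault hostname are placeholders:

```properties
name=mongo-sink
connector.class=com.datamountaineer.streamreactor.connect.mongodb.sink.MongoSinkConnector
tasks.max=1
topics=orders
connect.mongo.connection=mongodb://mongo:27017
connect.mongo.db=lenses
# Indirect references, resolved by the Azure Secret Provider at runtime:
connect.mongo.username=${azure:my-vault.vault.azure.net:mongodb-username}
connect.mongo.password=${azure:my-vault.vault.azure.net:mongodb-password}
connect.mongo.kcql=INSERT INTO orders SELECT * FROM orders
```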
Note that none of them contains any sensitive information in clear text! Once they are submitted, the Worker will resolve their values as stored in Azure Key Vault and make them available to the tasks so they can connect securely.
Now press the `Create Connector` button and you are done! This will take you to the next screen.
Press the `TASK-0` link to see how your MongoDB sink is doing.
We are done!
Handling base64 values
Although not used here, it is good to know that the plugin can handle `base64` values, as well as secret files stored to disk if required.
For example, a connector may require a `pem` file, which can be stored securely in Key Vault and downloaded for use in the connector. Extra care must be taken to secure the directory these secrets are stored in.
The Azure Secret Provider uses the `file-encoding` tag to determine this behaviour. The value for this tag can be:
`UTF8`, which means that the value returned is the string retrieved for the secret key.
`UTF8_FILE`, which means that the string contents will be written to a file. The value returned from the connector configuration key will be the location of the file. The file location is determined by the `file.dir` configuration option given to the provider via the worker properties file.
`BASE64`, which means that the value returned is the base64-decoded string retrieved for the secret key.
`BASE64_FILE`, which means that the contents are base64-decoded and written to a file. The value returned from the connector configuration key will be the location of the file. The file location is determined by the `file.dir` configuration option given to the provider via the worker properties file.
If no tag is found, the contents of the secret string are returned.
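As a sketch, such a tag can be attached when the secret is stored, here via the Azure CLI; the vault name, secret name, and file are placeholders:

```shell
# Store a pem file base64-encoded and tag it so the provider writes it
# back to disk for the connector (all names are placeholders).
az keyvault secret set \
  --vault-name my-vault \
  --name mongodb-ca-cert \
  --file ca.pem \
  --encoding base64 \
  --tags file-encoding=BASE64_FILE
```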
Next steps
Manage and operate all your Kafka Connect connectors and Kafka with the Lenses.io application and data operations portal. Get started for free at https://lenses.io/start/.