Kafka Connect is a great framework for moving data in and out of Kafka. But doing so requires connecting to some of your most important data, your crown jewels: customer information held in MongoDB, audits in an S3 bucket, payroll information in an Oracle database.
Connecting to these data stores requires authentication. And you don't exactly need to be Matthew Broderick in WarGames to know that authentication information, especially passwords, should not be stored in clear text in config, in code, or on disk for that matter.
By default, most Kafka Connect connectors require you to enter username and password information in clear text. This is unlikely to please your risk team, and chances are that, if your project is remotely strategic, it won't be signed off.
Keeping all your secrets locked up in a key store such as Azure Key Vault is the way to go. We have developed a small plugin that allows you to manage your Kafka Connect secrets with Azure Key Vault. Here is a quick guide to setting up a MongoDB connector, which can also be applied to any other Kafka connector that requires username and password information.
Pre-requisites
In order to be able to follow the steps of this tutorial, you will need the following:
Lenses up and running. You can use our free Lenses Box instance if necessary. Box has an instance of Kafka and Kafka Connect in a single Docker container.
A Kafka Connect instance running - every Lenses Box comes with Kafka Connect up and running.
A properly configured and accessible Azure Key Vault with the desired keys.
A properly configured and running MongoDB server that will be accessible from the Lenses machine.
The Gory Details
A Secret Provider is a place where you can keep sensitive information. External secret providers allow for indirect references to be placed in an application's configuration, so that secrets are not publicly exposed. In order for a Secret Provider plugin to work, it should be made available to the Connect workers in the cluster – this requires some extra configuration, which is what this tutorial is all about.
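To illustrate the idea, an indirect reference replaces a clear-text value in the connector configuration. A sketch with hypothetical names (the exact placeholder syntax is explained later in this guide):

```
# Clear text - what we want to avoid:
connect.mongo.password=SuperSecret123

# Indirect reference - resolved by the secret provider at runtime:
connect.mongo.password=${azure:my-vault.vault.azure.net:mongodb-password}
```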
The recommended way to add a plugin to Connect is to use the plugin.path configuration and place the plugin in a separate folder under this path – this provides classloader isolation. However, for Azure, we need to add the relevant jar to the classpath variable, since the Azure SDK makes use of a service loader.
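As a sketch, the difference between the two approaches looks like this (paths are illustrative):

```
# Usual approach: classloader-isolated plugins under plugin.path
# (not suitable for this secret provider):
#   plugin.path=/usr/share/connect-plugins

# Azure Secret Provider: put the jar on the worker's classpath instead,
# e.g. in the environment of the process that starts Connect:
export CLASSPATH="/connectors/secret-provider-0.0.1-all.jar"
```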
The Scenario
The Kafka Connector will need to authenticate to the MongoDB instance without exposing the username and password information in the connector configuration in clear text.
Both MongoDB username and password are already stored in Azure Key Vault.
The Implementation
Note that most of the configuration will be done via a terminal into the Lenses Box container, unless you are running Lenses on its own Linux machine.
Provided that the name of the Docker container that runs Lenses Box is lenses-dev, you should execute the next command to connect to that Docker container and run the bash shell:
docker exec -it lenses-dev bash
At this point it would be useful to execute the following command in order to install the vim editor:
apk add vim
About Azure Key Vault
You will need to customise some settings in your Kafka Connect config for it to connect to Azure Key Vault. Therefore, you will need to get certain information from your Azure Key Vault administrator in order to move forward in this guide. The following is the data that will be put in the worker properties file, which we'll describe in more detail later:
```
config.providers=azure
config.providers.azure.class=io.lenses.connect.secrets.providers.AzureSecretProvider
config.providers.azure.param.azure.auth.method=credentials
config.providers.azure.param.azure.client.id=12341-312314-54123123-abc
config.providers.azure.param.azure.secret.id=12343123-12312aabca-123124
config.providers.azure.param.azure.tenant.id=1234-523432-1213124
config.providers.azure.param.file.dir=/connector-files/azure
```
Note that for reasons of security we are not displaying the actual values of the presented client.id, secret.id and tenant.id keys. You should ask your Azure administrator to give you the appropriate values for all these keys for your own Azure Key Vault installation and substitute them.
Each Azure Key Vault secret reference has the form ${[provider]:[keyvault]:[secret-name]}, where [provider] is the name we set for the Azure Secret Provider in the worker properties (azure), [keyvault] is the URL of the Key Vault in Azure without the https:// protocol, because Kafka Connect uses : as its own separator, and [secret-name] is the name of the secret/key that holds the value we want in the Key Vault.
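For example, with hypothetical names, the parts map as follows:

```
# ${[provider]:[keyvault]:[secret-name]}
#   [provider]    -> azure (the name registered under config.providers)
#   [keyvault]    -> my-vault.vault.azure.net (no https:// prefix)
#   [secret-name] -> db-password
db.password=${azure:my-vault.vault.azure.net:db-password}
```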
Connecting Kafka Connect from Lenses Box with Azure Key Vault
All the keys from the previous section should be put at the end of the /run/connect/connect-avro-distributed.properties file (the worker properties file) in the Lenses Box container.
What this does is tell Kafka Connect that we want to use a configuration provider, that we want an Azure Secret Provider to be installed under the name azure, and the class to instantiate for this provider (io.lenses.connect.secrets.providers.AzureSecretProvider), along with its properties.
The Azure Secret Provider can be configured to either use Azure Managed Service Identity to connect to a Key Vault or a set of credentials for an Azure Service Principal.
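As a sketch, the two options look like this in the worker properties. The credentials method is the one used in this tutorial; the exact value for the Managed Service Identity method is an assumption to verify against the plugin's documentation:

```
# Authenticate with an Azure Service Principal (used in this tutorial):
config.providers.azure.param.azure.auth.method=credentials
config.providers.azure.param.azure.client.id=<client-id>
config.providers.azure.param.azure.secret.id=<client-secret>
config.providers.azure.param.azure.tenant.id=<tenant-id>

# Or rely on Azure Managed Service Identity when running inside Azure
# (assumed value; check the secret provider documentation):
# config.providers.azure.param.azure.auth.method=default
```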
Getting the Secret Provider
In this step, we are going to download a jar file from an existing GitHub project and put it in the right place. Doing that requires executing the following commands:
```
wget https://github.com/lensesio/secret-provider/releases/download/0.0.1/secret-provider-0.0.1-all.jar
cp ./secret-provider-0.0.1-all.jar /connectors
```
The last step is needed for putting the downloaded jar file into the /connectors directory of Lenses Box.
With this plugin we can provide an indirect reference for the Kafka Connect worker to resolve at runtime.
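A quick sanity check that the jar is where the worker expects it:

```
ls -l /connectors/secret-provider-0.0.1-all.jar
```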
Telling Kafka Connect about secret-provider-0.0.1-all.jar
Now, we will need to edit /etc/supervisord.d/05-connect-distributed.conf in order to inform Kafka Connect about secret-provider-0.0.1-all.jar.
The final contents of /etc/supervisord.d/05-connect-distributed.conf will be as follows:
```
[program:connect-distributed]
user=nobody
environment=JMX_PORT=9584
command=bash -c 'eval /usr/local/share/landoop/wait-scripts/wait-for-registry.sh; export CLASSPATH="/connectors/secret-provider-0.0.1-all.jar"; exec /opt/landoop/kafka/bin/connect-distributed /var/run/connect/connect-avro-distributed.properties'
redirect_stderr=true
stdout_logfile=/var/log/connect-distributed.log
startretries=5
```
What was added in /etc/supervisord.d/05-connect-distributed.conf is the following:
export CLASSPATH="/connectors/secret-provider-0.0.1-all.jar";
You should now execute the next command for changes to take effect:
supervisorctl update connect-distributed
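To confirm that the worker restarted with the new classpath, you can check the process status and tail its log (the log path comes from the supervisor config above):

```
supervisorctl status connect-distributed
tail -f /var/log/connect-distributed.log
```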
The Azure Key Vault keys and values that will be used
The following information is stored in Azure Key Vault:
A key named mongodb-username with a value of username.
A key named mongodb-password with a value of password.
The interesting thing is that you do not really need to know the values of mongodb-username and mongodb-password that are stored in Azure Key Vault. All you need to know is the names of the keys, because these will be used in the last step, which is the configuration of the MongoDB connector.
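For reference, if you administer the vault yourself, the two secrets could be created with the Azure CLI like this (illustrative values; the vault name matches the one used later in this guide):

```
az keyvault secret set --vault-name lenses-euw-keyvault \
  --name mongodb-username --value 'username'
az keyvault secret set --vault-name lenses-euw-keyvault \
  --name mongodb-password --value 'password'
```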
Kafka Connect Configuration
We are now ready to create and use a Kafka Connector that will write data to a MongoDB database that resides on the mongo machine.
Go to the Lenses UI and select Connectors, and after that click on + New Connector. On the next screen select the Mongo Sink Connector. In the Configure Connector text box put the following information (yours might vary depending on the keys stored in Azure Key Vault, the hostname of the Key Vault, etc.):
```
connector.class=com.datamountaineer.streamreactor.connect.mongodb.sink.MongoSinkConnector
connect.mongo.retry.interval=60000
errors.log.include.messages=false
tasks.max=1
connect.mongo.password=${azure:lenses-euw-keyvault.vault.azure.net:mongodb-password}
errors.deadletterqueue.context.headers.enable=false
connect.mongo.auth.source=admin
connect.mongo.connection=mongodb://mongo:27017/?authSource=admin
connect.mongo.auth.mechanism=SCRAM-SHA-256
errors.deadletterqueue.topic.replication.factor=3
value.converter=org.apache.kafka.connect.json.JsonConverter
config.action.reload=restart
errors.log.enable=false
key.converter=org.apache.kafka.connect.json.JsonConverter
errors.retry.timeout=0
topics=backblaze_smart
connect.mongo.username=${azure:lenses-euw-keyvault.vault.azure.net:mongodb-username}
errors.retry.delay.max.ms=60000
connect.progress.enabled=true
connect.mongo.batch.size=10
key.converter.schemas.enable=false
connect.mongo.kcql=INSERT INTO sysstatshdd SELECT * FROM backblaze_smart
connect.mongo.error.policy=THROW
value.converter.schemas.enable=false
name=mongo-sink3
errors.tolerance=none
connect.mongo.db=admin
connect.mongo.max.retries=30
```
The MongoDB username is represented by the next line:
connect.mongo.username=${azure:lenses-euw-keyvault.vault.azure.net:mongodb-username}
Similarly, the MongoDB password is represented by the next line:
connect.mongo.password=${azure:lenses-euw-keyvault.vault.azure.net:mongodb-password}
Note that none of them contains any sensitive information in clear text! Once they are submitted, the Worker will resolve their values as stored in Azure Key Vault and make them available to the tasks so they can connect securely.
Now press the Create Connector button and you are done! This will take you to the next screen:

Press on the TASK-0 link to see how your MongoDB Sink is doing:

We are done!
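If you want to double-check that no secret leaked in clear text, you can ask the Kafka Connect REST API for the stored connector configuration; the placeholders come back unresolved (assuming the worker listens on the default port 8083):

```
curl -s http://localhost:8083/connectors/mongo-sink3/config
# The response should still contain the indirect references, e.g.:
# "connect.mongo.password": "${azure:lenses-euw-keyvault.vault.azure.net:mongodb-password}"
```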
Handling base64 values
Although not used here, it is good to know that the plugin has the ability to handle base64 values as well as secret files stored to disk if required.
For example, a connector may require a pem file, which can be stored securely in Key Vault and downloaded for use in the connector. Extra care must be taken to secure the directory these secrets are stored in.
The Azure Secret Provider uses the file-encoding tag to determine this behaviour. The value for this tag can be:
UTF8, which means that the value returned is the string retrieved for the secret key.
UTF8_FILE, which means that the string contents will be written to a file. The returned value from the connector configuration key will be the location of the file. The file location is determined by the file.dir configuration option given to the provider via the worker properties file.
BASE64, which means the value returned is the base64 decoded string retrieved for the secret key.
BASE64_FILE, which means that the contents are base64 decoded and written to a file. The returned value from the connector configuration key will be the location of the file. The file location is determined by the file.dir configuration option given to the provider via the Connect worker file.
If no tag is found, the contents of the secret string are returned.
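As an example, a hypothetical pem certificate could be stored base64-encoded and tagged so the provider writes it to a file under file.dir (Azure CLI flags as documented for az keyvault secret set; tag semantics per the provider behaviour described above):

```
az keyvault secret set --vault-name lenses-euw-keyvault \
  --name mongodb-ca-cert --file ca.pem --encoding base64 \
  --tags file-encoding=BASE64_FILE
```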
Next steps
Manage and operate all your Kafka Connect connectors and Kafka with the Lenses.io application and data operations portal. Get started for free at https://lenses.io/start/.