This technique can be used to make more of the data in Kafka available across your organization, including to engineering and operations teams. It can also simplify the integration of Kafka data into your business applications.
But rather than querying the data directly against Kafka, we will query via Lenses.io. Lenses provides a governance portal for Apache Kafka.
Querying data in Kafka through Lenses has a few benefits:
Use of SQL: Lenses.io allows us to use SQL to query, filter and aggregate the data before visualizing it. This will reduce workload on the client side.
Deserialized payloads: Lenses deserializes the payload data in Kafka whatever the format (AVRO, Protobuf, XML, even proprietary formats via Serde).
Secure data access via namespaces and security tokens: Data access is protected by role-based access controls based on namespaces, and clients authenticate with service account tokens. This saves you from having to configure and manage ACLs in Kafka.
Finally, if you want to visualize JSON data stored in a file, have a look at the "Using D3 to visualize data" blog post.
In order to be able to follow the steps of this tutorial, you will need the following:
Lenses up and running. You can use our free Lenses Box instance if necessary.
An Internet connection, which is needed to download D3.js and create the sample project.
We are going to read live data from a Kafka topic using Lenses API and a service account and visualize it using D3.js. The Kafka topic that will be used is called
This section will present the steps needed for implementing the described scenario beginning from Lenses.
The nyc_yellow_taxi_trip_data Kafka topic contains records in the following format (represented as JSON records):
Each record returned by the previous query contains 4 fields as defined in the
You will need to perform two steps in order to create a service account in Lenses:
Create a new Group that matches the requirements of that service account.
Create the service account and keep its token.
Note that the name of the service account should be unique in a Lenses installation.
You can find more information about creating a Lenses Service Account in this blog post. The name of the service account used in this tutorial will be
First, you should execute git clone https://github.com/wbkd/webpack-starter.git inside the root directory of your project.
Then, you will need to make changes to the following files:
If you decide to use our GitHub repository, you will need to execute the following command for creating the project:
git clone git@github.com:mactsouk/d3-service.git
You might need to make changes to ./src/scripts/index.js to match your Lenses installation or define your own SQL query.
The last thing you should do is execute npm run start to begin your application. If you are using yarn, execute yarn start instead.
In order to get data from Lenses, you will need to create a WebSocket connection. As we are using a service account, there is no need to log in to Lenses through the Lenses REST API to obtain an authentication token; the service account token is used instead.
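A minimal sketch of opening such a connection is shown below. It is an illustration, not the project's actual code: the endpoint path /api/ws/v2/sql/execute, the shape of the first message (a token and a sql field), and the helper names buildSqlRequest and openQuery are assumptions that should be checked against the API documentation of your Lenses version.

```javascript
// Build the connection details for a SQL query sent to Lenses over WebSocket.
// ASSUMPTION: the endpoint path and the { token, sql } message shape are
// illustrative and must be verified against your Lenses version.
function buildSqlRequest({ host, token, sql, secure = false }) {
  const scheme = secure ? 'wss' : 'ws';
  return {
    url: `${scheme}://${host}/api/ws/v2/sql/execute`,
    firstMessage: JSON.stringify({ token, sql }),
  };
}

// Open the socket and send the query as soon as the connection is ready.
// WebSocketImpl is injected so the same code works with the browser's
// WebSocket class or with a test double.
function openQuery(WebSocketImpl, request, onMessage) {
  const socket = new WebSocketImpl(request.url);
  socket.onopen = () => socket.send(request.firstMessage);
  socket.onmessage = onMessage;
  return socket;
}
```

In a browser this could be called as openQuery(WebSocket, buildSqlRequest({ host, token, sql }), handler), where host, token and sql come from your own Lenses installation and service account.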
Once you know the format of the data, you can easily choose the fields that interest you and that will be included in the visualization.
This means that the sum field will contain the amount of money paid per trip.
The radius of each circle depends on the number of passengers – the more passengers the taxi had, the bigger the circle.
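The mapping from a record to a circle can be sketched as a small pure function. This is an illustration under stated assumptions rather than the project's actual code: the passenger_count field name, the pixel constants, and the helper names radiusFor and toCircle are hypothetical; only the sum field is taken from the text above.

```javascript
// Radius grows linearly with the number of passengers.
// ASSUMPTION: basePx and stepPx are arbitrary illustrative constants.
function radiusFor(passengers, basePx = 4, stepPx = 3) {
  // Treat missing, non-numeric or negative counts as zero passengers.
  const count = Math.max(0, Number(passengers) || 0);
  return basePx + stepPx * count;
}

// Keep only the fields the chart needs and precompute the radius.
// ASSUMPTION: passenger_count is a hypothetical field name.
function toCircle(record) {
  return {
    sum: Number(record.sum),
    r: radiusFor(record.passenger_count),
  };
}

console.log(toCircle({ sum: 12.5, passenger_count: 2 })); // { sum: 12.5, r: 10 }
```

A linear mapping keeps the sketch simple; a square-root mapping would be a reasonable alternative if the circle area, rather than the radius, should be proportional to the number of passengers.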
In this section we are going to see the generated visualization. Note that each time you load the HTML file, you will get a slightly different output as we are working with dynamic data.
This section will present the main files of the project.
The contents of package.json are the following:
The contents of ./src/index.html are the following:
The embedded CSS code is required for the correct rendering of the HTML output.
The ./src/scripts/index.js file is the most important file of the project:
Create the WebSocket connection as defined in the
The message is sent using the
The onmessage() method is executed each time new data arrives over the WebSocket connection. This happens repeatedly because we expect to get multiple JSON records back.
streamEvent is an object that has a data attribute. That data attribute has a type property, which can be
When we are dealing with a new RECORD, we add it to the existing records.
When there is an ERROR, we print it in the console.
When we are dealing with END, we terminate the connection.
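The three cases above can be sketched as a small reducer plus a wrapper that closes the socket on END. This is a hedged illustration, not the file's actual contents: the helper names handleStreamEvent and collectRecords are hypothetical, and the sketch assumes that the record payload sits under a data property of the parsed message.

```javascript
// Fold one incoming WebSocket message into the accumulated state.
// ASSUMPTION: streamEvent.data is either a JSON string or an already parsed
// object with a `type` property and the payload under `data`.
function handleStreamEvent(state, streamEvent, log = console) {
  const message = typeof streamEvent.data === 'string'
    ? JSON.parse(streamEvent.data)
    : streamEvent.data;
  switch (message.type) {
    case 'RECORD': // add the new record to the existing ones
      return { ...state, records: [...state.records, message.data] };
    case 'ERROR': // report the problem and keep what we have so far
      log.error(message.data);
      return state;
    case 'END': // signal that the connection should be terminated
      return { ...state, done: true };
    default: // ignore message types this sketch does not handle
      return state;
  }
}

// Attach the handler to a socket and hand over the records once END arrives.
function collectRecords(socket, onDone) {
  let state = { records: [], done: false };
  socket.onmessage = (streamEvent) => {
    state = handleStreamEvent(state, streamEvent);
    if (state.done) {
      socket.close();
      onDone(state.records);
    }
  };
}
```

Keeping the message handling in a function that returns a new state makes it easy to exercise without a live Lenses connection.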
After the connection is terminated and all data has been read,
websocketSubject.pipe(finalize(()) will return the data that is stored in the
websocketData variable is used by D3.js to visualize the desired data.
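How the collected data might be handed to D3.js can be sketched as follows. The layout (evenly spaced circles on one horizontal line), the option names gap and height, and the function name drawCircles are assumptions made for illustration; the sketch also assumes a D3 version that provides selection.join and that each item already carries a precomputed radius r.

```javascript
// Draw one circle per item inside an existing <svg> element.
// ASSUMPTION: the layout is illustrative; `d3` is passed in explicitly so the
// function does not depend on how D3.js is loaded in your project.
function drawCircles(d3, svgSelector, circles, { gap = 40, height = 200 } = {}) {
  return d3
    .select(svgSelector)
    .selectAll('circle')
    .data(circles)
    .join('circle') // create, update or remove circles to match the data
    .attr('cx', (d, i) => gap * (i + 1)) // spread circles along the x axis
    .attr('cy', height / 2) // keep them on one horizontal line
    .attr('r', (d) => d.r); // radius computed beforehand per record
}
```

With D3.js loaded in the page, a call such as drawCircles(d3, 'svg', items) would be made once all records have been received.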
The contents of .eslintrc are the following:
Now that you know how to visualize live data from Kafka topics through Lenses and D3.js, you should begin visualizing data from your own Kafka topics.