Visualize Kafka data in your web apps with D3.js and SQL

How to query data in an Apache Kafka topic and visualize it in a web application with D3.js

Apr 21, 2020

This tutorial will walk you through visualizing live data in a Kafka topic using the D3.js JavaScript library.

This technique can help make data in Kafka more widely available across your organization, including to engineering and operations teams. It can also simplify integrating data from Kafka into your business applications.

Rather than querying Kafka directly, we will query the data through Lenses.io, which provides a governance portal for Apache Kafka.

Querying data in Kafka through Lenses has a few benefits:

Use of SQL: Lenses.io allows us to use SQL to query, filter and aggregate the data before visualizing it. This will reduce workload on the client side.
Deserialized payloads: Lenses deserializes payload data in Kafka whatever the format (Avro, Protobuf, XML, and even proprietary formats via Serde).
Secure data access via namespaces and security tokens: Data access is protected by role-based access controls built on namespaces, and clients connect with service account tokens. This saves you from having to configure and manage ACLs in Kafka.

Last, note that if you want to visualize JSON data stored in a file, you can have a look at the Using D3 to visualize data blog post.

Prerequisites

To follow the steps of this tutorial, you will need the following:

Lenses up and running. You can use our free Lenses Box instance if necessary.
An Internet connection, so that you can download D3.js and create the sample project.

The Scenario

We are going to read live data from a Kafka topic using the Lenses API and a service account, and then visualize it using D3.js. The Kafka topic that will be used is called nyc_yellow_taxi_trip_data.

The Implementation

This section presents the steps needed to implement the described scenario, starting with Lenses.

About Lenses

The nyc_yellow_taxi_trip_data Kafka topic contains records with the following kind of format (represented as JSON records):

The query that is going to be executed in the JavaScript code is the following:

Each record returned by the previous query contains 4 fields as defined in the SELECT statement.

Creating a Lenses Service Account

You will need to perform two steps in order to create a service account in Lenses:

Create a new Group that matches the requirements of that service account.
Create the service account and keep its token.

Note that the name of the service account should be unique in a Lenses installation.

You can find more information about creating a Lenses Service Account in this blog post. The name of the service account used in this tutorial will be service.

How to create the JavaScript project

In order to create the JavaScript project, you will need to execute some commands.

First, you should execute git clone https://github.com/wbkd/webpack-starter.git inside the root directory of your project.

Then, you will need to make changes to the following files:

package.json
./src/index.html
./src/scripts/index.js

After that you will need to execute one of the following two commands depending on the JavaScript package manager that you are using:

npm install if you are using the npm JavaScript package manager.
yarn if you are using the yarn JavaScript package manager.

If you decide to use our GitHub repository instead, execute the following command to create the project:

git clone git@github.com:mactsouk/d3-service.git

You might need to make changes to ./src/scripts/index.js to match your Lenses installation or define your own SQL query.

The last thing you should do is execute npm run start to start your application. If you are using yarn, execute yarn start instead.

How to get data from Lenses

The following JavaScript code is used for getting the data from Lenses.

To get data from Lenses, you will need to create a WebSocket connection. Because we are using a service account, there is no need to log in to Lenses through the Lenses REST API to obtain an authentication token.
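As a rough illustration of the paragraph above, the sketch below shows how a query might be sent over a WebSocket using only a service account token. The endpoint path /api/ws/v2/sql/execute, the token, sql and live field names, and all helper names are assumptions made for illustration and are not confirmed by this post; your Lenses version may expect a different path or additional fields.

```javascript
// Sketch only: the endpoint path and the message fields below are assumptions
// about the Lenses WebSocket SQL API; check them against your Lenses version.
function buildSqlSocketUrl(host, secure = false) {
  const scheme = secure ? "wss" : "ws";
  return `${scheme}://${host}/api/ws/v2/sql/execute`;
}

// Build the first message sent after the socket opens.
// The service account token is passed as-is, so no separate login call is made.
function buildFirstMessage(serviceToken, sql) {
  return {
    token: serviceToken, // assumed field name
    sql: sql,            // the SQL query to execute
    live: false,         // assumed flag: run a bounded query, not a continuous one
  };
}

// Open the connection and send the first message once the socket is open.
// WebSocketImpl defaults to the browser's global WebSocket constructor.
function openQuery(url, firstMessage, WebSocketImpl = globalThis.WebSocket) {
  const socket = new WebSocketImpl(url);
  socket.onopen = () => socket.send(JSON.stringify(firstMessage));
  return socket;
}
```

In a browser, this could be called as openQuery(buildSqlSocketUrl(host), buildFirstMessage(token, sql)), where host, token and sql come from your own Lenses installation. Keeping the message construction in a separate function makes it easy to inspect what is sent without opening a real connection.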

The Format of the Data

In this section you will learn more about the format of the JSON records read from Lenses using JavaScript. The records returned by the SQL query have the following format:

Once you know the format of the data, you can easily choose the fields that interest you and include them in the visualization.

The returned data will be processed in the JavaScript code as follows:

This means that the sum field will contain the amount of money paid per trip.

Visualizing Data

This section will show the core JavaScript code used for visualizing the data. As mentioned before, we are going to use the D3.js library.

The JavaScript code that draws the dots and generates the tooltips is the following:

The radius of each circle depends on the number of passengers – the more passengers the taxi had, the bigger the circle.
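To make the relationship between passengers and circle size concrete, here is a minimal sketch of such a mapping as a plain function. The function name, the base radius, the step per passenger and the cap are all illustrative values chosen for this example, not taken from the project.

```javascript
// Map a passenger count to a circle radius in pixels.
// base, step and maxRadius are illustrative defaults, not the project's values.
function radiusForPassengers(passengers, base = 3, step = 2, maxRadius = 15) {
  // Treat missing, non-numeric or negative counts as zero passengers.
  const count = Number.isFinite(passengers) && passengers > 0 ? passengers : 0;
  // Grow linearly with the passenger count, but never exceed maxRadius.
  return Math.min(base + step * count, maxRadius);
}
```

With D3.js, a function like this could be passed to a circle selection as .attr("r", d => radiusForPassengers(d.passengers)), where d.passengers is a hypothetical field name for the passenger count. Capping the radius keeps a single unusually large value from covering neighboring dots.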

The Final Output

In this section we are going to see the generated visualization. Note that each time you load the HTML file, you will get a slightly different output as we are working with dynamic data.

Presenting the Project Files

This section will present the main files of the project.

The `package.json` file

The contents of package.json are the following:

The dependencies block is where you define the JavaScript packages that need to be downloaded for the project to execute.

The `./src/index.html` file

The contents of ./src/index.html are the following:

The embedded CSS code is required for the correct rendering of the HTML output.

The `./src/scripts/index.js` file

The ./src/scripts/index.js file is the most important file of the project:

This file contains all the JavaScript code. Its flow goes like this:

Create the WebSocket connection as defined in the firstMessage object.
The message is sent using the webSocketRequest.send(JSON.stringify(firstMessage)) call.
The onmessage() method is executed each time we receive new data from the WebSocket connection. It runs many times because we are expecting to get multiple JSON records back.
Each call receives a streamEvent object that has a data attribute. That data attribute has a type property, which can be RECORD, END or ERROR.
When we are dealing with a new RECORD, we add it to the existing records.
When there is an ERROR, we print it in the console.
When we are dealing with END, we terminate the connection.
After the connection is terminated and all data has been read, the finalize() callback in websocketSubject.pipe(finalize(...)) runs, and the data stored in the websocketData variable is ready to use.
The websocketData variable is used by D3.js to visualize the desired data.
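The handling of the three message types in the flow above can be sketched as a small function that folds each incoming event into an accumulated state. This is not the project's code: it uses a plain function instead of the websocketSubject pipeline, and it assumes that each message carries its payload under a data property next to type. The names handleStreamEvent and initialState are hypothetical.

```javascript
// Starting state: no records collected yet, stream not finished.
const initialState = { records: [], done: false };

// Fold one incoming WebSocket event into the accumulated state.
// Assumption: each message is JSON with a type of RECORD, END or ERROR
// and its payload under data; adjust to the actual Lenses message shape.
function handleStreamEvent(state, streamEvent) {
  // Raw WebSocket events carry a JSON string; some wrappers pre-parse it.
  const message = typeof streamEvent.data === "string"
    ? JSON.parse(streamEvent.data)
    : streamEvent.data;

  switch (message.type) {
    case "RECORD":
      // Add the new record to the existing records.
      return { ...state, records: [...state.records, message.data] };
    case "ERROR":
      // Print the error in the console and keep the state unchanged.
      console.error("Query error:", message.data);
      return state;
    case "END":
      // Signal that the caller should terminate the connection.
      return { ...state, done: true };
    default:
      // Ignore any other message type.
      return state;
  }
}
```

Inside an onmessage handler, the caller would keep a state variable, assign it the result of handleStreamEvent(state, streamEvent) for each event, and close the socket once state.done is true. At that point state.records plays the role that websocketData plays in the project.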

The `.eslintrc` file

The contents of .eslintrc are the following:

Next steps

Now that you know how to visualize live data from Kafka topics through Lenses and D3.js, you should begin visualizing data from your own Kafka topics.