Apache Kafka has grown from an obscure open-source project to a mass-adopted streaming technology, supporting all kinds of organizations and use cases.
Many began their Apache Kafka journey by feeding a data warehouse for analytics, then moved on to building event-driven applications and breaking down entire monoliths.
Now we move to the next chapter. By joining Celonis, we are pleased to open up the possibility of real-time process mining and business execution with Kafka.
The Celonis Execution Management System (EMS) enables organizations to execute on their business data. The Celonis EMS turns process data into intelligent actions by allowing customers to visualize their business processes, understand where there are gaps and inefficiencies, and automate corrective action without human intervention.
Traditionally, organizations have depended on transactional data, such as that held in ERPs, CRMs and other cloud business services, to understand their business and act accordingly.
But today we live in a world of events. While transactions are useful, much can happen between them, and whole processes now run outside these traditional transactional systems.
This new Celonis Kafka Connector, co-developed by the Lenses.io and Celonis engineering teams, opens up new execution management possibilities and improves data integration practices for streaming data into the EMS:
Better data integration architecture: Use Kafka to avoid point-to-point connections between EMS and other systems.
All in real-time: Collect data from transactional systems such as SAP or Salesforce in real-time, as a stream rather than as batches, so that action can be taken before it’s too late.
New data sources: Connect completely new forms of data from the custom microservices that represent your digital business, for example streaming data from IoT devices or 3rd party feeds such as weather, traffic and market data.
The connector works on the Kafka Connect framework, in the same way our dozens of Lenses Kafka connectors do.
The connector maximizes the value your organization generates from data by making that data execute for you.
This data can support business execution in a number of ways.
Real-time contextual data: such as traffic information from a 3rd party source. Example: with a live 3rd party traffic feed, you learn that supplies for an order are impacted and the order is at risk of late delivery; corrective action can then be taken by splitting the order and shipping the products that are in stock.
Real-time transactional data: such as data from Salesforce or SAP, but now available as a real-time stream rather than batch.
Real-time data from event-driven applications: If your organization is already using Kafka to build event-driven applications, integrate this data into EMS to understand new business processes that span your digital services.
For example, as an airline, collect sensor data from catering, refuelling, waste disposal and other services to optimize the processes that lead to shorter turnaround times between flights.
As excited about these use cases as we are? Interested in learning more and trying out the connector? Request access to the connector.
Documentation can be found on GitHub, or you can point your process and execution management colleagues to the Celonis Kafka Connector page to understand the connector's business value.
The connector is free.
No, it’s not essential. The connector works just like a standard Kafka Connect connector. However, deploying and managing the connector and its data through Lenses will significantly reduce the complexity of building data integration flows.
Obtain the relevant endpoint for your realm and the API authorization key from your Celonis EMS account.
You can spin up multiple instances of the Connector (sink). Each instance can read from one or more Kafka topics, and publish to a given target table in EMS. To send data to more than one target table, you will need one connector per target.
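As a rough illustration, a sink instance reading several topics into a single EMS table might be configured along these lines. The property names prefixed with `connect.ems.` are assumptions for illustration, not confirmed configuration keys; consult the GitHub documentation for the exact names.

```properties
name=ems-sink-orders
connector.class=com.celonis.kafka.connect.ems.sink.EmsSinkConnector
tasks.max=2
# One instance can read from one or more Kafka topics...
topics=orders,order-updates
# ...but publishes to a single target table in EMS
connect.ems.endpoint=https://<your-realm>.celonis.cloud/continuous-batch-processing/api/v1/<pool-id>/items
connect.ems.authorization.key=AppKey <your-key>
connect.ems.target.table=ORDERS
```

To feed a second target table, you would deploy a second connector instance with its own topic list and target table setting.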
The EMS Continuous Data API expects Parquet files. To reduce network use, the connector accumulates records locally in a file per topic-partition. Based on the configured policies, the files are then pushed to the API, and an EMS job updates the data available to PQL.
If the table does not exist in EMS, it is created on the EMS side when the first file is processed, using the Parquet file's schema. If the connector is pushing to an existing table, you need to make sure the schemas match; in this version, the connector performs no schema validation.
There are three criteria that trigger a data push:
Parquet file size
Number of records in the file
Time since the last write
Each time a record is written for a topic-partition, the connector checks whether any of the criteria is met and, if so, uploads the file.
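The flush decision described above can be sketched as follows. This is a minimal illustration of the three criteria, not the connector's actual implementation; the class and threshold names are hypothetical.

```python
import time


class FlushPolicy:
    """Hypothetical sketch: decide when an accumulated Parquet file
    should be pushed to the EMS Continuous Data API."""

    def __init__(self, max_bytes, max_records, max_interval_s):
        self.max_bytes = max_bytes          # criterion 1: file size
        self.max_records = max_records      # criterion 2: record count
        self.max_interval_s = max_interval_s  # criterion 3: time since last write

    def should_flush(self, file_bytes, record_count, last_write_ts, now=None):
        # Any single criterion being met triggers the upload.
        now = time.time() if now is None else now
        return (
            file_bytes >= self.max_bytes
            or record_count >= self.max_records
            or (now - last_write_ts) >= self.max_interval_s
        )
```

The check runs after every record write for a topic-partition, so the first criterion to be satisfied wins.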
Kafka Connect includes error handling options and the ability to route messages that cannot be processed to a dead letter queue. The connector also supports error policies while inserting data and logs errors accordingly.
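For example, the standard Kafka Connect error-handling properties can be added to the sink configuration to tolerate bad records and route them to a dead letter queue (the topic name here is illustrative):

```properties
# Keep processing when a record fails conversion or transformation
errors.tolerance=all
# Route failed records to a dead letter queue topic
errors.deadletterqueue.topic.name=ems-sink-dlq
errors.deadletterqueue.context.headers.enable=true
# Also log the errors for troubleshooting
errors.log.enable=true
```

With `errors.tolerance=none` (the default), the task would instead fail on the first unprocessable message.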
Converters are required for a Kafka Connect deployment to support a particular data format when writing to or reading from Kafka. Connectors use converters to change data from bytes to Connect's internal data format and vice versa.
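For instance, a deployment reading string keys and JSON values with embedded schemas could configure the sink with the converters that ship with Apache Kafka:

```properties
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Expect each JSON message to carry its schema alongside the payload
value.converter.schemas.enable=true
```

Avro with a schema registry is another common choice; the converter classes depend on what is installed in your Connect cluster.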
The connector can be configured with Single Message Transformations (SMTs) to make lightweight modifications to individual messages as they flow through Kafka Connect. SMTs let users change the shape of the data, for example by adding or removing fields, or by moving them between the key and the value.
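As an illustration, the built-in `InsertField` SMT can stamp every record with a static field before it reaches EMS (the field name and value here are made up for the example):

```properties
transforms=addSource
transforms.addSource.type=org.apache.kafka.connect.transforms.InsertField$Value
# Hypothetical example: tag each record with its originating system
transforms.addSource.static.field=source_system
transforms.addSource.static.value=webshop
```

Apache Kafka ships several other SMTs (`ReplaceField`, `MaskField`, `ExtractField`, and so on) that can be chained in the `transforms` list.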
When a topic's schema evolves, the connector rolls the accumulated file over and starts a new file with the new schema, which is then uploaded to EMS. The connector will not evolve the schema within EMS.
Get started by requesting the Kafka to Celonis connector.