Stream Reactor released for Kafka 2.1.0

By Andrew Stevenson, Jan 24, 2019

Stream Reactor, the largest open-source collection of Apache Kafka connectors, was released today with many new features, bug fixes, and new connectors for Apache Hive!

Version 1.2.1 for Kafka Connect 2.1.0

A big thank you goes to everyone in the community who contributed to this release, not only by raising issues, but also by joining the discussion and submitting multiple pull requests (PRs). Let's take a closer look at some of the changes:

About Stream Reactor

Stream Reactor is an open-source collection of components, licensed under the Apache License, Version 2.0, built on top of Kafka. It provides Kafka Connect compatible connectors to move data between Kafka and popular data stores: source connectors publish data into Kafka, and sink connectors bring data from Kafka into other systems. The connectors support KCQL (Kafka Connect Query Language), an open-source component of the Lenses SQL engine that provides an elegant, SQL-like syntax for selecting fields and routing data from sources or topics to Kafka or the target system (topic-to-target entity mapping, field selection, auto-creation, auto-evolution, error policies).
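
To give a flavour of the syntax, a single KCQL statement maps a Kafka topic to a target entity, picks (and optionally renames) fields, and can ask the connector to create the target if it does not exist. A minimal sketch, where the topic, table, and field names are purely illustrative:

    INSERT INTO orders_table
    SELECT id, amount AS total, created_at
    FROM orders-topic
    AUTOCREATE

The same statement shape is used across the sink connectors, whatever the target system.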

We hope you find Stream Reactor useful and want to give it a try! Stream Reactor has over 25 connectors available, tested, and documented, all supporting Kafka 2.1.0. You can give it a go by downloading the Kafka Development Environment, finding the jars on GitHub, or building the code locally and helping us improve and add even more connectors.

New Connectors: Hive source and sink

The new Hive source and sink extend our list of supported connectors, allowing end-to-end data pipelines to be constructed quickly and easily with SQL. Connector support includes:

  • Flush size, interval and time for rolling files in HDFS

  • Partition strategies for optimizing queries, and selection of fields to partition by

  • Schema evolution policies

  • Auto-creation of Hive tables

  • Overwriting options

  • Setting table locations in HDFS

  • Parquet and ORC file format support

These features are integrated into KCQL, our connector SQL, simplifying configuration and keeping config bloat to a minimum.
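
To make that concrete, here is a minimal sketch of a Hive sink configuration with the storage options expressed in KCQL. The connector class, property names, and KCQL keywords below follow the pattern of our connector documentation, but treat the exact spellings as illustrative rather than authoritative; check the docs for the full list:

    name=hive-sink
    connector.class=com.landoop.streamreactor.connect.hive.sink.HiveSinkConnector
    topics=orders
    connect.hive.database.name=default
    connect.hive.metastore=thrift
    connect.hive.metastore.uris=thrift://metastore:9083
    connect.hive.fs.defaultFS=hdfs://namenode:8020
    connect.hive.kcql=INSERT INTO orders SELECT * FROM orders PARTITIONBY region STOREAS PARQUET WITH_FLUSH_INTERVAL = 3600 AUTOCREATE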

HDFS is often used as a persistent store, but sometimes you need to reload that data and replay it into Kafka. The Hive source now makes this possible.
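
In KCQL terms the source is the sink reversed: select from a Hive table and insert into a Kafka topic, so archived data can be replayed. Again a sketch, with illustrative names:

    name=hive-source
    connector.class=com.landoop.streamreactor.connect.hive.source.HiveSourceConnector
    connect.hive.database.name=default
    connect.hive.metastore=thrift
    connect.hive.metastore.uris=thrift://metastore:9083
    connect.hive.fs.defaultFS=hdfs://namenode:8020
    connect.hive.kcql=INSERT INTO orders-replay SELECT * FROM orders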

Fixes and Features

  • Fixed Set support on the Cassandra source connector

  • Support Array type in InfluxDB connector

  • Fixed records being written out of order on insert in the Kudu sink connector

  • Upgrade to Kafka 2.1.0

  • Added support for custom delimiter in composite primary keys on the Redis sink connector

  • New Hive source and sink connector supporting Avro, Parquet and ORC

  • Fixed an NPE for Redis multiple sorted sets

  • Fixed setting MongoDB primary _id field in upsert mode

  • Fixed handling of multiple topics in Redis sorted sets

  • Fixed MongoDB sink exception when PK is compound key

  • Fixed the JMS sink not working when a password is set (wrong context)

  • Fixed handling of multiple primary keys for sorted sets

  • Fixed Kudu sink autocreate adding unnecessary partition

  • Fixed Avro fields with default values preventing table creation in Kudu

  • Fixed the Kudu connector not auto-creating a table from a sink record

  • Fixed JMS sink session rollback exception if session is closed

Learn more

Follow us on Twitter @lensesio. For tech talk, join our Slack channel to talk with the community. If you find problems or have ideas on how we can improve Stream Reactor, please let us know or log an issue.