Stream Reactor, the largest open-source collection of Apache Kafka connectors, has a new release today with many new features, bug fixes and new connectors for Apache Hive!
Version 1.2.1 for Kafka Connect 2.1.0
A big thank you goes to the community who contributed to this release, not only by raising issues but also by engaging in discussion and submitting multiple pull requests (PRs). Let's take a closer look at some of the changes:
Stream Reactor is an Apache License, Version 2.0 open-source collection of components built on top of Kafka that provides Kafka Connect compatible connectors to move data between Kafka and popular data stores. Stream Reactor provides source connectors to publish data into Kafka and sink connectors to bring data from Kafka into other systems. The connectors support KCQL (Kafka Connect Query Language), an open-source component of the Lenses SQL Engine that provides an elegant, simple SQL-like syntax for selecting fields and routing from sources or topics to Kafka or the target system (topic to target entity mapping, field selection, auto creation, auto evolution, error policies).
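For example, a single KCQL statement can map a topic to a target table and select only the fields you need. The statement below is a minimal, hypothetical sketch (the topic, table and field names are made up for illustration):

    INSERT INTO orders_table
    SELECT id, customer, amount
    FROM orders-topic
    AUTOCREATE

Here AUTOCREATE asks the connector to create the target entity if it does not already exist; each connector documents the exact set of keywords it supports.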
We hope you find Stream Reactor useful and want to give it a try! Stream Reactor has over 25 connectors available, tested and documented, supporting Kafka 2.1.0. You can give it a go by downloading the Kafka Development Environment, finding the jars on GitHub, or building the code locally and helping us improve and add even more connectors.
The new Hive source and sink connectors build further on our list of supported connectors, allowing data pipelines to be constructed quickly and easily, end to end, with SQL. Connector support includes:
Flush size, interval and time for rolling files in HDFS
Partition strategies for optimizing queries and selection of fields to partition by
Schema evolution policies
Auto-creation of Hive tables
Overwriting options
Setting table locations in HDFS
Parquet and ORC file format support
These features are integrated into KCQL, our connector SQL, simplifying configuration and keeping config bloat to a minimum.
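As a sketch of how the Hive sink options above surface in KCQL, a single statement might combine storage format, partitioning, auto-creation and flush settings (the table, topic and field names here are hypothetical, and the exact keywords can vary between connector versions, so check the connector documentation):

    INSERT INTO hive_orders
    SELECT * FROM orders
    PARTITIONBY customer_id
    STOREAS PARQUET
    AUTOCREATE
    WITH_FLUSH_SIZE = 1000
    WITH_FLUSH_INTERVAL = 30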
HDFS is often used as a persistent store, but sometimes you need to replay data back into Kafka. The Hive source now makes this possible.
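A minimal Hive source configuration to replay a table back into a topic could look like the sketch below. The class name and property keys follow Stream Reactor's usual naming conventions but are assumptions here; consult the connector documentation for the exact settings:

    # Hypothetical sketch - class name and property keys are assumptions
    name=hive-source
    connector.class=com.landoop.streamreactor.connect.hive.source.HiveSourceConnector
    tasks.max=1
    # KCQL: read the Hive table back out and publish it to a Kafka topic
    connect.hive.kcql=INSERT INTO orders-replay SELECT * FROM hive_orders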
Here is the full list of changes:
Fixed Set support on the Cassandra source connector
Added support for the Array type in the InfluxDB connector
Fixed records being written out of order on insert in the Kudu sink connector
Upgraded to Kafka 2.1.0
Added support for a custom delimiter in composite primary keys on the Redis sink connector
New Hive source and sink connectors supporting Avro, Parquet and ORC
Fixed an NPE with multiple Redis sorted sets
Fixed setting the MongoDB primary _id field in upsert mode
Fixed handling of multiple topics in Redis sorted sets
Fixed a MongoDB sink exception when the primary key is a compound key
Fixed the JMS sink not working with a password (wrong context)
Fixed handling of multiple primary keys for sorted sets
Fixed the Kudu sink auto-create adding an unnecessary partition
Fixed an Avro field with a default value not creating a table in Kudu
Fixed the Kudu connector failing to auto-create a table from a sink record
Fixed a JMS sink session rollback exception when the session is closed
Follow us on Twitter @lensesio. For tech talk, join our Slack community channel. If you find problems or have ideas on how we can improve Stream Reactor, please let us know or log an issue.