
The developer experience gap in Kafka data replication

By Guillaume Aymé, September 4, 2025
In this article:
  • 01. The data replication bottleneck
  • 02. Why now? Current solutions fall short
  • 03. The developer experience gap
  • 04. Rethinking data replication architecture
  • 05. The path forward

Read this blog to understand:

  • Why MirrorMaker2 and vendor-specific solutions for data replication create platform team bottlenecks
  • How Kubernetes-native data replication removes these friction points
  • What self-service data distribution looks like in practice


The data replication bottleneck

When infrastructure limits innovation

An engineering team needs to test their fraud detection algorithm with production data in staging for 30 minutes. Simple request, right? Not when it requires a three-week project involving overworked platform teams, complex Kafka Connect configurations, and vendor-specific deployment pipelines.

This scenario plays out daily across organizations building real-time applications that need self-service Kafka data replication. 

The streaming industry has lived with a lack of good replication options for too long. Every business building data-driven products needs an efficient supply chain for moving real-time data, yet current solutions create more friction than they solve. 


Why now? Current solutions fall short

Technical debt, vendor lock-in


The Kafka replication landscape offers limited choices, and none of them are particularly appealing for modern engineering teams.

MirrorMaker2 dominates because it's the only vendor-agnostic option, but it's showing its age. Built on the Kafka Connect framework, it inherits all of Connect's widely-acknowledged problems. Key committers have openly admitted they would redevelop it from scratch if they could. The project lacks active maintenance and is missing features that modern workloads require, and evolving it often means upgrading the entire Connect framework – adding complexity instead of reducing it.

The operational reality is harsh: MM2 deployments come with significant compute costs and operational overhead. There's no built-in developer experience, no governance layer, and configuration requires specialized knowledge that most teams don't have.
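
To see why, here is a minimal MirrorMaker2 properties file for a single source-to-target flow. The hostnames are placeholders, and real deployments layer many more tuning, security, and heartbeat settings on top:

  # Minimal MM2 flow: mirror every topic from source to target
  clusters = source, target
  source.bootstrap.servers = source-kafka:9092
  target.bootstrap.servers = target-kafka:9092

  # Enable the flow and choose what to mirror
  source->target.enabled = true
  source->target.topics = .*

  # Replication factors for the internal topics MM2 creates
  replication.factor = 3
  checkpoints.topic.replication.factor = 3
  heartbeats.topic.replication.factor = 3
  offset-syncs.topic.replication.factor = 3

Even this "hello world" hides the Connect cluster sizing, offset-sync topics, and access-control questions that follow it into production.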

Confluent's alternatives – Replicator and Cluster Linking – solve some technical problems but create strategic ones. Cluster Linking only works for exact replication between Confluent clusters, and it's designed for disaster recovery, not data sharing. You can't filter, transform, or route subsets of data. Most critically, it locks you into Confluent.

Meanwhile, cloud-specific solutions like AWS MSK Replicator offer managed convenience but zero flexibility. Your target MSK cluster must be in the same AWS account, transformation capabilities don't exist, and you're locked into a single cloud provider's ecosystem.


The developer experience gap

Platform teams are unintentionally gatekeeping


Current replication solutions turn platform teams into bottlenecks instead of enablers. Historically, replication has been an infrastructure responsibility: it required tight control and specialized skills. But the teams that need replication most are product and application developers.

Platform engineers spend time maintaining complex replication configurations instead of building capabilities that accelerate development teams. Every replication request becomes a project requiring specialized resources and lengthy approval processes.

Self-service becomes impossible when tools require deep streaming infrastructure expertise. Teams that should be iterating on algorithms and features instead wait for platform team availability. 


Rethinking data replication architecture

Kubernetes-native and vendor-agnostic


Modern Kafka replication needs to match how modern applications are built and deployed. This means Kubernetes-native design, vendor neutrality, and developer-friendly interfaces.

Kubernetes-native architecture eliminates dependency on Kafka Connect and its associated complexity. Applications can scale naturally with cluster resources, integrate with existing CI/CD pipelines, and leverage standard Kubernetes security and monitoring patterns. The compute cost reduction compared to Connect-based solutions is substantial – particularly important given that replication workloads can consume significant infrastructure resources.
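
As a sketch of what Kubernetes-native replication could look like – the resource kind and field names below are illustrative, not any specific product's API – a replication flow becomes a declarative object that lives in Git and deploys like any other workload:

  # Hypothetical custom resource; field names are illustrative only
  apiVersion: replication.example.io/v1alpha1
  kind: KafkaReplication
  metadata:
    name: orders-to-staging
  spec:
    source:
      bootstrapServers: confluent-prod:9092   # e.g. a Confluent cluster
    target:
      bootstrapServers: msk-staging:9092      # e.g. an AWS MSK cluster
    topics:
      include: ["orders.*"]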

Your core IT might run Confluent while your manufacturing teams use AWS MSK and your data science group runs Strimzi, for example. Infrastructure should enable this flexibility rather than constrain it through vendor lock-in.

Schema Registry handling presents a unique challenge when copying data between clusters that use different registry vendors (e.g. a Confluent source and an AWS Glue Schema Registry target): the replicator has to translate schema references between registries as it copies the data.

Smart data routing capabilities – topic renaming, merging, splitting – should be configuration, not custom development. Data filtering and masking become essential for compliance when sharing across environments or organizations. These should be declarative capabilities, not complex programming exercises.
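
Continuing the hypothetical resource above, declarative routing and compliance rules might look like this (again, an illustrative schema, not a real product's API):

  # Illustrative routing, filtering, and masking rules
  routing:
    - match: "orders.eu.*"
      renameTo: "staging.orders.eu"   # topic renaming as configuration
  filter:
    where: "value.amount > 100"       # replicate only a matching subset
  masking:
    - field: "value.card_number"      # mask PII before it leaves production
      strategy: redact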


The path forward

Self-service with enterprise governance


From our standpoint, the solution combines enterprise-grade capabilities with consumer-grade ease of use. 

  • Declarative configuration – YAML resources enable GitOps workflows that platform teams understand and trust; changes flow through standard CI/CD processes (see the sketch after this list).
  • Cost control – auto-scaling based on data volume keeps compute costs predictable and efficient.
  • Control plane & governance – developers get self-service deployment and control, backed by RBAC and auditing.
  • Observability – visibility into replication status, performance metrics, and resource utilization, all integrating with existing observability stacks.
  • Advanced features – data masking during replication and dynamic data routing.
  • Exactly-once delivery semantics – most importantly, data integrity is ensured without requiring teams to understand the complexities of distributed-systems guarantees.
  • Consumer offset translation – seamless failover for disaster recovery use cases.
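
In a GitOps workflow, a change to one of these resources then ships through the same pipeline as any other manifest. A generic CI step (GitHub Actions syntax, with a hypothetical repository path) might be as simple as:

  # Illustrative CI step: replication config is just another manifest
  - name: Deploy replication resources
    run: kubectl apply -f k8s/replication/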


This approach transforms replication from a specialized infrastructure service into a standard developer capability. Teams provision replication resources as easily as they provision compute resources, with the same self-service interfaces and operational patterns they already understand.

The goal: infrastructure that enables rather than constrains data-driven development, with enterprise governance that provides safety without sacrificing velocity.

Next up:

  • Guide | The Kafka replicator comparison guide
  • Tutorial | Set up the Lenses Kafka to Kafka replicator in alpha
  • Guide | How to migrate from AWS MSK to Express Brokers
