The Hidden Complexity of MirrorMaker 2

By Drew Oetzel

Oct 09, 2025

Disaster recovery with Kafka. We plan for it. We set up MirrorMaker 2 to replicate our data for it. It’s always on our minds, and we sometimes even remember to test it. Mostly, though, it just happens to us on some terrible day. The scenario is always the same: the primary cluster goes down, you switch traffic to your secondary, and your consumers either start reprocessing messages from three days ago or skip ahead and miss thousands of messages entirely. Everyone stares at the monitoring dashboards trying to figure out what went wrong.

The problem is almost always offsets. More specifically, it's the difference between offset replication and offset translation, two concepts that sound similar but work very differently under the hood.

If you're setting up MirrorMaker 2 for disaster recovery, understanding this distinction is clutch. Get it wrong, and your failover process will be painful. Get it right, and you'll save yourself hours of troubleshooting and potential data issues.

The Two Concepts: Replication vs Translation

Let's start with the basic difference.

Offset replication is simple. MM2 copies consumer group offsets from the source cluster to the target cluster. It reads the __consumer_offsets topic on the source and writes equivalent records to __consumer_offsets on the target. When you enable this:
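In a dedicated-mode mm2.properties file, the relevant settings look roughly like this; treat it as a sketch, and note that the cluster aliases (primary, secondary) and the 30-second interval are simply the values used in this article's examples:

    # Copy consumer group offsets from primary to secondary every 30 seconds
    primary->secondary.sync.group.offsets.enabled = true
    primary->secondary.sync.group.offsets.interval.seconds = 30

    # Emit the checkpoints that make offset translation possible
    primary->secondary.emit.checkpoints.enabled = true
    primary->secondary.emit.checkpoints.interval.seconds = 30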

You're telling MM2: "Every 30 seconds, copy the consumer group offsets from source to target." Simple enough.

Offset translation is more sophisticated. It's the process of converting an offset from the source cluster into the correct equivalent offset on the target cluster, accounting for replication lag and the fact that the offset numbers might not match up between clusters.

The thing you have to understand is this: replication isn't instantaneous. At any given moment, your target cluster is slightly behind your source cluster. The messages exist on both clusters, but they might have different offset numbers.

Why Offset Numbers Differ

Here's an example of the problem.

Source cluster - topic orders, partition 0 at 10:00 AM:

  • Offset 1000: message A (timestamp: 09:00)

  • Offset 1001: message B (timestamp: 09:15)

  • Offset 1002: message C (timestamp: 09:30)

  • Offset 1003: message D (timestamp: 09:45)

  • Offset 1004: message E (timestamp: 10:00)

  • Your consumer has processed through offset 1004

Target cluster - topic orders, partition 0 at 10:00 AM (with replication lag):

  • Offset 1000: message A (timestamp: 09:00)

  • Offset 1001: message B (timestamp: 09:15)

  • Offset 1002: message C (timestamp: 09:30)

  • Offset 1003: message D (timestamp: 09:45)

  • Message E hasn't arrived yet due to replication lag

Your consumer on the source is at offset 1004 (it has processed message E). If MM2 just copies that offset number to the target without any adjustment, the consumer would try to read from offset 1004 on the target... which doesn't exist yet.

Even worse, when offset 1004 eventually appears on the target, it might not be message E. If there was any turbulence in the replication process—retries, network issues, message reordering—the offset numbers could be skewed.

This is where translation comes in. MM2 looks at its checkpoint data and determines: "The most recent message the consumer has processed that also exists on the target is message D, and message D sits at offset 1003 on the target. So the translated offset should be 1003, not 1004."

How MM2 Handles Translation

MM2 maintains internal checkpoint topics that track the mapping between source and target offsets. These checkpoints are created by the MirrorCheckpointConnector and record information like:
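Conceptually, a single checkpoint carries something like the following (the field names mirror MM2's Checkpoint record; the consumer group name is a made-up example, and the numbers reuse this article's figures):

    consumerGroupId:   orders-processor   # hypothetical consumer group
    topicPartition:    primary.orders-0   # the replicated topic as named on the target
    upstreamOffset:    1000               # the group's committed offset on the source cluster
    downstreamOffset:  998                # the equivalent offset on the target cluster
    metadata:          ""                 # any commit metadata carried along
    (record timestamp: 10:30 AM)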

This checkpoint says: "At 10:30 AM, source offset 1000 in the orders topic corresponded to target offset 998 in the primary.orders topic."

When a consumer fails over to the target cluster, MM2 uses these checkpoints to translate offsets:

  1. Consumer was at offset 1000 on source

  2. Look up checkpoint: source offset 1000 maps to target offset 998

  3. Set consumer to offset 998 on target

  4. Consumer continues from the correct logical position (see the client-side sketch just below)
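Kafka ships a small client utility that performs this lookup against the checkpoint topics. A minimal sketch, assuming the source cluster alias is primary and a hypothetical orders-processor group; in practice you would feed the translated offsets into Admin.alterConsumerGroupOffsets (or consumer seeks) before restarting consumers on the secondary:

    import java.time.Duration;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.connect.mirror.RemoteClusterUtils;

    public class TranslateOnFailover {
        public static void main(String[] args) throws Exception {
            // Connection properties for the *target* (secondary) cluster
            Map<String, Object> targetProps =
                Map.of("bootstrap.servers", "secondary-kafka:9092");

            // Ask MM2's checkpoints where this group should resume on the target
            Map<TopicPartition, OffsetAndMetadata> translated =
                RemoteClusterUtils.translateOffsets(
                    targetProps,
                    "primary",            // source cluster alias (assumption)
                    "orders-processor",   // hypothetical consumer group
                    Duration.ofSeconds(30));

            translated.forEach((tp, om) ->
                System.out.printf("%s -> resume at offset %d%n", tp, om.offset()));
        }
    }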

The checkpoint topics are named using the replication policy. With DefaultReplicationPolicy, they're named like primary.checkpoints.internal. The cluster prefix is crucial—it tells MM2 which replication flow these checkpoints belong to.

The IdentityReplicationPolicy Problem

Here's where things get tricky for disaster recovery setups. Most DR scenarios want topic names to stay identical between clusters. You don't want to reconfigure every application to consume from primary.orders instead of orders. So you use IdentityReplicationPolicy:
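In the MM2 properties this is a one-line change, using the IdentityReplicationPolicy class that ships with recent Kafka versions:

    # Replicated topics keep their original names (no "primary." prefix)
    replication.policy.class = org.apache.kafka.connect.mirror.IdentityReplicationPolicy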

This keeps topic names identical. But it breaks offset translation. Here's why:

With DefaultReplicationPolicy:

  • Source topic: orders

  • Target topic: primary.orders (prefixed)

  • Checkpoint topic: primary.checkpoints.internal

  • Everything is clearly labeled—MM2 knows which cluster and flow these checkpoints belong to

With IdentityReplicationPolicy:

  • Source topic: orders

  • Target topic: orders (same name)

  • Checkpoint topic: checkpoints.internal (no prefix)

  • Ambiguity: which cluster do these checkpoints describe?

If you have bidirectional replication (or might add it later), the problem gets worse. Are the checkpoints in checkpoints.internal for primary→secondary replication or secondary→primary replication? MM2 can't tell because there's no prefix to distinguish them.

The checkpoint system gets confused, and offset translation stops working reliably.

What Actually Happens Without Translation

When you use IdentityReplicationPolicy with offset sync enabled, you get offset replication but not offset translation. Here's what that looks like:

At 10:00 AM:

  • Source: Consumer group is at offset 5000 for orders partition 0

  • Target: Replication has reached offset 4950 (50 messages behind due to lag)

  • MM2 replicates the offset as-is: "Consumer group is at offset 5000"

Note that MM2 writes the exact same offset number (5000) to the target. It doesn't translate it to 4950 (where the target actually is).

At 10:02 AM - primary cluster fails:

  • Consumer fails over to secondary cluster

  • Looks up its consumer group offset: 5000

  • Tries to read from offset 5000

  • But secondary only has messages up to offset 4950

  • Consumer behavior depends on your configuration:

    • It might wait for offset 5000 to become available

    • It might skip to the end (if auto.offset.reset=latest)

    • It might throw an error

Once the secondary's log grows past offset 5000, whether from a late-arriving replication backlog or from new messages produced to the secondary after failover, your consumer will continue from there. But whatever landed at offsets 4951-4999 on the secondary gets skipped. Unless those slots hold exact replicas of messages the consumer already processed on the primary, they are simply lost to that consumer group: its committed offset says it's already past them, so it never sees them after failover.

Alternatively, if your consumer has auto.offset.reset=earliest configured and the offset is out of range, it might restart from the beginning of the topic, reprocessing everything.

Practical Solutions for DR Scenarios

Let's talk about how to actually handle this in production. You have several options, each with tradeoffs.

Option 1: Accept Offset Replication Without Translation

For most disaster recovery scenarios, this is the pragmatic choice:
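A sketch of what that looks like in dedicated-mode mm2.properties (aliases and values are illustrative):

    # Identical topic names on both clusters
    clusters = primary, secondary
    primary->secondary.enabled = true
    replication.policy.class = org.apache.kafka.connect.mirror.IdentityReplicationPolicy

    # Replicate consumer group offsets as-is; translation will only be approximate
    primary->secondary.sync.group.offsets.enabled = true
    primary->secondary.sync.group.offsets.interval.seconds = 30
    primary->secondary.emit.checkpoints.enabled = true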

This works acceptably if:

  • Your consumers are idempotent (can safely reprocess messages)

  • Some message loss or duplication during failover is acceptable

  • You're doing disaster recovery, not active-active replication

  • Failover is a rare event, not a routine operation

The key is setting expectations correctly. You will likely lose or reprocess some messages during failover. The amount depends on your replication lag and how frequently you sync offsets. With sync.group.offsets.interval.seconds = 30, you're looking at potentially 30 seconds' worth of messages in limbo.

For true disaster recovery—where the primary datacenter is on fire—this tradeoff is usually acceptable. You're recovering from a catastrophic failure. Perfect continuity isn't realistic.

Option 2: Custom Replication Policy

If you need proper offset translation but want to keep topic names identical, you can write a custom replication policy that treats application topics and internal topics differently:
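Here is a minimal sketch of that idea, extending the built-in DefaultReplicationPolicy so that only MM2's internal topics keep the cluster prefix. The class and package names are made up, and a production version would need the remaining ReplicationPolicy methods reviewed and tested against your MM2 version:

    package com.example.mirror;  // hypothetical package

    import org.apache.kafka.connect.mirror.DefaultReplicationPolicy;

    /**
     * Sketch: application topics keep their original names across clusters,
     * while MM2's own internal topics keep the default cluster-prefixed
     * naming so checkpoints remain unambiguous.
     */
    public class PrefixInternalOnlyReplicationPolicy extends DefaultReplicationPolicy {

        @Override
        public String formatRemoteTopic(String sourceClusterAlias, String topic) {
            if (isInternalTopic(topic)) {
                // Internal MM2 topics keep the default prefixed naming
                return super.formatRemoteTopic(sourceClusterAlias, topic);
            }
            // Application topics are replicated under their original name
            return topic;
        }

        @Override
        public String upstreamTopic(String topic) {
            // For unprefixed application topics, the upstream name is unchanged
            return isInternalTopic(topic) ? super.upstreamTopic(topic) : topic;
        }
    }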

Then use it in your configuration:
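The class name below is the hypothetical one from the sketch above; the compiled JAR has to be on every MM2 worker's classpath:

    replication.policy.class = com.example.mirror.PrefixInternalOnlyReplicationPolicy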

This gives you the best of both worlds:

  • Application topics keep their original names

  • Internal MM2 topics get prefixed

  • Offset translation works correctly

  • Consumers don't need reconfiguration during failover

The downside is maintaining custom code. You'll need to compile this, deploy it to your MM2 workers, and keep it updated as MM2 evolves.

Option 3: Use DefaultReplicationPolicy and Handle It in Applications

Instead of fighting MM2's design, embrace it:
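On the MM2 side this is just the default behavior, spelled out here as a sketch:

    # DefaultReplicationPolicy is the default; stated explicitly for clarity
    replication.policy.class = org.apache.kafka.connect.mirror.DefaultReplicationPolicy
    primary->secondary.sync.group.offsets.enabled = true
    primary->secondary.emit.checkpoints.enabled = true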

Then handle the topic naming in your application code:
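One way to do it is to build the topic name from an environment variable. The TOPIC_PREFIX variable, the topic, and the group name below are illustrative assumptions rather than an established convention:

    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class OrdersConsumer {
        public static void main(String[] args) {
            // "" while consuming from the primary cluster,
            // "primary." after failing over to the secondary
            String prefix = System.getenv().getOrDefault("TOPIC_PREFIX", "");

            Properties props = new Properties();
            props.put("bootstrap.servers",
                    System.getenv().getOrDefault("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092"));
            props.put("group.id", "orders-processor");
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of(prefix + "orders"));
                // ... normal poll loop ...
            }
        }
    }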

During failover, you update the environment variable from "" (empty, since topics on the primary cluster carry no prefix) to "primary." (the prefix MM2 gives the replicated topics on the secondary).

This is architecturally cleaner because you're not fighting MM2's intended design. The offset translation works perfectly. But it requires application changes and coordination during failover.

Option 4: Separate MM2 Instances

Run two MM2 deployments—one for topics, one for offsets:
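As a sketch, with file names, aliases, and the empty topics pattern for the second instance all assumptions to verify against your MM2 version:

    # mm2-data.properties (instance A) - application data, identical topic names
    clusters = primary, secondary
    primary->secondary.enabled = true
    primary->secondary.topics = .*
    replication.policy.class = org.apache.kafka.connect.mirror.IdentityReplicationPolicy
    primary->secondary.emit.checkpoints.enabled = false
    primary->secondary.sync.group.offsets.enabled = false

    # mm2-offsets.properties (instance B) - no data topics, only checkpoints and
    # group offsets, keeping the default prefixed naming for translation
    clusters = primary, secondary
    primary->secondary.enabled = true
    primary->secondary.topics =
    replication.policy.class = org.apache.kafka.connect.mirror.DefaultReplicationPolicy
    primary->secondary.emit.checkpoints.enabled = true
    primary->secondary.sync.group.offsets.enabled = true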

This is complex but gives you complete control. Your application topics stay unprefixed, but offset translation works because the second instance uses DefaultReplicationPolicy.

The problem: you're now managing two MM2 deployments, which doubles your operational overhead.

My Recommendation

For most teams setting up disaster recovery with MirrorMaker 2, I recommend Option 1—accepting offset replication without perfect translation. Here's why:

Simplicity matters. No custom code, no dual deployments, no application changes. You configure MM2, it replicates your data, and failover mostly works.

DR isn't meant to be seamless. You're recovering from a major failure. Some message reprocessing or loss is acceptable in that context. If you need zero-downtime failover with perfect continuity, you probably want active-active replication with client-side routing, not active-passive DR.

The offset sync still helps enormously. Even without perfect translation, having recent offsets on your secondary cluster is vastly better than starting from scratch. Your consumers will be roughly in the right place. They might reprocess 30-60 seconds of messages or skip that window, but they won't reprocess three days of data.

Make your consumers idempotent anyway. Even with perfect offset translation, you should design your consumers to handle duplicate messages. Network issues, broker failures, and rebalances can all cause redelivery. If your consumers are already idempotent, the offset translation issue becomes much less critical.

If you absolutely need perfect offset translation - maybe you're doing financial transactions or have strict regulatory requirements - then invest in Option 2 (custom replication policy) or Option 3 (DefaultReplicationPolicy with application-level handling). But for most use cases, the operational simplicity of Option 1 outweighs the edge cases where translation would help.

Monitoring and Testing

Regardless of which approach you choose, test your failover process before you need it:

  1. Set up MM2 in a staging environment

  2. Produce messages to your primary cluster

  3. Consume those messages with a test consumer

  4. Note the consumer's offset position

  5. Simulate a failure by stopping producers and the primary cluster

  6. Fail your consumer over to the secondary cluster

  7. Check: did it resume from the right position? Did it skip messages? Did it reprocess? (The consumer-group commands sketched below help with this check.)
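For steps 4 and 7, Kafka's stock consumer-group tooling is enough. A sketch, with bootstrap addresses and the group name as placeholders:

    # Committed position on the primary, before the simulated failure
    bin/kafka-consumer-groups.sh --bootstrap-server primary-kafka:9092 \
        --describe --group orders-processor

    # Position and lag on the secondary, after failing the consumer over
    bin/kafka-consumer-groups.sh --bootstrap-server secondary-kafka:9092 \
        --describe --group orders-processor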

Run this drill quarterly. Kafka versions change, MM2 gets updates, and configurations drift. The only way to know your failover will work is to actually test it.

Monitor MM2's internal topics as well. If checkpoints.internal or mm2-offset-syncs.secondary.internal stop receiving updates, something is broken. Set up alerts on these topics so you catch issues before failover happens.

And finally, document your choices. When you're setting up MM2, write down why you chose IdentityReplicationPolicy, why you set offset sync to 30 seconds, and what the expected behavior is during failover. Six months from now, someone (probably you) will be debugging a failover issue, and that documentation will be invaluable.

MirrorMaker 2 is a powerful tool, but offset management is one of its more nuanced features. Understanding the difference between replication and translation, and knowing when you need which, will save you from painful surprises during what's already a stressful situation.