
Fat Fingers? Not With Our K2K Config Schema Protector!

By Ivan Majnarić · October 20, 2025
In this article:
  • 01. The 3 AM Incident
  • 02. What is JSON Schema?
  • 03. The Kafka2Kafka Configuration Challenge
  • 04. Building Our Schema Shield
  • 05. The Power of Strict Validation
  • 06. Real-World Impact
  • 07. How to Use JSON Schema in Your Project
  • 08. Conclusion

The 3 AM Incident

Picture this: it's 3 AM, and you're on call in case there is an outage. A team on the other side of the world merged a PR and released a new version of the K2K Replicator, and it crashed.

Consumer group lag is spiking into orbit. You're paged and woken up, and you stumble to your laptop. The team has already reverted the PR and things are stabilising, but now you have to investigate what really happened, because a postmortem is due.

At first glance everything looks normal: the team introduced a new property, max.poll.record. Nothing wrong with that; it shouldn't crash the deployment. Then, half an hour later, your brain finally wakes up and spots it: it shouldn't be max.poll.record but max.poll.records. Well, that's half an hour of sleep and one postmortem down the drain!

Sound familiar?

Configuration errors are the silent killers of distributed systems. They slip through code reviews, bypass unit tests, and wait patiently in production to ruin your day. But there's a hero in this story: JSON Schema.

What is JSON Schema?

JSON Schema is like a bouncer at an exclusive club. It checks your configuration at the door and rejects anything that doesn't belong. Think of it as a contract that defines:

  • What properties are required (e.g. did you forget the bootstrap servers?)

  • What values are valid (acks can be 0, 1, or "all", not "banana")

  • What types are expected (integers vs strings)

  • What properties even exist (No, max.poll.record is not a thing)

In essence, JSON Schema validates your configuration before it reaches your application, catching errors at design time instead of runtime.
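To make those four checks concrete, here is a deliberately tiny, hand-rolled sketch in Python. This is not a real JSON Schema validator (libraries such as jsonschema implement the full spec), and the property rules below are a made-up miniature, but it shows the idea of rejecting a config before it ever reaches the application:

```python
# Toy illustration of the four checks JSON Schema performs.
# NOT a real validator -- the property rules here are a tiny, invented sample.

ALLOWED = {
    "bootstrap.servers": {"type": str, "required": True},
    "acks": {"type": str, "enum": {"0", "1", "all"}},
    "max.poll.records": {"type": int},
}

def validate(config: dict) -> list:
    errors = []
    # Check 1: unknown properties (catches typos like max.poll.record)
    for key in config:
        if key not in ALLOWED:
            errors.append(f"unknown property: {key}")
    # Check 2: required properties
    for key, rule in ALLOWED.items():
        if rule.get("required") and key not in config:
            errors.append(f"missing required property: {key}")
    # Checks 3 and 4: expected types and valid enum values
    for key, value in config.items():
        rule = ALLOWED.get(key)
        if rule is None:
            continue
        if not isinstance(value, rule["type"]):
            errors.append(f"{key}: expected {rule['type'].__name__}")
        elif "enum" in rule and value not in rule["enum"]:
            errors.append(f"{key}: must be one of {sorted(rule['enum'])}")
    return errors

print(validate({"acks": "banana", "max.poll.record": 50}))
```

Running this flags the typo, the missing bootstrap servers, and the "banana" acks value, all without starting the application.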

If you're interested in more, you can read the start of our small series about application configuration with JSON Schemas.

The Kafka2Kafka Configuration Challenge

We want to give our users as fluid an experience as possible. One way is through a JSON Schema, which offers auto-completion, suggestions, and validation checks when you create K2K pipelines in your favourite IDE.

In the latest blog I talked about how to migrate AWS MSK to Express Brokers with the Lenses K2K Replicator. To expand on that story, we will re-use some of the examples from there.

Let's look at a real-world example: configuring a Kafka2Kafka (K2K) replication pipeline. 

Here's a snippet from our k2k.yaml:

name: "pipeline-k2k-demo"
license:
  acceptEula: true
  token: "----"
features:
  autoCreateTopics: enabled
  autoCreateControlTopics: enabled
source:
  kafka:
    common:
      bootstrap.servers: "boot-sg0feocb.c3.kafka-serverless.eu-west-1.amazonaws.com:9098"
      sasl.mechanism: "AWS_MSK_IAM"
      security.protocol: "SASL_SSL"
      sasl.jaas.config: "software.amazon.msk.auth.iam.IAMLoginModule required;"
      sasl.client.callback.handler.class: "software.amazon.msk.auth.iam.IAMClientCallbackHandler"
    consumer:
      group.id: k2k.eot
      max.poll.records: 50000
target:
  kafka:
    common:
      "bootstrap.servers": "kafka:9098"
    producer:
      buffer.memory: 124288000
      batch.size: 500000
      linger.ms: 0
      "acks": "1"

This configuration has over 30 configurable properties across multiple sections. Each section (source/target, consumer/producer) accepts different Kafka configuration properties. The producer alone has 88+ valid configuration options.

How do you ensure you're not mixing consumer properties in the producer section? Or accidentally using a property that doesn't exist?

Enter JSON Schema.

Building Our Schema Shield

Our JSON Schemas act as a comprehensive validation layer. You can download them from our Git repo.

Now, let's see how it protects us from common mistakes.

Example 1: Catching Typos


Bad Configuration:

source:
  kafka:
    consumer:
      "group.id": "k2k.eot"
      "max.poll.record": "50000"  # Missing 's' at the end!

Schema Validation Result:

[Screenshot: schema validation error flagging max.poll.record as an unknown property]

The schema knows every valid consumer property. It has exactly 98 consumer configuration properties defined, and max.poll.record isn't one of them. The correct property is max.poll.records.
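The schema itself only reports that the property is unknown; a nice extra that tooling can layer on top is a "did you mean?" hint. A sketch using Python's standard difflib module (the property list here is an abbreviated sample, not the full 98):

```python
import difflib

# Abbreviated sample of valid consumer properties (the real schema has 98).
VALID_CONSUMER_PROPS = [
    "group.id", "max.poll.records", "max.poll.interval.ms",
    "auto.offset.reset", "fetch.max.bytes", "session.timeout.ms",
]

def suggest(prop):
    """Return the closest valid property name, or None if nothing is close."""
    matches = difflib.get_close_matches(prop, VALID_CONSUMER_PROPS, n=1, cutoff=0.8)
    return matches[0] if matches else None

print(suggest("max.poll.record"))  # -> max.poll.records
```

A one-character typo scores very close to the real property name, so the hint is nearly always the right one.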

Example 2: Wrong Property in Wrong Place 

Bad Configuration:

target:
  kafka:
    producer:
      "buffer.memory": "124288000"
      "group.id": "my-group"  # This is a CONSUMER property!

Schema Validation Result:

[Screenshot: schema validation error flagging group.id inside the producer section]

Our schema has "additionalProperties" set to false for both consumer and producer sections. This means no properties can be set beyond:

  • Producer section: The 88 specific allowed producer properties

  • Consumer section: The 98 specific allowed consumer properties

No mixing. No confusion.
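In schema terms, the consumer and producer sections look roughly like this (property lists abbreviated for illustration; the real schema enumerates all of them):

```json
{
  "consumer": {
    "type": "object",
    "properties": {
      "group.id": { "type": "string" },
      "max.poll.records": { "type": ["string", "integer"] }
    },
    "additionalProperties": false
  },
  "producer": {
    "type": "object",
    "properties": {
      "buffer.memory": { "type": ["string", "integer"] },
      "acks": { "type": "string", "enum": ["0", "1", "all", "-1"] }
    },
    "additionalProperties": false
  }
}
```

Because each section has its own whitelist and "additionalProperties": false, a consumer-only key like group.id simply has nowhere to hide in the producer object.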

Example 3: Invalid Enum Values 

Bad Configuration:

source:
  kafka:
    consumer:
      "auto.offset.reset": "beginning"  # Wrong value!

Schema Validation Result:

[Screenshot: schema validation error rejecting "beginning" for auto.offset.reset]

The schema knows that auto.offset.reset only accepts three specific values: "earliest", "latest", and "none". "beginning" sounds right, but it's not in the Kafka spec; the correct value is "earliest".

Example 4: Missing Required Fields 

Bad Configuration:

name: "my-pipeline"
source:
  kafka:
    common:
      "bootstrap.servers": "localhost:9092"
    # Missing required 'consumer' section!

Schema Validation Result:

[Screenshot: schema validation error reporting the missing required consumer section]

The schema enforces that certain properties are mandatory:

  • name (pipeline name)

  • source.kafka.common.bootstrap.servers

  • source.kafka.consumer.group.id

  • target.kafka.common.bootstrap.servers

  • replication array with source and sink definitions.
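The relevant part of the schema looks roughly like this (simplified for illustration; each nesting level declares its own required list):

```json
{
  "required": ["name", "source", "target", "replication"],
  "properties": {
    "source": {
      "type": "object",
      "required": ["kafka"],
      "properties": {
        "kafka": {
          "type": "object",
          "required": ["common", "consumer"],
          "properties": {
            "common": { "required": ["bootstrap.servers"] },
            "consumer": { "required": ["group.id"] }
          }
        }
      }
    }
  }
}
```

Because required is enforced at every level, leaving out the whole consumer section fails just as loudly as leaving out a single mandatory key inside it.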

Example 5: Type Mismatches

Bad Configuration:

target:
  kafka:
    producer:
      "batch.size": "five hundred thousand"  # Should be a number!

Schema Validation Result:

[Screenshot: schema validation error rejecting a non-numeric value for batch.size]

Most Kafka properties accept either string or integer representations ("500000" or 500000), but they must be valid numeric values.
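One way to express "string or integer, but numeric either way" in a schema is a type union plus a pattern. In JSON Schema, pattern only applies when the value is a string, so integers pass through untouched:

```json
{
  "batch.size": {
    "type": ["string", "integer"],
    "pattern": "^[0-9]+$"
  }
}
```

With this, 500000 and "500000" both validate, while "five hundred thousand" is rejected.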

Example 6: Perfect Configuration

Good Configuration:

name: "pipeline-k2k-demo"
license:
 acceptEula: true
 token: "test"
features:
 autoCreateTopics: enabled
 autoCreateControlTopics: enabled
 exactlyOnce: disabled
 tracingHeaders:
   partition: disabled
source:
 kafka:
   common:
     bootstrap.servers: "msk-serverless:9098"
     sasl.mechanism: "AWS_MSK_IAM"
     sasl.jaas.config: "software.amazon.msk.auth.iam.IAMLoginModule required awsProfileName=msk-sa;"
     security.protocol: "SASL_SSL"
     sasl.client.callback.handler.class: "software.amazon.msk.auth.iam.IAMClientCallbackHandler"
   consumer:
     group.id: "k2k.eot"
     max.poll.records: 50000
     max.partition.fetch.bytes: 33554432
     fetch.max.bytes: 102428800
     receive.buffer.bytes: 262144
     heartbeat.interval.ms: 6000
     session.timeout.ms: 20000
 registry:
   config:
     schema.registry.url: "http://external-schema:8081"
target:
 kafka:
   common:
     bootstrap.servers: "aws-expressbrokers:9098"
     sasl.mechanism: "AWS_MSK_IAM"
     sasl.jaas.config: "software.amazon.msk.auth.iam.IAMLoginModule required awsProfileName=msk-sa;"
     security.protocol: "SASL_SSL"
     sasl.client.callback.handler.class: "software.amazon.msk.auth.iam.IAMClientCallbackHandler"
   producer:
     buffer.memory: 124288000
     batch.size: 5000
     linger.ms: 0
     acks: "1"
     max.request.size: 2000000
     compression.type: "none"
     retries: 2222222
 registry:
   config:
     "schema.registry.url": "https://external-schema:8081"
topicCreation:
 replication:
   common:
     replication: 3
     config:
       "max.message.bytes": "20000000"
 control:
   common:
     replication: 3
replication:
 - source:
     name: source
     topic:
       - "props"
 - sink:
     name: "target"
     topic:
       prefix: "k2k.eot."
     partition: source

Schema Validation Result:

✅ SUCCESS: Configuration is valid!

Every property is spelled correctly, in the right section, with valid values. No typos. No misplacements. No surprises at 3 AM.

The Power of Strict Validation

Here's what makes our schema powerful:

1. Comprehensive Property Whitelisting

We've explicitly defined every valid property for:

  • 88 producer properties: From acks to transactional.id

  • 98 consumer properties: From group.id to max.poll.records

  • All SSL/TLS properties for secure connections

  • All SASL properties for authentication (Kerberos, OAuth, AWS IAM)

  • All metadata, metrics, and recovery properties

2. Zero Tolerance for Extras

With "additionalProperties": false, the schema rejects:

  • Typos (e.g. max.poll.record)

  • Made-up properties (e.g. super.fast.mode)

  • Properties in the wrong section (e.g. a consumer property in the producer section)

3. Smart Type Validation

"batch.size": {
  "type": ["string", "integer"],
  "description": "Producer will attempt to batch records together"
}

This accepts both "500000" (string) and 500000 (integer), matching Kafka's flexible configuration style.

4. Enum Constraints

"compression.type": {
  "type": "string",
  "enum": ["none", "gzip", "snappy", "lz4", "zstd"]
}

Only valid compression types are accepted. No guessing. No "what compression types does Kafka support again?"

Real-World Impact

Before JSON Schema validation, our team experienced:

  • 2-3 hours average debugging time for configuration errors

  • Configuration bugs making it to production monthly

  • Stress and anxiety during deployments

After implementing strict JSON Schema validation:

  • Configuration errors caught at commit time

  • Zero production incidents from typos or invalid configs

  • Faster onboarding for new team members (schema serves as documentation)

  • IDE autocomplete when using schema-aware editors

How to Use JSON Schema in Your Project

1. Define Your Schema

Create a config-schema.json that describes your configuration structure:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["name", "source", "target"],
  "properties": {
    "name": {
      "type": "string",
      "minLength": 1
    }
  },
  "additionalProperties": false
}

2. Enable IDE Support

Add a reference for your YAML/JSON files. For VSCode (with the YAML extension installed):

1. Create a .vscode folder in your workspace, containing the file .vscode/settings.json

2. Inside settings.json, place:

{
  "yaml.schemas": {
    "./config-schema.json": ["*.yaml", "*.yml"]
  }
}

3. Enjoy VSCode autocomplete!

[Screenshot: VSCode autocompleting K2K properties from the schema]

Conclusion

Configuration errors are preventable. With JSON Schema, you create a safety net that catches typos, validates values, and enforces structure before your application ever runs.

Our Kafka2Kafka pipeline went from "hope and pray" configuration to bulletproof validation with:

  • 186 explicitly defined properties (88 producer + 98 consumer)

  • Zero tolerance for invalid properties

  • Type-safe value validation

  • Required field enforcement

The next time you're tempted to write a configuration file without a schema, remember: it's 3 AM somewhere, and a typo is waiting to wake you up.

Don't let fat fingers ruin your day. Let JSON Schema be your protector.

How can you use your shiny new JSON Schema in your CI/CD pipeline? Stay tuned, that's coming soon!

