Picking up the pieces of your monolith breakdown

A data observability must-have as you migrate to microservices.

Evan Rosenstein

Feb 26, 2021

A decade ago, all developers could talk about was breaking down the monolith and moving to event-driven architectures. This was especially true in the financial services industry, where teams wanted to become more nimble and accelerate application delivery. They leveraged messaging systems to decouple their applications, and Apache Kafka in particular has transitioned from being a data integration technology to the leading messaging system for microservices.

Naturally as time passed, contrarians arose to pop the bubble of enthusiasm for dissecting monoliths; similarly, critics bashed Led Zeppelin once they had become the world’s greatest rock band.

These critics argued that breaking up the monolith was not worth the extra complexity and operating cost. After all, if you can't manage a monolith, what makes you think microservices will solve your woes?

Monolith to microservices: dazed & confused?

Who needs to migrate to microservices? This is a bigger discussion, but in short, you need to reach a critical mass of complexity before you start splitting and delegating tasks.

If you need to scale or plan to scale, having different product teams working off a single monolith gets messy. You form teams when the work becomes too much for one person, and a monolith blocks those teams from functioning independently.

Migrating from a monolith to an event-driven architecture is an epic journey. In this article we skip past some of the big, foundational questions teams need to answer first, such as ‘What capabilities do I decouple, and when?’ and ‘How do I migrate incrementally?’, in favor of three more specific questions that also slow down migrations if they’re left unanswered:

How do I migrate safely, with full data observability into both my legacy systems and my cloud applications running on Kafka?
How do I ensure productivity for developers building microservices?
How do I govern my data fabric across my event-driven architecture once I get there? If you don't expose data and metadata and make them consumable for other teams, these business questions cannot be answered.

Technology specialization or data observability?

There is a major catch when it comes to overseeing a healthy migration. More data domains mean more diverse problems, which require more specialization. It is a lot easier to find support for an issue in an Oracle-based legacy monolith than in a marketing application built on Apache Spark, Cassandra and Apache Kafka.

For those turned off by these stipulations, remember why you chose to break down a monolith: to separate the app delivery cycles of your teams and innovate more easily with less risk. If you run a dev team releasing the front end of an audio streaming service, for example, you can deploy without worrying about corrupting album artwork.

Luckily, you can break down a monolith, push innovation, and still keep high standards of reporting and collaboration.

The stairway to greater data products

A recent talk by Goldman Sachs at AWS re:Invent detailed how they are re-architecting their applications to be event-driven, decoupled through an Amazon MSK (Managed Streaming for Apache Kafka) cluster. Each Goldman Sachs product team would have different tooling, practices and technology. If the CTO wanted to build a new service that overlaps the payment services and CRM projects, it would be a challenge to understand the data domains across the two teams.

The one piece in common is the data fabric in Kafka. To have your cake and eat it too, you must lean into DataOps to create a data mesh. If you're breaking up your application and engineering team into seven pieces, you will need great collaboration tooling to bring the data domains back together.

Beyond a surge in the number of discrete components you need to govern, there are a few other challenges and moving parts to monitor across microservices:

Infrastructure metrics, such as host CPU and memory
Container runtime and Kubernetes metrics
Application metrics, such as request rate and duration
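As a rough illustration of the application metrics in that list, the sketch below derives a request rate and a mean duration from a handful of request records. The Request type, its field names and the summarize helper are hypothetical, invented for this example; in practice these figures would normally come from your metrics tooling rather than hand-rolled code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Request:
    """One handled request. Field names are hypothetical."""
    started_at: float   # seconds since the start of the window
    duration_ms: float  # time spent serving the request


def summarize(requests: list[Request], window_seconds: float) -> dict[str, float]:
    """Return the request rate (per second) and mean duration for one window."""
    if not requests or window_seconds <= 0:
        return {"rate_per_s": 0.0, "mean_duration_ms": 0.0}
    return {
        "rate_per_s": len(requests) / window_seconds,
        "mean_duration_ms": mean(r.duration_ms for r in requests),
    }


# Three made-up requests observed in a 60-second window.
sample = [Request(0.0, 120.0), Request(1.5, 80.0), Request(3.0, 100.0)]
summary = summarize(sample, window_seconds=60.0)
```

Tracking these two numbers per service, rather than per monolith, is what lets each team see the effect of its own deployments.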

Developers need to ensure that data can be consumed by different teams, potentially with varying levels of technical skill, and that those teams have the context to understand and operate streaming data flows and applications.

Lenses can help reduce the complexity in managing your data fabric when breaking down a monolith into microservices with Apache Kafka. One impactful way of interfacing with this data is through a data catalog, enabling developers to find, inventory and analyze distributed and diverse datasets across your streaming data and applications.
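To make the idea of a data catalog concrete, here is a minimal in-memory sketch of what "find and inventory" could look like. It is not the Lenses API: the DatasetEntry and Catalog classes, the topic names and the field names are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Metadata for one stream. All attributes are illustrative."""
    name: str
    owner_team: str
    columns: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)


class Catalog:
    """A toy catalog: register entries, then search their metadata."""

    def __init__(self) -> None:
        self._entries: dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, term: str) -> list[str]:
        """Return sorted names of entries whose name, columns or tags contain the term."""
        needle = term.lower()
        hits = []
        for entry in self._entries.values():
            haystack = [entry.name, *entry.columns, *entry.tags]
            if any(needle in text.lower() for text in haystack):
                hits.append(entry.name)
        return sorted(hits)


# Hypothetical topics owned by three different teams.
catalog = Catalog()
catalog.register(DatasetEntry("payments.transactions", "payments", ["customer_id", "amount"], ["pii"]))
catalog.register(DatasetEntry("crm.contacts", "crm", ["customer_id", "email"], ["pii"]))
catalog.register(DatasetEntry("ops.host_metrics", "platform", ["host", "cpu"]))

shared = catalog.search("customer_id")
```

Searching for customer_id surfaces both the payments and the CRM streams, which is the kind of cross-team overlap a new service spanning those two domains would need to discover.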

Metadata can then be consumed by different teams via this real-time data catalog almost as you would use Google Search, or peruse a digital library of all Jimmy Page’s guitar riffs to write something new. Here you can do the following:

Debug applications and inspect message payloads with SQL
Look up partitioning information
Oversee infrastructure health as you decouple and migrate services
View consumer lag
Explore business metrics across different applications to enrich or create new data products
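Of the items above, consumer lag is the easiest to pin down in code: for each partition it is the latest offset in the log minus the offset the consumer group has committed. The sketch below assumes you have already fetched both sets of offsets from your Kafka tooling; the consumer_lag function and the sample numbers are hypothetical.

```python
def consumer_lag(end_offsets: dict[int, int], committed: dict[int, int]) -> dict[int, int]:
    """Per-partition lag: latest log offset minus the group's committed offset."""
    lag = {}
    for partition, end in end_offsets.items():
        # Assumption: a partition with no committed offset counts as unread from offset 0.
        lag[partition] = max(end - committed.get(partition, 0), 0)
    return lag


# Made-up offsets for a topic with three partitions.
end_offsets = {0: 1500, 1: 1420, 2: 900}
committed = {0: 1500, 1: 1300}  # nothing committed yet for partition 2

lag = consumer_lag(end_offsets, committed)
total_lag = sum(lag.values())
```

A lag that keeps growing on one partition usually means a consumer is stuck or under-provisioned, which is worth watching closely while services are being decoupled and moved.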

Cataloguing data is a valuable way to simplify operations on the road from monolith to microservices.

Breaking down the monolith remains a necessity for a reason. Industries like banking and insurance are flooded with new, nimble competition. So stick with your convictions, innovate, and build data products. Just be sure you prepare a data mesh to keep the application consumable across teams.

To bring this full circle, Stairway to Heaven remains one of the greatest songs of all time, no matter what Wayne’s World or any critic says.