Monitoring as Code: Worth The Hype?
November 01, 2022

Hannes Lenke
Checkly


Configuring application Monitoring as Code (MaC) is the next logical step in modern software development. Today, configuring monitoring is often an overly manual process. It's a bottleneck that DevOps teams are addressing to ship code faster with greater confidence.

Before we explore the relatively new MaC concept, we should step back and discuss the "as Code" movement in general. The most prominent current example is Infrastructure as Code (IaC), which became the gold standard for infrastructure provisioning in recent years. IaC lets developers write files that define how servers should be set up. Building on that concept, IaC tools apply those configurations automatically, often fully integrated into the CI/CD process.
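To make the pattern concrete, here is a minimal IaC sketch using Pulumi's TypeScript SDK; the resource name and tags are hypothetical, and other tools (Terraform, CloudFormation) follow the same declare-then-apply model:

```typescript
import * as aws from "@pulumi/aws";

// Declare the desired state: one S3 bucket with versioning enabled.
// Running `pulumi up` compares this declaration with what actually
// exists in the cloud account and creates or updates resources to match.
const artifactBucket = new aws.s3.Bucket("artifact-bucket", {
    versioning: { enabled: true },
    tags: { team: "platform" }, // hypothetical tag
});

// Export the generated bucket name so CI steps or other stacks can use it.
export const bucketName = artifactBucket.id;
```

Because the definition is just code in the repository, it is reviewed, versioned, and applied automatically on every deploy rather than clicked together in a console.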

Bringing key aspects of the software development workflow closer to the application code enables developers to automate and ultimately ship their services faster and more often, continuously. Hence, the 'as code' approach has become popular in recent years. However, continuous delivery (CD) requires more than infrastructure automation; it also requires automating other aspects of software delivery. Without this additional automation, how would DevOps teams be able to ship code updates dozens of times a day or even more often?

Next to automation, one key aspect of CD is that cross-functional DevOps teams are now responsible for their services end to end. The motto "You build it; you test it; you run it!" rings true for teams tasked not only with shipping often but also with testing and operating the services they deploy. It's vital for modern DevOps teams to embrace automation across the other functions in their pipeline, including crucial aspects like monitoring. In that context, health and performance monitoring need to be described as code too.

Let's look at some key reasons why monitoring as code is here to stay.

Monitoring shouldn't become the bottleneck for software delivery

Creating checks for larger APIs or websites is often a repetitive manual task that consumes a lot of time. In addition, the demand on DevOps teams to make daily — or even hourly — changes to target applications translates into exploding workloads and testing requirements.
In contrast, defining something as code enables you to replicate the actions you would usually perform manually — using a UI or CLI — and automate them.
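As an illustration, the snippet below shows one possible shape of a monitoring check defined as code, modeled on Checkly's TypeScript constructs (a sketch only: the endpoint URL and IDs are hypothetical, and exact APIs may differ between versions):

```typescript
import { ApiCheck, AssertionBuilder } from "checkly/constructs";

// An HTTP check declared next to the application code it monitors.
// Deploying the project (e.g. `npx checkly deploy`) provisions the check,
// replacing the clicks you would otherwise perform in a monitoring UI.
new ApiCheck("login-api-check", {
  name: "Login API",
  request: {
    method: "GET",
    url: "https://api.example.com/v1/health", // hypothetical endpoint
    assertions: [
      AssertionBuilder.statusCode().equals(200),
      AssertionBuilder.responseTime().lessThan(1000),
    ],
  },
});
```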

Lack of transparency makes cross-team collaboration harder

Traditional monitoring processes require manual provisioning: users either create tickets to have new monitoring resources provisioned for them or request permission to apply the changes themselves. Central IT teams, in turn, often have to work through a patchwork of different UIs and flows.

This makes it difficult to maintain consistency across an entire infrastructure while avoiding duplicated effort across teams. It also complicates auditing changes, making it harder to review wrongly configured monitoring checks and thereby lengthening an important feedback loop.

Monitoring should be CI/CD integrated

Inevitably, the speed of check provisioning fails to match the pace at which the target applications evolve. This stems from a mismatch of approaches: on one side, the CI/CD workflow through which websites and APIs are iterated on; on the other, a fully manual provisioning process.

Applying lessons learned from IaC, MaC brings check definitions closer to the application's source code by having them written as code.

This method allows check definitions to live in source control, boosting cross-team visibility. Additionally, because code is plain text, every change is versioned and leaves an audit trail, which makes it easier to roll back changes in case of incidents.

With software taking over the provisioning of monitoring checks, hundreds or thousands of checks can be created or edited in a matter of seconds. This is a game-changer for development, operations, and DevOps teams, allowing them to reallocate time spent on manual configuration toward improving the coverage and robustness of their monitoring setup.
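As a sketch of what this looks like in practice, the loop below provisions one uniform check per endpoint from a plain list (the endpoints are hypothetical; the same pattern could read from an OpenAPI spec or a service registry):

```typescript
import { ApiCheck, AssertionBuilder } from "checkly/constructs";

// Hypothetical service inventory; in a real project this list could be
// generated from an OpenAPI spec or a service registry.
const endpoints = [
  { id: "users-api", name: "Users API", url: "https://api.example.com/v1/users" },
  { id: "orders-api", name: "Orders API", url: "https://api.example.com/v1/orders" },
  { id: "billing-api", name: "Billing API", url: "https://api.example.com/v1/billing" },
];

// One loop provisions a uniform check per endpoint. Extending coverage to a
// new service is a one-line change, reviewed and rolled back like any code.
for (const { id, name, url } of endpoints) {
  new ApiCheck(id, {
    name,
    request: {
      method: "GET",
      url,
      assertions: [AssertionBuilder.statusCode().equals(200)],
    },
  });
}
```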

To summarize, MaC is revolutionizing the way monitoring is configured by providing:

1. Better scalability through faster, more efficient provisioning

2. Increased transparency and easier rollbacks via source control

3. Unification of previously fragmented processes in a CI/CD workflow

Hannes Lenke is CEO and Co-Founder of Checkly
