Skipping Application Monitoring is the Biggest Anti-Pattern in Application Observability

Chris Farrell

An anti-pattern arises when people recognize a problem and adopt a suboptimal solution that becomes broadly embraced as the go-to method for solving it. The solution sounds good in theory, but for one reason or another it is not the best means of solving the problem.

A common example of this involves gasoline and rising prices. As prices go up, consumers tend to avoid getting gas as long as possible, until they are running on fumes. In reality, the best way to save money during this time would be to fill up your tank every chance you get.
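A rough sketch of that arithmetic follows. Every number in it is an assumption chosen for illustration (a 12-gallon tank, 2 gallons burned per day, and a price that starts at $3.00 and rises 5 cents a day); none of these figures come from real data. Both drivers start and end the 30 days with a full tank and buy the same 60 gallons, so the only difference is when they buy.

```python
def total_cost_cents(days, tank_gal, burn_per_day, price_cents, refill_when):
    """Total spend over `days`, starting with a full tank."""
    level = tank_gal
    spent = 0
    for day in range(1, days + 1):
        level -= burn_per_day                      # drive for the day
        if refill_when(level):
            spent += (tank_gal - level) * price_cents(day)
            level = tank_gal                       # fill back to the top
    return spent


def rising_price(day):
    """Hypothetical price in cents per gallon: $3.00 plus 5 cents per day."""
    return 300 + 5 * day


# Strategy 1: wait until the tank is empty, then fill it completely.
wait_cost = total_cost_cents(30, 12, 2, rising_price, lambda level: level <= 0)

# Strategy 2: top up every day.
top_up_cost = total_cost_cents(30, 12, 2, rising_price, lambda level: True)

print(f"wait until empty: ${wait_cost / 100:.2f}")
print(f"top up daily:     ${top_up_cost / 100:.2f}")
```

Under these assumptions the driver who waits always buys at the end of each price climb, while the driver who tops up buys part of the same fuel at earlier, lower prices. The result only holds while prices keep rising; with flat or falling prices the advantage disappears or reverses.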

Anti-patterns are common across IT as well, especially around application monitoring and observability. One that is particularly prevalent is in response to the increasing complexity of cloud-native infrastructure and applications. The [suboptimal] idea is that the best way to monitor modern applications is to not install monitoring, but rather have developers manually code in their own monitoring capabilities, put all the data into logs, and solve problems by analyzing custom dashboards and the resulting log files.

The reality is that this concept tends to lead to a multitude of visibility gaps, and can even send SWAT teams down the wrong path, depending on what's instrumented, collected and shown. The worst case would be application slow-downs, or even outages, occurring — all while the dashboards show "all systems green."

The problem with anti-patterns is that a popular idea can gain ground even if the solution is suboptimal. For the aforementioned gasoline issue, it might take some math on a napkin to show how a different process can save money. For IT monitoring strategies, it might take a little bit more. To understand when a specific solution or process is an anti-pattern, and how to solve the problem in a more optimal way, it's important to recognize what led to the situation, identify the ultimate goal, and then open up to different solutions.

What Caused the Application Monitoring Anti-Pattern?

In the case of cloud-native application performance, the problem starts with legacy application monitoring tools. These tools require continuous configuration, and even some manual coding, to reach their full value, and they can slow down the DevOps and continuous integration / continuous deployment (CI/CD) process by requiring reconfiguration every time an update is released. There's always a chance that if the reconfiguration isn't done (and done right), the tool will not have the right data to either recognize a problem or solve it.

This is what has led many to eschew the idea of a monitoring tool and, instead, have their developers instrument monitoring into the code and simply analyze everything in logs themselves. They recognize how time-consuming and menial log analysis is, but it's seen as the lesser of two evils when compared to constant reconfiguration of monitoring.

But this approach isn't exactly optimal either. If the developers don't capture the right information at the right time, then the log analysis strategy is just as iffy as an unconfigured application performance monitoring (APM) tool. Meanwhile, the only way to understand how any two pieces fit together is to bring the entire team into the analysis phase, which probably means even bigger bridge calls than with the APM SWAT team approach.
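A minimal sketch of how that kind of gap can arise is shown below. All names here (`handle_checkout`, `call_payment_service`, `dashboard_status`) are hypothetical, and the slow dependency is simulated with a plain number rather than a real network call. The developer logs the start and end of each request, but nothing about the downstream call, and the "dashboard" only looks for error lines.

```python
import io
import logging

# Capture log output in memory so the "dashboard" below can read it.
log_stream = io.StringIO()
logger = logging.getLogger("checkout")
logger.addHandler(logging.StreamHandler(log_stream))
logger.setLevel(logging.INFO)
logger.propagate = False


def call_payment_service(simulated_latency_ms):
    """Hypothetical downstream dependency. Nobody instrumented this call."""
    return {"status": "ok", "latency_ms": simulated_latency_ms}


def handle_checkout(order_id, simulated_latency_ms):
    """Hand-instrumented handler: logs start and finish, but not timing."""
    logger.info("checkout start order=%s", order_id)
    result = call_payment_service(simulated_latency_ms)
    logger.info("checkout done order=%s status=%s", order_id, result["status"])
    return result


def dashboard_status(log_text):
    """Hypothetical log-driven dashboard: red only if an ERROR line exists."""
    return "red" if "ERROR" in log_text else "green"


# Every request takes a simulated 8 seconds in the payment service.
results = [handle_checkout(i, simulated_latency_ms=8000) for i in range(3)]
status = dashboard_status(log_stream.getvalue())
slowest_ms = max(r["latency_ms"] for r in results)

print(f"dashboard: {status}, slowest request: {slowest_ms} ms")
```

In this sketch the dashboard stays green because the logs contain only what the developer thought to record. The slowdown is real, but it never reaches the logs, so log analysis alone cannot surface it.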

Finding A Better Solution

As with any anti-pattern, including our real-world example above, the way to find an optimal solution is to start with the goal and make sure you're working towards that goal. In the gasoline example, people generally equate less frequent purchases with spending less, but if they instead focus on the actual cost itself, they can recognize an alternative that better achieves their goal of minimizing costs.

The same is true in application monitoring. The goal is to get the most immediate feedback on any software update, to proactively understand when a problem is occurring, and to solve the problem easily and quickly.

IT teams know that they want:

■ Monitoring up and down the cloud-native stack

■ Understanding within monitoring when changes occur

■ Access to data (and understanding) from a broader set of stakeholders

Certainly, the idea of developers coding their own monitoring and tracing, coupled with direct log analysis by every stakeholder, meets the above. But does it truly achieve the ultimate goals of Dev+Ops when it comes to operating their applications?

Let's tackle the problems and misconceptions of this observability anti-pattern:

Configuring monitoring is hard — no one wants to spend the time or investment needed to even get going with a monitoring tool.

We agree, it can be hard. But there are monitoring and observability solutions that automate the hard part (we promise, they exist). You shouldn't avoid the idea of monitoring because of the traditional hurdles involved in setting this up.

We can provide data for everyone to use! No observability tool needed.

What does providing a firehose of all data to all users create? A lot of wasted time, inefficiency, and unfocused analysis.

The problem here is this: if you provide all the data to a user, it will take forever to sort through what is relevant to them. Or, if you provide only the specific data related to, for example, the application they care about, they won't have the context needed to fully understand the situation.

What if an issue isn't the application itself, but a specific user?

What if there were previous outages for this application?

Monitoring solutions, after being implemented, can provide data with accurate context, automatically, so you can view your applications in the scope of everything else going on.
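To make the "specific user" question concrete, here is a small sketch using made-up events. The field names (`app`, `user`, `error`) and all values are assumptions for illustration only. It compares what a stakeholder sees when given only their own application's data with what the same data shows when viewed across applications.

```python
from collections import Counter

# Hypothetical events from three applications.
events = [
    {"app": "checkout", "user": "u-17", "error": True},
    {"app": "search",   "user": "u-17", "error": True},
    {"app": "profile",  "user": "u-17", "error": True},
    {"app": "checkout", "user": "u-02", "error": False},
    {"app": "search",   "user": "u-05", "error": False},
    {"app": "profile",  "user": "u-09", "error": False},
]


def errors_by(field, rows):
    """Count error events grouped by the given field."""
    return Counter(row[field] for row in rows if row["error"])


# View 1: the checkout team sees only checkout data.
checkout_only = [e for e in events if e["app"] == "checkout"]
checkout_view = errors_by("app", checkout_only)

# View 2: the same events with cross-application context, grouped by user.
user_view = errors_by("user", events)

print("checkout-only view:", dict(checkout_view))
print("per-user view:     ", dict(user_view))
```

In the narrow view, checkout appears to have a single isolated error. With the wider context, all three errors trace back to one user, which points the investigation somewhere other than the application itself.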

How can a monitoring / observability solution enable intelligent decision-making? How do we make it so the right people get the right data and make the best decisions they can?

These are the questions to be asking and the real challenges to solve for. A modern monitoring solution can help answer these questions when it offers:
- Real-time automation
- Automation of configuration
- Data within context
- A machine learning engine that improves and delivers data to all other AIOps platforms too

Legacy monitoring solutions have led organizations astray, leaving them thinking they can save time, effort, and cost by not implementing APM in cloud-native architectures. But modern monitoring solutions were designed for these modern environments and are the best way for organizations to save time, effort, and money while empowering the entire IT team.

