APM vs Monitoring in Cloud-Native Environments: Reject the False Dichotomy
August 24, 2018

Apurva Davé
Sysdig

Ask anyone who's managed software in production: Management tools have many useful attributes, but no single tool gives you everything you need. Oh sure, a new interface comes along and handles an emerging use case beautifully – for a while. But requirements inevitably change and new variables get added to the equation. You add tools, upgrade them, or take on more complexity as needed.

This is a familiar arc for developers, IT pros and anyone who manages applications and their underlying infrastructure. And the story is no different when you look at observability tools like application performance management (APM).

For DevOps professionals, the advent of cloud-native systems and X-as-a-service has exposed the limitations of traditional APM tools. Most APM tools were designed to instrument and visualize simpler, static monoliths, and they focus on the application layer, visualizing traces of individual transactions. APM is still sorely needed by developers, but it is not a panacea when it comes to understanding the overall performance of your application.

With cloud-native computing, you may have dozens of microservices and hundreds or thousands of short-lived containers spread across multiple clouds. Microservices are great for developer agility, but microservice architectures have also made it harder for operations teams to ensure the performance, uptime and security of their systems.

In this new world, DevOps teams are finding they need a broader range of functionality to truly understand system performance and potential issues. That functionality includes:

■ Collection of high-frequency, high-cardinality metrics across all containers, applications, and microservices. This data is typically stored for long periods to enable trending, and it is growing more complex in today’s systems

■ Correlation of metrics with events (like a Kubernetes scaling event or a code push)

■ Capture of deep troubleshooting information like logs or system calls to determine the root cause of an issue in the application, the infrastructure, or both

■ Tracing key transactions through the call stack
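
To make the second item concrete, here is a minimal sketch of correlating metrics with events. It is an illustration only, not how any particular product works. It assumes metrics arrive as timestamped samples and events as timestamped descriptions; the names `Sample`, `Event` and `events_before_breaches`, the 60-second window, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Sample:
    ts: float     # seconds since some reference time
    value: float  # e.g. request latency in milliseconds


@dataclass(frozen=True)
class Event:
    ts: float
    description: str  # e.g. a code push or a Kubernetes scaling event


def events_before_breaches(
    samples: List[Sample],
    events: List[Event],
    threshold: float,
    window_s: float = 60.0,
) -> List[Tuple[Sample, List[Event]]]:
    """For each sample above the threshold, list the events that
    occurred within window_s seconds before (or at) that sample."""
    result = []
    for s in samples:
        if s.value > threshold:
            nearby = [e for e in events if s.ts - window_s <= e.ts <= s.ts]
            result.append((s, nearby))
    return result


# Hypothetical data: latency jumps shortly after a code push.
samples = [Sample(0, 120.0), Sample(30, 130.0), Sample(60, 480.0), Sample(90, 510.0)]
events = [Event(45, "code push: checkout-service"), Event(300, "Kubernetes scale-up")]

correlated = events_before_breaches(samples, events, threshold=400.0)
for sample, nearby in correlated:
    print(sample.ts, sample.value, [e.description for e in nearby])
```

In this toy data, both latency breaches are paired with the code push that preceded them, while the later scaling event is ignored. A real system would do the same join across far more series and event sources.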

A New Breed of Monitoring

With this broad range of requirements, it is easy to see that one system is unlikely to serve all of these needs well. And that has led to wider adoption of a new breed of cloud-native IT infrastructure monitoring (ITIM), a device- or capability-oriented approach that focuses on drawing a link between your applications, microservices, and the underlying infrastructure.

According to Gregg Siegfried from Gartner, "IT Infrastructure monitoring has always been difficult to do well. Cloud platforms, containers and changing software architecture have only increased the challenges." (Gartner, "Monitoring Modern Services and Infrastructure" by Gregg Siegfried on 15 March 2018)

Cloud-native systems have radically increased the need for dynamic metric systems. In addition, organizations that need high-volume, high-cardinality metrics (think Facebook or Netflix) used to be the exception, but they are now becoming commonplace across enterprises of all sizes. APM by itself can't meet the needs of these new systems.

As a result, organizations are adopting APM and ITIM alongside each other. Critical management criteria align with different monitoring tools. Performance metrics are associated with ITIM; tracing is aligned with APM; logging is part of incident and event management. While there is some overlap, if we look at their core functionality there is far more differentiation than repetition.

APM typically works with heavyweight instrumentation inside your application code, giving you a detailed look at how the code written by your developers is performing. That’s extremely valuable, especially when developers are debugging their code in test before it goes into production. Unfortunately, APM also abstracts away the underlying containers, hosts, and network infrastructure. That's not an issue for developers since they only need to worry about the code they wrote, but operations professionals must consider the entire stack, and they need something resource-efficient enough to actually deploy across everything in production.

In contrast, a modern, cloud-native ITIM system doesn’t instrument your code. Instead, it instruments all the hosts in your environment, giving you visibility into networks (physical and software-defined), hosts, containers, processes, base application metrics, and developer-provided custom metrics in formats like Prometheus, statsd and JMX.
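
As a small illustration of what a developer-provided custom metric can look like, StatsD clients emit plain-text lines of the form `name:value|type`, where common types include `c` (counter), `g` (gauge) and `ms` (timer). The helper below is a hypothetical sketch that only formats such lines; the function name and metric names are invented for this example, and it does not send anything over the network.

```python
def statsd_line(name: str, value: float, metric_type: str) -> str:
    """Format one StatsD-style line: <name>:<value>|<type>.

    Only three common types are accepted in this sketch:
    c (counter), g (gauge) and ms (timer).
    """
    if metric_type not in {"c", "g", "ms"}:
        raise ValueError(f"unsupported StatsD type: {metric_type}")
    return f"{name}:{value}|{metric_type}"


# Hypothetical metrics a checkout service might emit.
lines = [
    statsd_line("checkout.orders", 1, "c"),
    statsd_line("checkout.queue_depth", 12, "g"),
    statsd_line("checkout.latency", 320, "ms"),
]
for line in lines:
    print(line)
```

An ITIM agent running on the host can collect lines like these alongside its own system-level metrics, without the monitoring system itself touching the application code.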

Scale is also a very different challenge for any implementation using ITIM. APM was not designed for high-frequency, high-cardinality, multi-dimensional metrics, but modern ITIM was conceived for massive scale and for recomputing metrics on the fly based on high-cardinality metadata. Your ITIM tool should let you store all metrics in raw form and recompute the answers to questions on the fly; that capability is essential.
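
A minimal sketch of the idea, under stated assumptions: each raw sample is kept with its full set of labels, and aggregation happens at query time along whichever label the question calls for. The function `aggregate_by`, the label names and the data are hypothetical; a production system would use a purpose-built time-series store rather than an in-memory list.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A raw sample: (timestamp, value, labels). Nothing is pre-aggregated.
RawSample = Tuple[float, float, Dict[str, str]]


def aggregate_by(samples: List[RawSample], label: str) -> Dict[str, float]:
    """Average the raw values per distinct value of one label,
    computed at query time rather than at ingest time."""
    totals = defaultdict(lambda: [0.0, 0])  # label value -> [sum, count]
    for _ts, value, labels in samples:
        key = labels.get(label, "<unlabeled>")
        totals[key][0] += value
        totals[key][1] += 1
    return {k: total / count for k, (total, count) in totals.items()}


# Hypothetical CPU usage (in cores) per container, with metadata as labels.
raw = [
    (0.0, 0.25, {"service": "cart", "pod": "cart-1", "region": "us-east"}),
    (0.0, 0.75, {"service": "cart", "pod": "cart-2", "region": "eu-west"}),
    (0.0, 1.00, {"service": "search", "pod": "search-1", "region": "us-east"}),
]

# The same raw data answers two different questions.
by_service = aggregate_by(raw, "service")
by_region = aggregate_by(raw, "region")
print(by_service)
print(by_region)
```

Because the samples are stored raw, a question nobody anticipated at ingest time (say, usage per region instead of per service) needs no new instrumentation, only a different grouping.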

With this rich functionality, cloud-native ITIM systems give you a powerful view of overall system performance, especially where your applications are interacting with underlying systems.

But again, for most organizations this isn't an either-or situation. You might eliminate your APM tool if you have absolute faith nothing will ever go wrong with your application code. Or you might eliminate your ITIM tool if you're extremely confident your infrastructure, container, and orchestration tooling will always perform as expected. But most DevOps professionals will see through this false dichotomy and use some combination of these tools to ensure performance, reliability and security. And if your organization is focused on the fastest mean time to resolution (MTTR) as a performance metric, it's best to have both systems in place.

Apurva Davé is VP of Marketing at Sysdig