APM and Observability: Cutting Through the Confusion — Part 4

Pete Goldin
APMdigest

Observability truly offers a wealth of capabilities that reach far beyond what we traditionally expect from APM, according to Arun Balachandran, Senior Product Marketing Manager, ManageEngine APM Solutions.

While APM excels at meticulously tracking application metrics and promptly alerting us when things go awry, observability empowers our teams to delve much deeper, he continues. It's about enabling us to truly get to the bottom of things, correlating data across logs, traces, and metrics to more easily investigate those unexpected or intricate issues that inevitably arise. This more exploratory approach helps us uncover hidden dependencies, pinpoint subtle performance bottlenecks, and identify potential failure points that might never surface through standard monitoring.

Start with: APM and Observability - Cutting Through the Confusion - Part 3

"APM is still important but it's not enough," adds Gurjeet Arora, CEO and Co-Founder of Observo AI.

The following is a list of several ways experts see Observability surpassing APM or offering capabilities beyond APM:

Broader Coverage

Observability platforms typically provide a much wider lens, encompassing infrastructure, network layers, and even third-party services. This gives us a richer context and a far more complete understanding of how our systems are truly behaving.
Arun Balachandran
Senior Product Marketing Manager, ManageEngine APM Solutions

Observability goes far beyond APM to cover other domains of computing and networking, from routing and switching systems, to compute and storage server infrastructure, to edge systems and endpoints, to orchestration layers and operating systems, to microservices, API gateways, hypervisors and microkernels, to databases and even the data itself.
Peter Corless
Director, Product Marketing, StarTree

Observability should provide a much deeper understanding of system performance within the context of business services and the broader IT environment. Ideally, it should see everything, from cloud-native microservices, containers and VMs, to physical devices and hardware like network devices and laptops.
Douglas James
VP, Solutions & Ecosystem, ScienceLogic

Richer Data Stack

APM generally focuses on specific or predefined metrics for the application layer. It is great at tracking predefined thresholds for known problem areas to ensure specific SLAs and health. Observability can do all that; however, by using a much richer dataset from all aspects of the tech stack, it can paint a larger picture and support much richer root cause analysis.
Sven Delmas
VP of Research, Mezmo

Observability also provides system-wide context by aggregating data from a variety of sources, including servers, containers, databases, and cloud services. This breadth of visibility extends beyond the application runtime, giving teams a holistic view of system behavior and performance.
Gurjeet Arora
CEO and Co-Founder, Observo AI

Comprehensive View of a System's Internal State

While the terms are often used interchangeably, observability goes further than APM. It provides a comprehensive view of a system's internal state through extensive data collection (metrics, logs, and traces) across distributed systems.
Varma Kunaparaju
SVP and GM for Cloud Platform and OpsRamp Software, HPE

Application Performance Monitoring (APM) has a narrower focus, concentrating on user experience (UX) by tracking metrics like latency, error rates, and throughput. It's valuable for surfacing known issues through dashboards and alerts, particularly in traditional environments. Observability, on the other hand, is about the practice of understanding a system's internal state based on its external outputs. This broader approach enables teams to investigate, analyze, and resolve issues even when the root cause is not immediately clear. Observability includes multiple tools and technologies such as application monitoring, log management, tracing, telemetry pipelines, and anomaly detection. Where APM might indicate that latency has spiked, observability helps explain why, offering a deeper view into the system's health and behaviors across the stack.
Ajay Khanna
CMO, Yugabyte

Deeper Insights into Complex Distributed Systems

APM has traditionally been rooted in performance monitoring, with a clear focus on application-level metrics and diagnostics, while observability represents a more expansive and adaptive approach that is aimed at providing deeper insights into complex, distributed systems.
Arun Balachandran
Senior Product Marketing Manager, ManageEngine APM Solutions

The killer innovation of the earliest APM tools was really in design. APM tools were the first tools to bring opinionated workflows to working with production monitoring data. But as systems become more complex, it's necessary to expand beyond the capabilities of traditional APM tools. The core APM workflows are still an incredible starting point for system understanding, as long as we don't mistake them for full observability, which can address a breadth of use cases where traditional monitoring falls short. We may even be able to re-use or re-implement these core workflows in observability tools in many cases.
Emily Nakashima
VP of Engineering, Honeycomb

Understanding How Everything Connects

Observability provides a holistic view across the entire infrastructure stack, enabling teams to understand complex interactions between components that might not be visible through an APM-only lens.
Paul Appleby
CEO, Virtana

APM is a subset of observability. It provides valuable insights into application behavior, typically through metrics and transaction traces. But observability encompasses more than performance; it also includes understanding availability, dependencies, infrastructure health, security signals, and system-wide behavior. It's not just about measuring how fast something is; it's about making sense of how everything connects.
Gurjeet Arora
CEO and Co-Founder, Observo AI

Identifying Unknown Unknowns

Observability isn't just APM with more data types; it's about enabling teams to ask arbitrary questions about system behavior, not just track predefined KPIs. It's a shift from monitoring known issues to exploring unknown unknowns. True observability requires a mindset (and architecture) focused on correlation, context, and flexibility, rather than dashboards alone.
Brian Douglas
Head of Ecosystem, Cloud Native Computing Foundation (CNCF)

While both disciplines rely on telemetry harvested from systems, they address different needs; APM typically focuses on answering pre-defined questions about application health using curated views, whereas observability equips teams to explore the unknown and ask questions they hadn't anticipated. Think of it as knowing the difference between tending to a specific plant based on its known needs versus analyzing the entire garden's soil health to understand unexpected issues.

Observability extends beyond APM by empowering sophisticated users to investigate unpredicted scenarios within complex systems. Where APM often provides answers to questions we already know to ask (like "What is the latency of this service?" and "What are my slowest DB queries?"), observability provides the tools, often rich query languages and flexible visualization platforms, to explore unanticipated behavior, debug novel failure modes, and understand emergent properties of the system as a whole. It's about having the capability to diagnose why a specific section of your garden is behaving unexpectedly, rather than just monitoring the known growth patterns of individual plants.
Juraci Paixão Kröhling
Software Engineer, OllyGarden

The key element that observability should provide is insights into data that help teams identify the unknown unknowns. This is what "should" make observability an upgrade over APM. These concepts matter because unknown failure states are the hardest element of traditional monitoring. For example, traditional monitoring starts by asking how I define failure, so as a first step I create alerts that look for failure as I have defined it. The alert will fire, and teams will respond. The big issue that observability helps with is how to find failure states I do not know about. When an unknown failure state occurs, a traditional monitor will miss the problem and teams will be late to respond, if at all. The damage from the outage is magnified because of the delay, which centers around the inability to detect unknown failure states.
Ed Bailey
Field CISO, Cribl
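
To make the contrast in this section concrete, here is a minimal Python sketch, not taken from any of the experts quoted. It compares a predefined threshold alert with an ad-hoc breakdown over an arbitrary field. The event fields (`region`, `build`, `error`), the 25% threshold, and the function names are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical request events: eight healthy requests on build v41 and
# two failing requests on build v42, spread evenly across two regions.
EVENTS = (
    [{"region": r, "build": "v41", "error": False} for r in ["us-east", "eu-west"] * 4]
    + [
        {"region": "us-east", "build": "v42", "error": True},
        {"region": "eu-west", "build": "v42", "error": True},
    ]
)


def error_rate(events):
    """Fraction of events flagged as errors."""
    return sum(e["error"] for e in events) / len(events)


def alert_fires(events, threshold=0.25):
    """Monitoring-style check: one predefined question, one threshold."""
    return error_rate(events) > threshold


def breakdown(events, field):
    """Exploratory check: error rate grouped by any field the events carry."""
    groups = defaultdict(list)
    for event in events:
        groups[event[field]].append(event)
    return {key: error_rate(group) for key, group in groups.items()}


if __name__ == "__main__":
    print("alert fires:", alert_fires(EVENTS))
    print("by region:", breakdown(EVENTS, "region"))
    print("by build:", breakdown(EVENTS, "build"))
```

In this toy data the overall error rate stays under the threshold, so the alert stays quiet, and grouping by region shows nothing unusual. Grouping by build, a question nobody defined in advance, shows that every request on one build fails.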

High-Cardinality Filtering

Modern observability practices support high-cardinality filtering, which allows teams to minimize noise from unqueried metrics or overly granular tagging. This level of filtering goes beyond the capabilities of most traditional APM tools, enabling more efficient and meaningful analysis.
Gurjeet Arora
CEO and Co-Founder, Observo AI
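
One possible reading of this kind of filtering is a pipeline step that drops metrics nobody queries and strips tags with too many distinct values. The Python sketch below follows that reading. It is an assumption for illustration, not a description of any vendor's product, and the metric names, tag keys, and cardinality limit are hypothetical.

```python
from collections import defaultdict

# Hypothetical metric points; "request_id" is unique per point (high cardinality).
POINTS = [
    {"name": "http.latency", "value": 120, "tags": {"service": "api", "request_id": "r1"}},
    {"name": "http.latency", "value": 95, "tags": {"service": "api", "request_id": "r2"}},
    {"name": "http.latency", "value": 310, "tags": {"service": "web", "request_id": "r3"}},
    {"name": "debug.gc_ticks", "value": 7, "tags": {"service": "api"}},
]


def filter_metrics(points, queried_names, max_tag_cardinality):
    """Keep only queried metrics and remove tags with too many distinct values."""
    kept = [p for p in points if p["name"] in queried_names]

    # Count the distinct values seen for each tag key across the kept points.
    values_by_key = defaultdict(set)
    for point in kept:
        for key, value in point["tags"].items():
            values_by_key[key].add(value)
    noisy = {k for k, v in values_by_key.items() if len(v) > max_tag_cardinality}

    # Rebuild each point without the noisy tag keys; the input is left untouched.
    return [
        {**p, "tags": {k: v for k, v in p["tags"].items() if k not in noisy}}
        for p in kept
    ]


if __name__ == "__main__":
    for point in filter_metrics(POINTS, {"http.latency"}, max_tag_cardinality=2):
        print(point)
```

Here the unqueried `debug.gc_ticks` metric is dropped and the per-request `request_id` tag is stripped, while the low-cardinality `service` tag survives.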

Getting to the Root Cause of Performance Issues

Application-level performance is a critical lens, especially for user-facing services, but it's only one part of the larger story. Observability enables teams to go beyond metrics and get to the root cause of performance issues by correlating logs, traces, and other machine data across systems. In modern, distributed environments, observability is essential to understanding the context of what's going wrong, not just that something is.
Gurjeet Arora
CEO and Co-Founder, Observo AI
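
As a rough illustration of correlating signals, the Python sketch below joins trace spans to log lines through a shared trace ID, so that a slow request is reported together with the log messages that may explain it. The record shapes, the field names (`trace_id`, `duration_ms`, `message`), and the 1000 ms threshold are assumptions, not any particular tool's schema.

```python
from collections import defaultdict

# Hypothetical telemetry: two spans and the log lines emitted during them.
SPANS = [
    {"trace_id": "t1", "service": "checkout", "duration_ms": 80},
    {"trace_id": "t2", "service": "checkout", "duration_ms": 2300},
]
LOGS = [
    {"trace_id": "t2", "message": "db connection pool exhausted"},
    {"trace_id": "t1", "message": "order placed"},
]


def explain_slow_traces(spans, logs, threshold_ms):
    """Attach correlated log messages to every span slower than threshold_ms."""
    logs_by_trace = defaultdict(list)
    for log in logs:
        logs_by_trace[log["trace_id"]].append(log["message"])

    report = {}
    for span in spans:
        if span["duration_ms"] > threshold_ms:
            report[span["trace_id"]] = {
                "service": span["service"],
                "duration_ms": span["duration_ms"],
                "logs": logs_by_trace.get(span["trace_id"], []),
            }
    return report


if __name__ == "__main__":
    print(explain_slow_traces(SPANS, LOGS, threshold_ms=1000))
```

A latency metric alone would show that `checkout` was slow. The joined record also shows what the service logged during that specific request.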

APM and observability serve distinct but complementary roles. APM focuses on ensuring application performance by monitoring key metrics to quickly detect and resolve issues, ensuring a smooth UX. Observability also supports deeper capabilities, like performance troubleshooting, where root cause analysis may rely on fine-grained metrics or trace data unavailable in a traditional APM tool. It supports fine-grained monitoring, such as tracking specific components or behaviors, and adds additional context, which is especially useful for resolving complex, systemic issues or ensuring stability during critical changes like upgrades or migrations. While APM is focused on performance at the application level, observability offers broader, more detailed visibility across the entire system.
Ajay Khanna
CMO, Yugabyte

Delivering Actionable Insights

Observability helps to answer system-wide questions such as, "What's the impact of any state/change/issue on the business in terms of availability, performance, costs, security, and compliance?" Observability answers these questions by turning telemetry into something users can act on, ultimately improving overall system behavior, operational awareness and excellence across teams.
Hugo Kaczmarek
Director of Product, APM Suite, Datadog

Observability transcends mere detection, offering a deeper understanding of the 'why' behind system behaviors. By integrating data from diverse sources, observability facilitates rapid root cause analysis, enhances team collaboration, and empowers organizations to foresee and address potential issues before they affect the customer experience. It's about transforming data into actionable insights. With the addition of GenAI, observability can dynamically orchestrate the collection, analysis, and action phases of IT operations, further enhancing its capabilities.
Gab Menachem
VP ITOM, ServiceNow

Go to: APM and Observability - Cutting Through the Confusion - Part 5, for more insight into how Observability evolved from APM.

Pete Goldin is Editor and Publisher of APMdigest

