
APM and Observability: Cutting Through the Confusion — Part 9

Pete Goldin
APMdigest

The story of the evolution of Observability to encompass APM and other IT performance management capabilities would not be complete without discussing the monumental impact of open source.

Start with: APM and Observability - Cutting Through the Confusion - Part 8

Open source is transforming how organizations approach APM and observability by providing vendor-neutral standards for collecting and exporting telemetry, says Mimi Shalash, Observability Advisor at Splunk, a Cisco Company.

Solutions like OpenTelemetry simplify integration across platforms, reduce vendor lock-in, and improve interoperability in complex environments, Shalash continues. Prometheus enhances this approach with robust metrics and alerting, especially in systems like Kubernetes. Together, these tools enable flexible, cost-effective stacks designed to scale and evolve with modern infrastructure.

“Open source tools like OpenTelemetry and Prometheus are becoming essential building blocks for observability in modern, cloud-native environments,” explains Andreas Grabner, Fellow DevRel and CNCF Ambassador, Dynatrace. “They empower organizations with greater flexibility and standardization in how telemetry data is collected. The broader industry trend is moving toward interoperability and data unification—using open standards for collection while relying on more advanced platforms to contextualize, analyze and act on that data at scale. This hybrid model allows teams to preserve their existing investments in open source while benefiting from automation, AI and enterprise grade observability.”

“The observability space is a prime target for OSS,” Sven Delmas, VP of Research at Mezmo, agrees. “Between dealing with a tech-savvy and curious audience, constant pressure on cost control, and the need for transparency and avoiding vendor lock-in, there has been — and will be — an ever-increasing push to OSS.”

Driving Observability's Evolution

Open source is changing the center of gravity in observability from tools to telemetry, according to Brian Douglas, Head of Ecosystem, Cloud Native Computing Foundation (CNCF). Developers are adopting Prometheus, OpenTelemetry, and Fluent Bit not just because they're free or flexible, but because they represent an open, portable foundation. These tools make it easier to switch vendors, build internal platforms, and innovate on top of shared standards. They're not just part of the observability conversation; they're shaping the future of how observability is defined.

APM is one specific implementation of observability, not its full scope, Douglas continues. It answers questions like, 'Is this app performing within expected parameters?' Observability, in contrast, supports deeper exploration: 'Why did latency spike in a downstream service for certain regions?' Projects like Prometheus and OpenTelemetry enable this broader context by collecting high-dimensional metrics, distributed traces, and logs, which gives teams the raw, interoperable data needed to connect the dots.

Observability supports cross-signal correlation and open-ended investigation, Douglas adds. Rather than focusing solely on applications, it lets teams visualize the full stack, from container runtimes and infrastructure to network topology and business-level SLIs.

  • Prometheus provides robust, flexible metrics, while Cortex scales them across environments.
  • Fluent Bit and Fluentd handle log aggregation and routing across edge and core environments.
  • OpenTelemetry standardizes telemetry collection and enriches it with context, making it easier for tools and teams to interoperate without reinventing the wheel.
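As a rough illustration of the cross-signal correlation Douglas describes, the sketch below attaches log records to the trace spans they belong to. It assumes each log record carries `trace_id` and `span_id` fields; the record shapes, field names, and the `correlate` helper are hypothetical and not taken from any particular tool.

```python
from collections import defaultdict


def correlate(spans, logs):
    """Attach each log message to the span sharing its trace_id and span_id.

    Both inputs are lists of plain dicts (a hypothetical shape chosen for
    this sketch, not a real SDK type).
    """
    logs_by_span = defaultdict(list)
    for record in logs:
        key = (record["trace_id"], record["span_id"])
        logs_by_span[key].append(record["message"])

    # Return copies of the spans, each with the log messages that match it.
    return [
        {**span, "logs": logs_by_span.get((span["trace_id"], span["span_id"]), [])}
        for span in spans
    ]


if __name__ == "__main__":
    spans = [
        {"trace_id": "t1", "span_id": "s1", "name": "checkout", "duration_ms": 840},
        {"trace_id": "t1", "span_id": "s2", "name": "charge-card", "duration_ms": 790},
    ]
    logs = [
        {"trace_id": "t1", "span_id": "s2", "message": "payment gateway timeout, retrying"},
    ]
    for span in correlate(spans, logs):
        print(span["name"], span["duration_ms"], span["logs"])
```

With shared identifiers in place, a slow span and the log line that explains it can be viewed together, which is the kind of open-ended investigation the paragraph above refers to.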

“What's important is interoperability,” Douglas of CNCF explains. “With standards like OpenTelemetry and protocols like the Prometheus exposition format, teams can adopt a modular approach: instrument once, analyze anywhere. This lets them use best-in-class components rather than be locked into a monolithic solution. Observability isn't a single tool, it's a strategy backed by open, composable tooling.”
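To give a sense of what the Prometheus exposition format looks like, here is a minimal sketch that renders one metric family as plain text using only the Python standard library. The `render_metric` and `escape_label_value` helpers and the sample metric are assumptions made for illustration; production services would normally rely on an official client library instead.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, metric_type, help_text, samples):
    """Render one metric family in the Prometheus text exposition format.

    `samples` is a list of (labels, value) pairs, where labels is a dict.
    This is an illustrative helper, not part of any Prometheus library.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            # Sort labels so the output is stable from run to run.
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        render_metric(
            "http_requests_total",
            "counter",
            "Total HTTP requests.",
            [({"method": "post", "code": "200"}, 1027)],
        ),
        end="",
    )
```

Because the format is plain, line-oriented text, any backend that understands it can scrape the same endpoint, which is the sense in which teams can instrument once and analyze anywhere.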

With OpenTelemetry, users can build a composable observability stack where each tool plays to its strengths: one might excel at exploratory debugging, another at automated root cause analysis, and a third at cost-effective long-term storage, Severin Neumann, Head of Community & Developer Relations at Causely, elaborates. This flexibility lets teams get the best outcomes for their specific needs without duplicating instrumentation or locking themselves into a one-size-fits-all solution.

OpenTelemetry: Reshaping APM and Observability

OpenTelemetry, in particular, has already profoundly reshaped the APM market and the broader observability field, explains Juraci Paixão Kröhling, Software Engineer at OllyGarden. “Initially, some established players might have overlooked it, but strong customer demand has made OpenTelemetry support almost table stakes now; it's rare to find a vendor unable to ingest the standard OTLP format.”

OpenTelemetry is an open source standard, framework and suite of tools facilitating the generation, collection, and exporting of telemetry data.

“OpenTelemetry is having a huge impact on the industry with studies showing that nearly half of organizations polled are using OpenTelemetry with another 25-percent-plus looking to adopt in the near term,” says Harald Burose, Director, Product Management, Research & Development – Engineering, OpenText.

Download the EMA Report: Taking Observability to the Next Level - OpenTelemetry’s Emerging Role in IT Performance and Reliability

Kröhling from OllyGarden continues, “I expect vendors will increasingly embrace OpenTelemetry more natively, treating its semantic conventions not just as data points but as first-class citizens for richer understanding. The era of requiring proprietary agents for basic data collection is closing; customers now expect tools not only to handle open formats but to do so meaningfully, respecting the common language defined by standards like OpenTelemetry. This shared foundation allows everyone to cultivate better systems.”

OpenTelemetry provides teams with greater flexibility, standardization, and control over their telemetry data and has become the de facto standard for data ingestion, according to Bahubali Shetti, Senior Director, Product Marketing, Elastic. Whether deployments use standard OTel SDKs, auto-instrumentation, OTel Collectors, or a combination of these, users can avoid vendor lock-in and reduce the need for future retooling.

“OpenTelemetry isn't just shaping the future of observability, it's quickly becoming the standard that modern, scalable systems are built on,” concludes Shalash from Splunk.

Go to: APM and Observability: Cutting Through the Confusion — Part 10, discussing AI's impact on APM and Observability.

Pete Goldin is Editor and Publisher of APMdigest
