APM and Observability: Cutting Through the Confusion — Part 9

Pete Goldin
APMdigest

The story of the evolution of Observability to encompass APM and other IT performance management capabilities would not be complete without discussing the monumental impact of open source.

Start with: APM and Observability - Cutting Through the Confusion - Part 8

Open source is transforming how organizations approach APM and observability by providing vendor-neutral standards for collecting and exporting all types of telemetry, says Mimi Shalash, Observability Advisor at Splunk, a Cisco Company.

Solutions like OpenTelemetry simplify integration across platforms, reduce vendor lock-in, and improve interoperability in complex environments, Shalash continues. Prometheus enhances this approach with robust metrics and alerting, especially in systems like Kubernetes. Together, these tools enable flexible, cost-effective stacks designed to scale and evolve with modern infrastructure.
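To make the Kubernetes alerting point concrete, here is a minimal sketch of a Prometheus alerting rule. It assumes the `kube_pod_container_status_restarts_total` metric exposed by kube-state-metrics; the group name, thresholds, and annotation text are illustrative, not prescriptive.

```yaml
groups:
  - name: kubernetes-pod-alerts  # hypothetical rule group
    rules:
      - alert: PodHighRestartRate
        # Fire when a container restarts more than 3 times in 15 minutes.
        expr: increase(kube_pod_container_status_restarts_total[15m]) > 3
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.pod }} is restarting frequently"
```

Rules like this are evaluated by the Prometheus server itself, so alerting logic travels with the open source stack rather than living in any one vendor's platform.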

“Open source tools like OpenTelemetry and Prometheus are becoming essential building blocks for observability in modern, cloud-native environments,” explains Andreas Grabner, Fellow DevRel and CNCF Ambassador, Dynatrace. “They empower organizations with greater flexibility and standardization in how telemetry data is collected. The broader industry trend is moving toward interoperability and data unification—using open standards for collection while relying on more advanced platforms to contextualize, analyze and act on that data at scale. This hybrid model allows teams to preserve their existing investments in open source while benefiting from automation, AI and enterprise grade observability.”

“The observability space is a prime target for OSS,” Sven Delmas, VP of Research at Mezmo, agrees. “Between dealing with a tech-savvy and curious audience, constant pressure on cost control, and the need for transparency and avoiding vendor lock-in, there has been — and will be — an ever-increasing push to OSS.”

Driving Observability's Evolution

Open source is changing the center of gravity in observability from tools to telemetry, according to Brian Douglas, Head of Ecosystem, Cloud Native Computing Foundation (CNCF). Developers are adopting Prometheus, OpenTelemetry, and Fluent Bit not just because they're free or flexible, but because they represent an open, portable foundation. These tools make it easier to switch vendors, build internal platforms, and innovate on top of shared standards. They're not just part of the observability conversation; they're shaping the future of how observability is defined.

APM is one specific implementation of observability, not its full scope, Douglas continues. It answers questions like, "Is this app performing within expected parameters?" Observability, in contrast, supports deeper exploration: "Why did latency spike in a downstream service for certain regions?" Projects like Prometheus and OpenTelemetry enable this broader context by collecting high-dimensional metrics, distributed traces, and logs, which gives teams the raw, interoperable data needed to connect the dots.

Observability supports cross-signal correlation and open-ended investigation, Douglas adds. Rather than focusing solely on applications, it lets teams visualize the full stack, from container runtimes and infrastructure to network topology and business-level SLIs.

  • Prometheus provides robust, flexible metrics, while Cortex scales them across environments.
  • Fluent Bit and Fluentd handle log aggregation and routing across edge and core environments.
  • OpenTelemetry standardizes telemetry collection and enriches it with context, making it easier for tools and teams to interoperate without reinventing the wheel.
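As a sketch of the log-routing role described above, the following Fluent Bit configuration tails container logs at the edge, enriches them with Kubernetes metadata, and forwards them to a central aggregator. The host name is a hypothetical placeholder; the `tail` input, `kubernetes` filter, and `forward` output are standard Fluent Bit plugins.

```ini
[INPUT]
    Name   tail
    Path   /var/log/containers/*.log
    Tag    kube.*

[FILTER]
    # Enrich records with pod, namespace, and label metadata
    Name   kubernetes
    Match  kube.*

[OUTPUT]
    # Ship enriched logs to a central aggregation tier
    Name   forward
    Match  kube.*
    Host   aggregator.example.internal
    Port   24224
```

The same pattern scales from a single node to large edge fleets, with Fluentd or another Fluent Bit instance playing the aggregator role.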

“What's important is interoperability,” Douglas of CNCF explains. “With standards like OpenTelemetry and protocols like the Prometheus exposition format, teams can adopt a modular approach: instrument once, analyze anywhere. This lets them use best-in-class components rather than be locked into a monolithic solution. Observability isn't a single tool, it's a strategy backed by open, composable tooling.”
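The Prometheus exposition format Douglas mentions is simple enough to illustrate in a few lines. The sketch below, using only the Python standard library, renders metric samples in that text format; the function and metric names are illustrative, and real services would typically use a client library rather than hand-rolling this.

```python
def render_exposition(name, help_text, metric_type, samples):
    """Render metric samples in the Prometheus text exposition format.

    samples: list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            # Label pairs are rendered as key="value", comma-separated.
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


text = render_exposition(
    "http_requests_total",
    "Total HTTP requests served.",
    "counter",
    [({"method": "GET", "status": "200"}, 1027),
     ({"method": "POST", "status": "500"}, 3)],
)
print(text)
```

Because the format is an open, text-based protocol, any scraper or backend that speaks it can consume these metrics, which is exactly the "instrument once, analyze anywhere" property described above.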

With OpenTelemetry, users can build a composable observability stack where each tool plays to its strengths: one might excel at exploratory debugging, another at automated root cause analysis, and a third at cost-effective long-term storage, Severin Neumann, Head of Community & Developer Relations at Causely, elaborates. This flexibility lets teams get the best outcomes for their specific needs without duplicating instrumentation or locking themselves into a one-size-fits-all solution.
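A composable stack of this kind is often wired together with an OpenTelemetry Collector pipeline. The sketch below receives OTLP once and fans traces out to two backends while routing metrics to a Prometheus-compatible endpoint; the endpoint URLs are hypothetical placeholders, though `otlp`, `batch`, `otlphttp`, and `prometheus` are standard Collector components.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  batch:

exporters:
  # Hypothetical backends, each playing to its strengths
  otlphttp/debugging:
    endpoint: https://debug-backend.example.com
  otlphttp/archive:
    endpoint: https://archive.example.com
  prometheus:
    endpoint: 0.0.0.0:8889

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/debugging, otlphttp/archive]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
```

Swapping a backend then means editing an exporter entry, not re-instrumenting applications.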

OpenTelemetry: Reshaping APM and Observability

OpenTelemetry, in particular, has already profoundly reshaped the APM market and the broader observability field, explains Juraci Paixão Kröhling, Software Engineer at OllyGarden. “Initially, some established players might have overlooked it, but strong customer demand has made OpenTelemetry support almost table stakes now; it's rare to find a vendor unable to ingest the standard OTLP format.”

OpenTelemetry is an open source standard, framework and suite of tools facilitating the generation, collection, and exporting of telemetry data.

“OpenTelemetry is having a huge impact on the industry with studies showing that nearly half of organizations polled are using OpenTelemetry with another 25-percent-plus looking to adopt in the near term,” says Harald Burose, Director, Product Management, Research & Development – Engineering, OpenText.

Download the EMA Report: Taking Observability to the Next Level - OpenTelemetry’s Emerging Role in IT Performance and Reliability

Kröhling from OllyGarden continues, “I expect vendors will increasingly embrace OpenTelemetry more natively, treating its semantic conventions not just as data points but as first-class citizens for richer understanding. The era of requiring proprietary agents for basic data collection is closing; customers now expect tools not only to handle open formats but to do so meaningfully, respecting the common language defined by standards like OpenTelemetry. This shared foundation allows everyone to cultivate better systems.”

OpenTelemetry provides teams with greater flexibility, standardization, and control over their telemetry data and has become the de facto standard for data ingestion, according to Bahubali Shetti, Senior Director, Product Marketing, Elastic. Whether deployments use standard OTel SDKs, auto-instrumentation, OTel Collectors, or a combination of these, users can avoid vendor lock-in and reduce the need for future retooling.

“OpenTelemetry isn't just shaping the future of observability, it's quickly becoming the standard that modern, scalable systems are built on,” concludes Shalash from Splunk.

Go to: APM and Observability: Cutting Through the Confusion — Part 10, discussing AI's impact on APM and Observability.

Pete Goldin is Editor and Publisher of APMdigest

