A Guide to OpenTelemetry - Part 7: OTel and AIOps

Pete Goldin
APMdigest

Just as questions arise about how Application Performance Management (APM) and OpenTelemetry impact each other, questions also arise about the relationship between AIOps and OpenTelemetry.

Start with: A Guide to OpenTelemetry — Part 1

Start with: A Guide to OpenTelemetry — Part 2: When Will OTel Be Ready?

Start with: A Guide to OpenTelemetry — Part 3: The Advantages

Start with: A Guide to OpenTelemetry — Part 4: The Results

Start with: A Guide to OpenTelemetry — Part 5: The Challenges

Start with: A Guide to OpenTelemetry — Part 6: OTel and APM

OpenTelemetry Supports AIOps

As the previous installment noted for APM, OpenTelemetry can also support AIOps.

"OpenTelemetry is a data source to AIOps tools," says Jonah Kowall, CTO of Logz.io. "It can also normalize and correlate signals to one another, making it more useful to AIOps solutions which attempt to correlate that data."

Torsten Volk, Managing Research Director, Containers, DevOps, Machine Learning and Artificial Intelligence, at Enterprise Management Associates (EMA), agrees: "OpenTelemetry is critical to enable AIOps to ingest telemetry data from distributed cloud native applications that are often ephemeral, highly scalable, and can easily move between clouds."

Mike Loukides, VP of Emerging Tech Content at O'Reilly Media, clarifies that whether or not you are using AI, if you're automating anything, your automation systems will need standard data formats. "If your web server, your database, and a few hundred microservices are all sending data that's structured differently, you have a problem. That doesn't mean that you can't write an automated system, but it does mean that you're going to spend most of your time dealing with the different data formats rather than writing code to automate your systems. Standardizing on OpenTelemetry solves this problem: you have a single way to send data, and a single set of libraries to receive it."
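The cost Loukides describes can be sketched in a few lines of Python. This is a minimal illustration, not OpenTelemetry code: the two input formats, the `Event` type, and the adapter functions are all hypothetical. It shows that each differently structured source needs its own adapter before any automation logic can run, which is the work a shared format removes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical common shape that the automation code consumes."""
    service: str
    timestamp_ms: int
    latency_ms: float


def from_web_server(line: str) -> Event:
    # Assumed format: "<epoch_seconds> <service> <latency_seconds>"
    ts, service, latency_s = line.split()
    return Event(service, int(float(ts) * 1000), float(latency_s) * 1000)


def from_database(record: dict) -> Event:
    # Assumed format: {"svc": str, "time_ms": int, "duration_us": int}
    return Event(record["svc"], record["time_ms"], record["duration_us"] / 1000)


def slow_services(events, threshold_ms):
    """The actual automation logic: tiny compared with the adapters above."""
    return sorted({e.service for e in events if e.latency_ms > threshold_ms})


events = [
    from_web_server("1700000000 web 0.250"),
    from_database({"svc": "db", "time_ms": 1700000000500, "duration_us": 1200}),
]
print(slow_services(events, threshold_ms=100))
```

With a few hundred services, the adapter layer grows with every new format, while `slow_services` stays the same size. If every source emitted one format, only that last function would need to be written.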

Contextual Information Is Key

OpenTelemetry's appeal in the AIOps use case comes back to the breadth of coverage and the value of the data.

"OpenTelemetry is an enabler of AIOps," says Sajai Krishnan, General Manager, Observability, Elastic. "We all know that ML/AI algorithms LOVE data, but it is not the volume of data that matters. What matters is the relevance of the data and the context shared across traces, metrics, and logs."

Because all telemetry signals are generated using the same source/agent, built-in contextual information is shared across telemetry signals right from the source, notes Nitin Navare, CTO of LogicMonitor, adding, "Thus, OpenTelemetry will complement AIOps in the long run as AI backends will have more contextual information to learn about underlying IT assets."

Daniel Khan, Director of Product Management (Telemetry) at Sentry, adds: "AIOps relies on high-fidelity, contextual data, hence OpenTelemetry can improve the quality of insights provided by AIOps."

OpenTelemetry provides a framework for engineering teams to correlate their observability data between infrastructure and application and also between logs, metrics, and traces, according to Marc Chipouras, Grafana Labs Senior Director, Engineering. "This linked structure allows our AIOps teams to analyze all the data generated from production systems together rather than independently. The connected datasets change the problem set, allowing AIOps tools to understand the whole system rather than subsets of services or workflows."
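One way to picture the linked structure Chipouras describes is a join on a shared trace identifier. The sketch below uses plain Python dictionaries rather than any OpenTelemetry SDK, and the records and field names (`trace_id`, `span_id`, `body`, and so on) are simplified assumptions for illustration. It groups spans and log records that carry the same trace ID so that an analysis step can look at them together.

```python
from collections import defaultdict


def correlate_by_trace(spans, logs):
    """Group spans and log records that share a trace ID."""
    traces = defaultdict(lambda: {"spans": [], "logs": []})
    for span in spans:
        traces[span["trace_id"]]["spans"].append(span)
    for log in logs:
        trace_id = log.get("trace_id")
        # Logs emitted outside any request carry no trace ID and are skipped.
        if trace_id is not None:
            traces[trace_id]["logs"].append(log)
    return dict(traces)


# Hypothetical sample data.
spans = [
    {"trace_id": "t1", "span_id": "s1", "name": "GET /checkout", "duration_ms": 840},
    {"trace_id": "t2", "span_id": "s2", "name": "GET /health", "duration_ms": 3},
]
logs = [
    {"trace_id": "t1", "span_id": "s1", "severity": "ERROR", "body": "payment timeout"},
    {"severity": "INFO", "body": "startup complete"},
]

correlated = correlate_by_trace(spans, logs)
for trace_id, signals in correlated.items():
    print(trace_id, len(signals["spans"]), "span(s),", len(signals["logs"]), "log(s)")
```

Here the slow `GET /checkout` span and the `payment timeout` error end up in the same bucket, which is the kind of connected view that analyzing each signal independently would miss.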

OpenTelemetry also provides a way to collect hard-to-reach performance data. For example, the OpenTelemetry Collector can be used for aggregating and processing data on the edge, making the collector an intelligent part of the AIOps toolset, says Marcin "Perk" Stożek, Software Engineering Manager of Open Source Collection, Sumo Logic.
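The idea of aggregating at the edge can be illustrated with a small, self-contained Python sketch. It is not the OpenTelemetry Collector's implementation or configuration. The function name, the sample data, and the choice of summary statistics are assumptions made for this example. It reduces raw per-request latency samples to one summary per service before anything is forwarded to a backend.

```python
from collections import defaultdict
from statistics import mean


def aggregate_at_edge(samples):
    """Collapse (service, latency_ms) samples into one summary per service."""
    grouped = defaultdict(list)
    for service, latency_ms in samples:
        grouped[service].append(latency_ms)
    return {
        service: {
            "count": len(values),
            "mean_ms": mean(values),
            "max_ms": max(values),
        }
        for service, values in grouped.items()
    }


# Hypothetical raw samples collected close to the workload.
samples = [("cart", 10.0), ("cart", 30.0), ("search", 5.0)]
summary = aggregate_at_edge(samples)
print(summary)
```

The trade-off in this kind of pre-aggregation is that the backend receives less data but also loses per-request detail, so the choice of what to summarize matters for whatever model consumes it.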

Delivering the Right Data

"By providing standard ways to pull in logs, metrics and trace data, OpenTelemetry ensures that ML algorithms have the right signals and rich contextual attributes to build accurate models and make accurate predictions about what is wrong inside your enterprise IT estate," says Krishnan from Elastic. "The correct data helps make better decisions and deliver remediation, especially if those decisions are automated."

"Imagine taking an automated action based on a false positive alert," he adds. "It could be a disaster for your business. Improving the accuracy of the machine learning models by using the correct consolidated and correlated data becomes critical to any action taken."

"An entire application ecosystem has emerged around OpenTelemetry," Krishnan concludes. "Kubernetes now has support for OpenTelemetry, for example, and this will continue to grow as more apps can use OpenTelemetry data. Imagine the possibilities for AIOps as automation tools start to plug into this data. For example, software-defined networks can start to make use of application telemetry data and traces from any source to re-route traffic or automatically improve bandwidth for specific applications delivering a great customer experience."

AIOps Challenges

Martin Thwaites, Developer Advocate at Honeycomb, agrees that OpenTelemetry can be configured with some AIOps solutions for automated responses to detected issues, but he warns not to overestimate the power of the combination: "It is important to note, however, that monitoring and observability can be complex and still require human intervention. For example, an AI model may detect slower runtimes on a website. This could be the result of heavy bot traffic, or maybe you are having a sale on your website that has led to a sharp spike in visitors. OpenTelemetry can be incredibly powerful, but users should be careful not to slip into a 'set it and forget it' approach."
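A minimal sketch of the caution Thwaites raises might look like the following. It uses a simple z-score check as a stand-in for an AI model. The threshold, the function name, and the sample latencies are assumptions for illustration only. The detector does not trigger any remediation itself. It only flags an unusual slowdown for a person to interpret, because the telemetry alone cannot say whether the cause is bot traffic or a successful sale.

```python
from statistics import mean, stdev


def triage_latency(history_ms, current_ms, z_threshold=3.0):
    """Return "ok" or "needs_human_review"; never take action automatically."""
    baseline = mean(history_ms)
    spread = stdev(history_ms)
    if spread == 0:
        # Flat history: any change at all is unusual enough to show a person.
        return "ok" if current_ms == baseline else "needs_human_review"
    z_score = (current_ms - baseline) / spread
    return "needs_human_review" if z_score > z_threshold else "ok"


# Hypothetical recent page latencies in milliseconds.
history = [100, 102, 98, 101, 99]
print(triage_latency(history, 101))
print(triage_latency(history, 180))
```

Routing the flagged case to a person rather than to an automated rollback or scale-down is the design choice that avoids the "set it and forget it" trap.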

Go to: A Guide to OpenTelemetry — Part 8: Getting Started

Pete Goldin is Editor and Publisher of APMdigest
