A Guide to OpenTelemetry - Part 7: OTel and AIOps
October 26, 2022

Pete Goldin

Just as questions arise about how Application Performance Management (APM) and OpenTelemetry affect each other, questions also arise about the relationship between AIOps and OpenTelemetry.

Start with: A Guide to OpenTelemetry — Part 1

Start with: A Guide to OpenTelemetry — Part 2: When Will OTel Be Ready?

Start with: A Guide to OpenTelemetry — Part 3: The Advantages

Start with: A Guide to OpenTelemetry — Part 4: The Results

Start with: A Guide to OpenTelemetry — Part 5: The Challenges

Start with: A Guide to OpenTelemetry — Part 6: OTel and APM

OpenTelemetry Supports AIOps

As with APM, discussed in the previous blog, OpenTelemetry can also support AIOps.

"OpenTelemetry is a data source to AIOps tools," says Jonah Kowall, CTO of Logz.io. "It can also normalize and correlate signals to one another, making it more useful to AIOps solutions which attempt to correlate that data."

Torsten Volk, Managing Research Director, Containers, DevOps, Machine Learning and Artificial Intelligence, at Enterprise Management Associates (EMA), agrees: "OpenTelemetry is critical to enable AIOps to ingest telemetry data from distributed cloud native applications that are often ephemeral, highly scalable, and can easily move between clouds."

Mike Loukides, VP of Emerging Tech Content at O'Reilly Media, clarifies that whether or not you are using AI, if you're automating anything, your automation systems will need standard data formats. "If your web server, your database, and a few hundred microservices are all sending data that's structured differently, you have a problem. That doesn't mean that you can't write an automated system, but it does mean that you're going to spend most of your time dealing with the different data formats rather than writing code to automate your systems. Standardizing on OpenTelemetry solves this problem: you have a single way to send data, and a single set of libraries to receive it."
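To make Loukides's point concrete, here is a minimal Python sketch of the per-source adapter code an automation team ends up writing when every system reports differently. The record shapes, field names, and helper functions are all hypothetical and are not OpenTelemetry's actual data model or API. The automation logic itself (count_errors) is one line; everything else is format handling, which is the work a shared standard is meant to remove.

```python
from datetime import datetime, timezone

# Hypothetical raw records: each source uses its own field names and time format.
web_record = {"ts": "2022-10-26T12:00:00+00:00", "svc": "web",
              "status": 500, "msg": "upstream timeout"}
db_record = {"time_epoch": 1666785600, "database": "orders-db",
             "level": "ERROR", "text": "lock wait exceeded"}


def normalize_web(rec):
    """Adapter for the hypothetical web server format."""
    return {
        "timestamp": datetime.fromisoformat(rec["ts"]),
        "service": rec["svc"],
        "severity": "ERROR" if rec["status"] >= 500 else "INFO",
        "body": rec["msg"],
    }


def normalize_db(rec):
    """Adapter for the hypothetical database format."""
    return {
        "timestamp": datetime.fromtimestamp(rec["time_epoch"], tz=timezone.utc),
        "service": rec["database"],
        "severity": rec["level"],
        "body": rec["text"],
    }


def count_errors(events):
    """The actual automation logic, which only works on the common shape."""
    return sum(1 for event in events if event["severity"] == "ERROR")


events = [normalize_web(web_record), normalize_db(db_record)]
print(count_errors(events))
```

With a few hundred microservices, the number of adapters grows with the number of formats. If every source emits one agreed shape, the adapters disappear and only the automation logic remains.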

Contextual Information is Key

OpenTelemetry's appeal in the AIOps use case comes back to the breadth of coverage and the value of the data.

"OpenTelemetry is an enabler of AIOps," says Sajai Krishnan, General Manager, Observability, Elastic. "We all know that ML/AI algorithms LOVE data, but it is not the volume of data that matters. What matters is the relevance of the data and the context shared across traces, metrics, and logs."

Because all telemetry signals are generated using the same source/agent, built-in contextual information is carried across telemetry signals right from the source, notes Nitin Navare, CTO of LogicMonitor, adding, "Thus, OpenTelemetry will complement AIOps in the long run as AI backends will have more contextual information to learn about underlying IT assets."

Daniel Khan, Director of Product Management (Telemetry) at Sentry, adds: "AIOps relies on high-fidelity, contextual data, hence OpenTelemetry can improve the quality of insights provided by AIOps."

OpenTelemetry provides a framework for engineering teams to correlate their observability data between infrastructure and application and also between logs, metrics, and traces, according to Marc Chipouras, Grafana Labs Senior Director, Engineering. "This linked structure allows our AIOps teams to analyze all the data generated from production systems together rather than independently. The connected datasets change the problem set, allowing AIOps tools to understand the whole system rather than subsets of services or workflows."
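The linked structure Chipouras describes can be illustrated with a small Python sketch. It assumes spans and log records both carry a shared trace identifier, which is the kind of link OpenTelemetry's trace context is designed to provide. The sample records, field names, and functions below are hypothetical and simplified; they are not an OpenTelemetry API.

```python
from collections import defaultdict

# Hypothetical, simplified telemetry records that share a trace identifier.
spans = [
    {"trace_id": "t1", "span_id": "s1", "service": "checkout", "duration_ms": 840},
    {"trace_id": "t2", "span_id": "s2", "service": "checkout", "duration_ms": 35},
]
logs = [
    {"trace_id": "t1", "span_id": "s1", "severity": "ERROR",
     "body": "payment gateway timeout"},
    {"trace_id": "t2", "span_id": "s2", "severity": "INFO", "body": "order placed"},
    {"trace_id": None, "span_id": None, "severity": "WARN",
     "body": "disk usage at 81%"},
]


def correlate(spans, logs):
    """Group spans and logs that belong to the same trace."""
    by_trace = defaultdict(lambda: {"spans": [], "logs": []})
    for span in spans:
        by_trace[span["trace_id"]]["spans"].append(span)
    for log in logs:
        # Logs without trace context cannot be linked and are skipped here.
        if log["trace_id"] is not None:
            by_trace[log["trace_id"]]["logs"].append(log)
    return dict(by_trace)


def slow_traces_with_errors(correlated, threshold_ms):
    """Return trace IDs that are both slow and have an ERROR log attached."""
    result = []
    for trace_id, group in correlated.items():
        slow = any(s["duration_ms"] > threshold_ms for s in group["spans"])
        errored = any(entry["severity"] == "ERROR" for entry in group["logs"])
        if slow and errored:
            result.append(trace_id)
    return result


print(slow_traces_with_errors(correlate(spans, logs), threshold_ms=500))
```

Analyzed separately, the slow span and the error log are two unrelated observations. Joined on the shared identifier, they become a single finding about one request, which is what lets a tool reason about the whole system rather than one signal at a time.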

OpenTelemetry also provides a way to collect hard-to-reach performance data. For example, the OpenTelemetry Collector can be used for aggregating and processing data on the edge, making the collector an intelligent part of the AIOps toolset, says Marcin "Perk" Stożek, Software Engineering Manager of Open Source Collection, Sumo Logic.
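As a rough illustration of aggregating at the edge, the Python sketch below condenses raw latency measurements into per-service, per-window summaries before anything is sent on. It is not how the OpenTelemetry Collector is implemented or configured (the Collector is set up through pipelines of receivers, processors, and exporters); the aggregate function, the 60-second window, and the field names are assumptions made for the example.

```python
from collections import defaultdict


def aggregate(points, window_s=60):
    """Summarize raw latency points per (service, time window)."""
    buckets = defaultdict(lambda: {"count": 0, "sum_ms": 0.0, "max_ms": 0.0})
    for point in points:
        # Align each timestamp (in seconds) to the start of its window.
        window_start = point["ts"] - (point["ts"] % window_s)
        bucket = buckets[(point["service"], window_start)]
        bucket["count"] += 1
        bucket["sum_ms"] += point["latency_ms"]
        bucket["max_ms"] = max(bucket["max_ms"], point["latency_ms"])
    return dict(buckets)


# Hypothetical raw measurements collected at the edge.
raw_points = [
    {"service": "web", "ts": 0, "latency_ms": 10.0},
    {"service": "web", "ts": 59, "latency_ms": 30.0},
    {"service": "web", "ts": 60, "latency_ms": 5.0},
]

for key, summary in aggregate(raw_points).items():
    print(key, summary)
```

Three raw points become two summaries here. At production volumes the reduction is far larger, which means less data to ship and less noise for a downstream AIOps backend to process.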

Delivering the Right Data

"By providing standard ways to pull in logs, metrics and trace data, OpenTelemetry ensures that ML algorithms have the right signals and rich contextual attributes to build accurate models and make accurate predictions about what is wrong inside your enterprise IT estate," says Krishnan from Elastic. "The correct data helps make better decisions and deliver remediation, especially if those decisions are automated."

"Imagine taking an automated action based on a false positive alert," he adds. "It could be a disaster for your business. Improving the accuracy of the machine learning models by using the correct consolidated and correlated data becomes critical to any action taken."

"An entire application ecosystem has emerged around OpenTelemetry," Krishnan concludes. "Kubernetes now has support for OpenTelemetry, for example, and this will continue to grow as more apps can use OpenTelemetry data. Imagine the possibilities for AIOps as automation tools start to plug into this data. For example, software-defined networks can start to make use of application telemetry data and traces from any source to re-route traffic or automatically improve bandwidth for specific applications delivering a great customer experience."

AIOps Challenges

Martin Thwaites, Developer Advocate at Honeycomb, agrees that OpenTelemetry can be configured with some AIOps solutions for automated responses to detected issues, but he warns not to overestimate the power of the combination: "It is important to note, however, that monitoring and observability can be complex and still require human intervention. For example, an AI model may detect slower runtimes on a website. This could be the result of heavy bot traffic, or maybe you are having a sale on your website that has led to a sharp spike in visitors. OpenTelemetry can be incredibly powerful, but users should be careful not to slip into a 'set it and forget it' approach."

Check back tomorrow for the final installment, A Guide to OpenTelemetry Part 8, offering expert recommendations on how to get started.

Go to: A Guide to OpenTelemetry — Part 8: Getting Started

Pete Goldin is Editor and Publisher of APMdigest