Observability: The Next Frontier for AIOps
September 24, 2020

Will Cappelli
Moogsoft

Enterprise ITOM and ITSM teams have been welcoming of AIOps, believing that it has the potential to deliver great value to them as their IT environments become more distributed, hybrid and complex. Not so with DevOps teams.

It's safe to say they've kept AIOps at arm's length, because they don't think it's relevant or useful for what they do. Instead, to manage the software code they develop and deploy, they've focused on observability.

In concrete terms, this means that for your typical DevOps pros, if the app delivered to their production environment is observable, that's all they need. They're skeptical of what, if anything, AIOps can contribute in this scenario.

This blog will explain why AIOps can help DevOps teams manage their environments with unprecedented accuracy and velocity, and outline the benefits of combining AIOps with observability.


AIOps: Room to Grow its Adoption and Functionality

In truth, there isn't one universally effective set of metrics that works for every team to measure the value that AIOps delivers. This is an issue not just for AIOps but for many ITOM and ITSM technologies as well. In fact, many enterprise IT teams who invested in AIOps in recent years are now carefully watching their deployments to assess their value before deciding whether or not to expand on them.

Still, there's a lot of room for AIOps adoption to grow, because many enterprises haven't adopted it at all. That's why many vendors are trying to position themselves as AIOps players, to be part of a growing market. As a result, the AIOps market has become crowded.

So how can AIOps as a practice innovate and evolve at this point? Which innovations can deliver capabilities that set an AIOps offering apart from the crowd? Clearly, the way to do this is to tailor, expand and apply AI functionality to observability data. Such a solution would appeal strongly to the DevOps community, and dissolve its historical reluctance and skepticism towards AIOps.

But What is Observability?

However, there's an issue. When you press DevOps pros a little bit and ask them what observability is, you get three very different answers. The first is that observability is nothing more than traditional monitoring applied to a DevOps environment and toolset. This is flat-out wrong.

Another meaning you'll hear given to observability is its traditional one: That it's a property of the system being monitored. In other words, observability isn't about the technology doing the monitoring or the observing, but rather it's the self-descriptive data a system generates.

According to this definition, people monitoring these systems can obtain an accurate picture of the changes occurring in them and of their causal relationships. However, it's clear that this second view of observability is a dead end on its own. It's just a stream of raw data and nothing else.

A third definition is that, compared with traditional monitoring, observability is a fundamentally different way of looking at and getting data from the environment being managed. And it needs to be, because the DevOps world is one of continuous integration, continuous delivery and continuous change — a world that's highly componentized and dynamic.

The way traditional monitoring tools take data from an environment, filter it, and generate events isn't appropriate for DevOps. You need to observe changes that happen so quickly that trying to fit the data into any kind of pre-arranged structure just falls short. You won't be able to see what's going on in the environment.

Instead, DevOps teams need to access the raw data generated by their toolset and environment, and perform analytics directly on it. That raw data is made up of metrics, traces, logs and events. So observability is indeed a revolution, a drastic shift away from all the pre-built filters and the pre-packaged models of traditional monitoring systems.

This definition is the one that serves up a potential for technological innovation and for delivering the most value through AIOps, because DevOps teams do need help to make sense of this raw data stream, and act accordingly.

AI analysis and automation applied to observability can deliver this assistance to DevOps teams. Such an approach would take the raw data from the DevOps environment and give DevOps practitioners an understanding of the systems that they're developing and delivering.

With these insights, DevOps teams can more effectively decide on actions to fix problems, or to improve performance.

So what's involved in combining AIOps and observability?

Metrics, traces, logs and events must first be collected and analyzed. Metrics capture a temporal dimension of what's happening, through their time-series data. Traces map a path through a topology, so they provide a spatial dimension -- a trace is a chain of execution across different system components, usually microservices. Logs and events provide a record of unstructured events.
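To make the "spatial dimension" of a trace concrete, here is a minimal sketch of how service-to-service relationships might be derived from the spans of a single trace. The `Span` type, its field names and the example service names are all hypothetical, chosen for illustration only; real tracing systems carry far more detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """One step in a trace (hypothetical, simplified shape)."""
    span_id: str
    parent_id: Optional[str]  # None for the root span
    service: str


def service_edges(spans):
    """Return the set of (caller, callee) service pairs seen in a trace."""
    by_id = {s.span_id: s for s in spans}
    edges = set()
    for span in spans:
        parent = by_id.get(span.parent_id) if span.parent_id else None
        # Only record hops that cross a service boundary.
        if parent is not None and parent.service != span.service:
            edges.add((parent.service, span.service))
    return edges


# Example trace: one request fanning out across made-up microservices.
trace = [
    Span("a", None, "gateway"),
    Span("b", "a", "checkout"),
    Span("c", "b", "payments"),
    Span("d", "b", "inventory"),
]
edges = service_edges(trace)
```

Accumulating these edges over many traces gives a working picture of the topology without anyone having to draw it by hand.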

With AIOps analysis, metrics reveal anomalies, traces show topology-based microservice relationships, and unstructured logs and events provide the foundation for triggering a significant alert.
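As an illustration of how a metric can "reveal" an anomaly, the sketch below flags points that sit far from the mean of a short trailing window. This rolling z-score is a deliberately simple stand-in, not a description of any particular AIOps product; the function name, window size, threshold and sample latency values are assumptions.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indexes whose value deviates strongly from the trailing window."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all is unusual.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up response times in milliseconds, with one obvious spike.
latency_ms = [100, 102, 99, 101, 100, 103, 98, 250, 101, 100]
spikes = find_anomalies(latency_ms)
```

In this sample only the spike at index 7 is flagged. Production systems typically use more robust techniques, since a single outlier inflates the window's standard deviation and can mask whatever follows it.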

Machine learning algorithms would then come into play to indicate an uncommon occurrence, pinpoint unusual metrics, traces, logs and events, and correlate them using temporal, spatial and textual criteria. The next step in the process would be the identification of a probable root cause of the problem, based on the history of previously resolved incidents. Then, ideally, automated remedial actions would be carried out.
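The correlation step can be sketched in a few lines. The example below groups alerts that are close in time and also either close in topology or similar in wording, mirroring the temporal, spatial and textual criteria described above. Everything here is an assumption for illustration: the `Alert` shape, the thresholds, the greedy grouping strategy and the use of Python's `difflib` for text similarity in place of a trained model.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Alert:
    timestamp: float  # seconds
    service: str
    message: str


def related(a, b, edges, max_gap=60.0, min_text=0.6):
    """Temporal AND (spatial OR textual) closeness between two alerts."""
    close_in_time = abs(a.timestamp - b.timestamp) <= max_gap
    close_in_space = (
        a.service == b.service
        or (a.service, b.service) in edges
        or (b.service, a.service) in edges
    )
    similar_text = SequenceMatcher(None, a.message, b.message).ratio() >= min_text
    return close_in_time and (close_in_space or similar_text)


def correlate(alerts, edges):
    """Greedily place each alert in the first group containing a related alert."""
    groups = []
    for alert in sorted(alerts, key=lambda x: x.timestamp):
        for group in groups:
            if any(related(alert, member, edges) for member in group):
                group.append(alert)
                break
        else:
            groups.append([alert])
    return groups


# Hypothetical topology and alerts.
edges = {("checkout", "payments")}
alerts = [
    Alert(0.0, "checkout", "latency above threshold"),
    Alert(20.0, "payments", "connection pool exhausted"),
    Alert(500.0, "search", "index rebuild started"),
]
groups = correlate(alerts, edges)
```

Here the checkout and payments alerts land in one group because they occur within a minute of each other on adjacent services, while the much later search alert stands alone. Each resulting group is a candidate incident that could then be compared against previously resolved incidents to suggest a probable root cause.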

Clearly, this combination of AIOps and observability would offer tremendous value to DevOps teams, as it would automate the detection, diagnosis and remediation of problems with the speed and accuracy required in their CI/CD environments. This would represent a breakthrough for AIOps: earning the appreciation of reluctant DevOps teams by giving them deep insights into observability data, and unparalleled visibility into their environments.

Will Cappelli is Field CTO at Moogsoft