The term AIOps was first coined by Gartner in 2016, which means it has origins prior to that year, according to Thomas LaRock, Principal Developer Evangelist at Selector. Modern AIOps has been around for a decade, but the real origins of AIOps go back fifty years to the mathematics behind process control charts.
Today, what we consider AIOps is nothing more than the application of statistical analysis to newer technologies, LaRock continues. Brilliant in the simplicity of finding a signal through the noise, AIOps is a mature model and rapidly augments traditional legacy monitoring platforms.
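To make the control-chart idea concrete, here is a minimal sketch of a Shewhart-style check: compute a center line and three-sigma limits from a baseline, then flag new readings that fall outside them. This is an illustration chosen for this article, not a method LaRock describes; the latency figures, the function names, and the three-sigma threshold are all assumptions.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, center, upper) control limits for a baseline sample."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(samples, lower, upper):
    """Return (index, value) pairs for readings outside the control limits."""
    return [(i, x) for i, x in enumerate(samples) if x < lower or x > upper]


# Hypothetical response times in milliseconds, invented for illustration.
baseline_latency_ms = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]
new_latency_ms = [101, 99, 140, 100, 60]

lower, center, upper = control_limits(baseline_latency_ms)
flagged = out_of_control(new_latency_ms, lower, upper)

print(f"limits: {lower:.1f} .. {upper:.1f} (center {center})")
print("flagged:", flagged)
```

Readings inside the limits are treated as ordinary noise, while the two outliers are surfaced as signal. Production tools layer far more on top of this, but the underlying statistics are the same kind of calculation.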
Start with: Discovering AIOps - Part 1
Start with: Discovering AIOps - Part 3: The Users
Start with: Discovering AIOps - Part 4: Advantages
Start with: Discovering AIOps - Part 5: More Advantages
Start with: Discovering AIOps - Part 6: Challenges
In Part 7 of this blog series, the experts talk about the current state of AIOps technology.
Only the Beginning
APMdigest asked the experts: How mature is AIOps?
"I'd describe AIOps as pre-adolescent," Shamus McGillicuddy, VP of Research, Network Infrastructure and Operations, at Enterprise Management Associates (EMA), responds. "There are vendors doing cool things with anomaly detection, event correlation, and problem isolation, but most of the stuff I see is still immature."
"In many ways, AIOps is still in a nascent state," Heath Newburn, Distinguished Field Engineer at PagerDuty, agrees. "While core event management has been around for 30+ years, the ability to leverage AI/ML in ways to create context is rapidly evolving, and automation is definitely underutilized in most of the systems, but we are starting to see rapid growth. One can see that Generative AI can potentially have large impacts in building off of the power of AIOps in the coming years."
"While it's unquestionably early days for AIOps, the benefits are already pretty profound," says Asaf Yigal, CTO of Logz.io. "We're seeing performance improvements that can reduce MTTR from hours to minutes, and that stat alone is enough to be encouraged about the possibilities. But in its current state of maturity, this is for the ability of the system to provide a human user with the right information, context and/or capabilities to do their jobs smarter and faster."
Room for Growth
"Obviously, many of the components that make up AIOps have been around longer — machine learning, infrastructure-as-code, cloud computing, etc. — so in some respects (or if taken as the sum of its parts) AIOps could seem like a relatively mature technology set. But, when you look at how difficult it can actually be to employ AIOps, it's clear that it still needs to mature or gel quite a bit," says Camden Swita, Senior Product Manager at New Relic.
"Just because a business has access to big data, uses ML to some degree, and automates some of its IT functions does not mean it has adopted AIOps. Instead, it's the connective tissue between those key pieces that allows AIOps to be truly transformative, and almost magical," Swita continues.
"Every large APM or observability solution leverages some version of an AIOps feature," explains Phillip Carter, Principal Product Manager at Honeycomb. "A rule of thumb is that if an observability vendor has over 500M ARR, they have at least one AIOps feature. Companies operating at that large scale are more likely to have customers interested in utilizing an AIOps feature. This is often due to immature practices in generating and organizing telemetry at scale; therefore, they offer an AIOps feature to help. However, almost none of these companies hinge their business entirely on AIOps features; It's always presented as an add-on."
"Most platforms or solutions that sell on the basis of providing AIOps typically only provide parts — let's say they help the customer collect and organize relevant data sets and use ML to highlight issues, but don't have good integrations with cloud infrastructure solutions to automate functions," says Swita from New Relic. "This partial AIOps implementation is where most providers land today. That said, we're seeing many AIOps solutions make great strides toward completing the loop to provide a more complete AIOps implementation."
While much progress has been made in use cases such as event correlation, the true potential of AIOps has yet to be reached, Payal Kindiger, Senior Director of Product Marketing at Riverbed, cautions. Over the last 15 years, IT Ops tools evolved from basic monitoring to specialized performance measurement and data consolidation. IT teams faced excessive alerts due to tool overlap, leading to reactive firefighting. Early AIOps attempts varied from lightweight ML to rules-based approaches, such as those requiring an underlying CMDB/network topology. However, systems needing explicit rules lack true learning. Now, rule-based methods risk becoming outdated like Expert Systems. To be intelligent, an AIOps system must show self-learning, flexibility, and adaptability. Cognitive flexibility, causal inference-based action, and domain-specific learning are vital for true intelligence.
"While AIOps is mature and established as a concept, and its future is exciting, AIOps still has room to grow with implementation and adoption," Charles Burnham, Director, AIOps Engineering at LogicMonitor, concludes. "Companies are just starting to add AIOps within their tech stacks, and the next frontier is really understanding its value and what it can help organizations accomplish."
Several of the experts, including the analysts, point out recent noticeable AIOps progress.
"The AIOps technologies are maturing very quickly," Carlos Casanova, Principal Analyst at Forrester Research maintains. "The advances in just the last 18 months have been rather impressive. Leaders in both my AIOps Waves demonstrated great capabilities that enterprises can leverage on day one after the system is online."
"In our upcoming AIOps Radar, we should have a fresh look, but based on what we see now, AIOps has meaningfully progressed in terms of cloud adaptability, data access and deployment flexibility, and use cases for integrated automation," says Dennis Drogseth, VP at Enterprise Management Associates (EMA).
Scott Likens, Global AI and Innovation Technology Leader at PwC, adds that AIOps platforms have advanced in terms of integration with various IT tools and systems, and in their ability to handle large volumes of data in real time.
"I think we're seeing a lot of progress around using both supervised and unsupervised models to rapidly enrich the data we use to make decisions with far deeper and more nuanced intelligence. Integration of LLMs with primary IT systems such as observability and security management platforms has accelerated this process significantly in the last 12 months alone," observes Yigal from Logz.io.
AIOps is a rapidly evolving field that has been gaining traction over the past few years, says Bharani Kumar Kulasekaran, Product Manager at ManageEngine. While the concept has been around for some time, its maturity varies across organizations and industries. Many enterprises have embraced AIOps concepts and implemented initial solutions, but a universal, worldwide level of maturity is yet to be achieved. On the other hand, some organizations still struggle with the concept of AIOps and understanding the value and benefits it can provide to their IT operations. Despite these challenges, AIOps is advancing quickly, and as more organizations invest in its capabilities, its maturity is expected to increase steadily.
AIOps Today: Providing Historical Insight
AIOps can be found in most leading commercial APM (Application Performance Management) tools today and is proliferating to other types of observability, says Yigal from Logz.io. In addition to using AI to accelerate investigation and troubleshooting, we are also seeing growing interest in bringing AIOps capabilities to bear in other key areas.
One example is using AI to filter through the data sent to an observability platform to understand what data is really needed, and what data can likely be dropped. This is essentially using the supervised and unsupervised models to de-prioritize large blocks of data that are clearly not helping with signal-to-noise.
This has a twofold impact, both in targeting analysis on the issues that matter most, and on saving money per the amount of data that you need to maintain in the platform, Yigal continues. But that approach is still fairly nascent, even as we drive it forward as a more accepted product capability. However, the better AIOps systems become at recommending the right data for human analysts to focus on, the more valuable those systems become in helping to optimize our most valuable resource — the time and effort of available human experts.
Around data optimization and workflow prioritization (think "which alerts should I focus my time and energies on"), there's no doubt that most organizations are using some form of AI in their work, and this is the current state of AIOps, Yigal says.
I think it's pretty safe to say that all of it is based on historic insight at this point, however, Yigal concludes. We know this about LLMs — they are only as current as the data they see. What we have not seen as much yet is organizations using AI to make proactive decisions. In this sense it's still in the research and enrichment phase. But that's the future state. When AI is making the proactive decisions and making adjustments without the need for as much human intervention, then we've really arrived at AIOps.
The Trust Issue
APMdigest asked the experts if the issue of trust, faced by AI technologies in general, will ultimately hinder the expansion of AIOps.
"AI in the news has centered recently on generative AI, which is just recently becoming a part of AIOps options," answers Dennis Drogseth from EMA. "But in the broader AIOps context, trust remains a factor as well — especially when AIOps is combined effectively with automation, which requires process and sometimes even role changes across IT. EMA's research sees a broader requirement for organizational evolution away from siloed ways of working and thinking."
"There's a mindset shift that must happen within IT Operations if they are to realize the AI advantage," Burnham from LogicMonitor concurs. "Migrating to an AIOps platform is not a lift and shift; in order to realize efficiencies it requires process change and trust in the tooling."
"While I think trust in AI is an issue that is universal for all forms of AI, I think a better question is: Should AIOps face a trust issue similar to AI? And I think the answer is: No," Bill Lobig, VP Product Management of Automation at IBM, asserts. "AIOps tools are not magic, they are math and statistics applied to data to provide the user with the probability something will occur with some degree of confidence."
Lobig goes on to explain that AIOps doesn't face AI-related trust issues for a couple of reasons. First, because it's working with machine data, log data, and technical data, AIOps is not at risk of using biased data. Second, AIOps still operates on the "trust, but verify" method. Machines can alert the human to a potential problem and suggest a next best action, but it is still up to the human to decide whether, and how, to act.
Carlos Casanova from Forrester concludes, "A lack of understanding and fear of the unknown is holding some enterprises back but this seems to be changing fast as enterprises see the tremendous value they can reap from AIOps technology implementations. The lack of trust is mostly in the last step of the equation where you have to decide if you trust the tool/automation to execute on your behalf without a human 'pressing the button.' There aren't a lot of trust issues elsewhere in the string of capabilities. As an enterprise gains trust in the AI and the data science in the technology, they go all-in and see amazing gains."