What Can AIOps Do For IT Ops? - Part 4
October 28, 2021

APMdigest asked the top minds in the industry what they think AIOps can do for IT Operations. Part 4 covers root cause analysis and automation.

Start with What Can AIOps Do For IT Ops? - Part 1

Start with What Can AIOps Do For IT Ops? - Part 2

Start with What Can AIOps Do For IT Ops? - Part 3

SINGLE PANE OF GLASS

AIOps provides a much-needed real-time "single-pane-of-glass" view into complex IT infrastructures that encompass fragmented and distributed multi-vendor, multi-domain technologies including legacy, virtualization, hybrid cloud, containers, microservices, and others. Although AIOps is a seismic change for IT operations, it's not a radical application of analytics and machine learning. The potential of AIOps is enormous. Enterprises that have deployed AIOps solutions are experiencing transformational benefits in revenue growth, better customer retention, improved customer experience, lower costs, and enhanced performance. The time to move is now.
Maruti Sivakumar V
SVP, Head of Digital & Practices, Blue.cloud

ISOLATING THE ROOT CAUSE

AIOps helps build high-quality incidents that include all the necessary technical and business context, alongside AI/ML-identified probable root cause and root cause changes — and present it all within a single pane of glass.
Mohan Kompella, VP Product Marketing,
Adam Blau, Director of Product Marketing,
Anirban Chatterjee, Director of Product Marketing, BigPanda

AIOps is a buzzword covering 6 different types of products designed to create value for IT Operations professionals. Always pick specific use cases you wish to solve and then understand how machine learning and AI can apply to solve that issue or set of issues. Good examples of this are helping the user isolate the root cause down to a specific component, highlighting outliers in graphs and other views, and correlating likely related data types. Generally, these technologies help augment the operator of the software versus being automation magic. Most often these are features in other Observability tools versus AIOps platforms. AIOps platforms are fantasy because the semantic meaning of data is not clear. The result is that vendors write rules to analyze the data, so the resulting outcomes only work in specific situations, which makes them useless when a major problem happens across a set of complex systems.
Jonah Kowall
CTO, Logz.io
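
As a rough illustration of the "highlight outliers" use case mentioned above, the sketch below flags unusual points in a metric series with a modified z-score based on the median absolute deviation (MAD). The technique, the function name `mad_outliers`, the 3.5 cutoff, and the sample CPU readings are all assumptions chosen for illustration; the article does not say which method any particular tool uses.

```python
from statistics import median


def mad_outliers(values, cutoff=3.5):
    """Return indexes of points whose modified z-score exceeds `cutoff`.

    Hypothetical helper: uses the median absolute deviation (MAD), which is
    less sensitive to the outliers themselves than a mean/stddev rule.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # At least half the points equal the median; nothing to scale by.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > cutoff]


# Made-up CPU utilisation samples (percent); index 6 is the spike.
cpu_percent = [41, 43, 42, 40, 44, 42, 95, 41]
outlier_indexes = mad_outliers(cpu_percent)
print(outlier_indexes)
```

A check like this only points the operator at a suspicious data point; deciding what the spike means still needs human context, which matches the "augment the operator" framing in the quote.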

AUTOMATED ROOT CAUSE ANALYSIS

Response automation is one of the most value-driving features of AIOps software tools. IT operators are able to conduct performance tests to establish a baseline for each metric or KPI and define acceptable thresholds for the ones they want to prioritize. When a KPI breach is detected, AIOps software can perform an automated root cause analysis to automatically determine why a problem occurred and implement a solution if one is available.
Abel Gonzalez
Director of Product Marketing, Sumo Logic
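
The baseline-and-threshold step described above can be sketched in a few lines. This is a minimal example under stated assumptions: the threshold rule (mean plus three standard deviations), the function names `build_baseline` and `find_breaches`, and the latency numbers are hypothetical and not taken from any vendor's product. The root cause analysis and remediation steps are out of scope here.

```python
from statistics import mean, stdev


def build_baseline(samples, k=3.0):
    """Return (baseline mean, upper threshold) from performance-test samples.

    Assumption: the acceptable threshold is the mean plus k standard
    deviations; a real deployment might pick thresholds per KPI by hand.
    """
    mu = mean(samples)
    sigma = stdev(samples)
    return mu, mu + k * sigma


def find_breaches(values, threshold):
    """Return (index, value) pairs for readings above the threshold."""
    return [(i, v) for i, v in enumerate(values) if v > threshold]


# Made-up latency readings (ms) from a performance test run.
baseline_latency_ms = [100, 102, 98, 101, 99, 100, 103, 97]
mu, threshold = build_baseline(baseline_latency_ms)

# Made-up live readings; the third one breaches the threshold.
live_latency_ms = [101, 99, 140, 100]
breaches = find_breaches(live_latency_ms, threshold)
print(mu, threshold, breaches)
```

In a fuller pipeline, each detected breach would be the trigger that hands off to whatever root cause analysis the platform provides.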

Machine learning and AI are not just critical — but foundational — components of a dynamic monitoring platform. Modern applications are constantly in flux, and microservices scale through ephemeral cloud and container infrastructure in response to demand. As these systems become more complex and dynamic, operational tasks consume an increasing share of engineering time. AIOps optimizes and automates IT operations so that engineers can get proactively alerted no matter the size of the workloads, and benefit from an augmented troubleshooting experience by cutting through noise to glean key insights. In some cases, AI can auto-discover the root cause of an issue, saving minutes or hours of stressful investigations. This is the core advantage of effective AIOps — less engineering time wasted on managing complex operations, and more time building new products for customers.
Renaud Boutet
VP of Product, Datadog

BETTER DECISION-MAKING

From a monitoring and observability perspective, a key benefit of AIOps has been the ability to use historical data to increase confidence in decisions that we previously thought were black-and-white. It's relatively simple to have a machine check if a service is up or down, but how do we find the trends that show that whilst the website is up, it's gradually been getting slower over the past few months? Modern tooling allows us to collect enough data and process it fast enough — often in real-time — for the machines to be able to make better-informed decisions, faster. Such decisions could only be made by lengthy human inspection previously. It's a great example of modern tooling working in the background to make sure everything is okay, so we don't have to.
Matt Saunders
Head of DevOps, Adaptavist
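
To make the "up, but gradually getting slower" case concrete, here is a small sketch that fits a least-squares line to historical response times and reports the slope. The function names `trend_slope` and `is_degrading`, the 1 ms per week tolerance, and the weekly figures are illustrative assumptions; production tooling would typically account for seasonality and noise as well.

```python
def trend_slope(values):
    """Least-squares slope of `values` against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to estimate a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def is_degrading(values, min_slope=1.0):
    """True if the metric grows faster than `min_slope` units per sample."""
    return trend_slope(values) > min_slope


# Made-up weekly 95th-percentile response times (ms). Every reading would
# pass a simple up/down check, yet the series climbs steadily.
weekly_p95_ms = [200, 205, 210, 215, 220, 225]
slope = trend_slope(weekly_p95_ms)
print(slope, is_degrading(weekly_p95_ms))
```

The point of the sketch is that a binary health check sees six healthy weeks, while the slope exposes a steady 5 ms per week regression.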

AIOps observability can play a critical role in surfacing expected trends from user, system and process data, and in providing that data back to decision-makers so they can make investment calls based on those patterns and trends. With growing cloud demand, it is imperative that enterprises start investing in AIOps before it is too late.
Vishnu Vasudevan
Head of Product Engineering and Management, Opsera

SYNCING WITH ITSM

Create automated, bi-directional syncing with your ITSM platform, on-call or other collaboration tools, and reduce ticket/notification volumes by up to 95%.
Mohan Kompella, VP Product Marketing,
Adam Blau, Director of Product Marketing,
Anirban Chatterjee, Director of Product Marketing, BigPanda

First-generation AIOps solutions are a step in the right direction for addressing unending IT complexity, but they needed more care and feeding and solved only a limited set of problems for ITOps teams. Looking ahead, new-age AIOps platforms are poised to make AIOps faster, better and cheaper by automating data preparation and integrations, by having native asset/topology intelligence, and by using expanded AI/ML frameworks like neural networks, NLP, transformer models and graph databases to address a lot more use cases. This paves a path where everybody in IT benefits: ITSM, Service Desk, IT Asset/Planning and more.
Tejo Prayaga
Product Management, CloudFabrix

UNDERSTANDING ALGORITHMS

The last several years have seen a dramatic increase in the use of AI across all types of companies and platforms. These complex solutions require more parts of an organization to be knowledgeable of AI, from data pipelines to the workflows that build, qualify and optimize the models. Having a specialized Ops function that understands this end-to-end is going to be critical for maximizing AI's effectiveness in a production environment. Over time, AIOps can build a deeper understanding of the algorithms, then use that knowledge to enhance the infrastructure with automated services around data cleaning, model tuning and scaling that will continue delivering key results for the business. This kind of specialty is beyond what a traditional IT Operations team can do with the breadth that they are normally expected to maintain.
David Luks
VP of Engineering, Smart Applications, Lucidworks

AUTOMATION

AIOps delivers significant value to businesses by automating many of the manual, tedious tasks that distract IT from working on higher level projects, especially when it comes to data prep.
David P. Mariani
CTO and Founder, AtScale

As the cadence of business continues to gain momentum and competition builds, organizations must not only innovate but also identify business problems and inefficiencies and utilize technology to overcome them. AIOps acts as the salve for many enterprise challenges by anchoring a triangulation of machine learning, decision automation and advanced analytics to automate repetitive tasks, freeing IT teams to work on new mission critical and challenging problems — resulting in faster completion of projects and improved business outcomes.
Alan Young
CPO, InRule

REMEDIAL OPTIMIZATION

IT Operations cannot keep up with the requirements of keeping cloud applications functional and running their best. IT Ops needs to utilize the power of AI to keep the many combinations of app parameters and metrics in an optimal state. Moreover, for AIOps to keep operational apps optimized, it needs to be continuous (always on) and autonomous (no human intervention). This way AIOps can perform the remedial optimization work the IT Ops SREs would do, but much faster and with more accuracy.
Peter Nickolov
Co-Founder and VP of Engineering, Opsani

Go to What Can AIOps Do For IT Ops? - Part 5
