What Can AIOps Do For IT Ops? - Part 6

APMdigest asked the top minds in the industry what they think AIOps can do for IT Operations. Part 6 is the final installment in the series.

Start with What Can AIOps Do For IT Ops? - Part 1

Start with What Can AIOps Do For IT Ops? - Part 2

Start with What Can AIOps Do For IT Ops? - Part 3

Start with What Can AIOps Do For IT Ops? - Part 4

Start with What Can AIOps Do For IT Ops? - Part 5

SCALABILITY

AIOps advantages can be summed up in one word: scalability. A main advantage of AIOps within DevOps teams is the ability to scale a business with new technology, without having to scale the operations of new services in kind. AIOps allows DevOps teams to focus on innovating and improving the customer experience, the driving force of profitability, rather than on the constant pressure of monitoring and operating these services. Forward-thinking DevOps teams need to be looking at AIOps and machine learning as mission-critical to delivering higher availability of services.
Sean McDermott
CEO, Windward Consulting Group

Over the years, there's been a change in the ratio of people managing computers to the number of computers. In the 60s and 70s, there were many operators per machine. With the cloud, one admin manages thousands, possibly hundreds of thousands, of computers. The only way that's been managed has been through improvements in tooling. AIOps is the latest improvement in tooling and enables IT staff to work effectively with huge clusters that dynamically change. No human could possibly watch all the log files looking for anomalies and no simple set of Perl or Python scripts could automate that process. The only way to do this is to use AI to analyze the data being thrown off by huge clusters of computing resources, look for anomalies, and if possible, correct problems without requiring human involvement. For example, AI could detect signatures of failing devices, like disk drives, then move the data from the failing drive to a spare and notify a human to swap in a replacement. An AI system coupled with load balancing hardware could also make predictions about what your traffic will be and allocate resources accordingly. This is especially valuable in the cloud, where admins can allocate and release computing power as needed.
Mike Loukides
VP of Emerging Tech Content, O'Reilly Media
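
The anomaly-hunting idea described above can be illustrated with a very small sketch. It is not how any particular AIOps product works; it assumes a hypothetical series of per-minute error counts already extracted from logs, and it uses a plain trailing-window z-score in place of the machine learning a real system would apply. The function name, window size, and threshold are all illustrative choices.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value is far from the trailing window's mean.

    A point is flagged when it lies more than `threshold` sample standard
    deviations away from the mean of the previous `window` points.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical error counts per minute, with one obvious spike.
errors_per_minute = [4, 5, 3, 4, 5, 4, 6, 5, 48, 5, 4]
print(find_anomalies(errors_per_minute))  # → [8]
```

A fixed statistical rule like this is the kind of simple script the quote says cannot keep up with large, dynamically changing clusters. It is shown only to make concrete what "looking for anomalies" means.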

OPTIMIZING VALUE STREAMS

AIOps allows IT Operations to focus more on optimizing value streams.
Muraleedharan Vijayakumar
Senior Technical Manager, GAVS Technologies

The conversation on domain-agnostic versus domain-specific does not really matter. In the past, domain-agnostic AIOps tools relied heavily on integrations with many different sources to collect data. Domain-centric AIOps tools typically collect most of the required data themselves and can sometimes be more specific to particular domains, such as log management, or to specific application areas, such as ERP. What this means: I believe Artificial Intelligence will and should be used across many domains, and the current task for IT enterprises is to determine where they want to leverage AI capabilities to gain insights and reduce waste and toil. When analyzing the vendors in this space, I found that some tout their AI capabilities specifically for IT operations, while others have added, and are still adding, data analytics and intelligent integrations to support evolving operating models. I think the next normal will require leveraging AI across the value streams to successfully execute and deliver quality digital services and applications to customers.
Eveline Oehrlich
Chief Research Officer, DevOps Institute

ENABLING SMALLER TEAMS TO BE MORE EFFECTIVE

AIOps enables a small traditional IT Ops team to be much more effective and expand its reach. It can cover a much wider remit, including more systems to deploy, more geographies, and more variants (to support A/B testing).
Gareth Smith
GM of Eggplant, part of Keysight Technologies

DRIFT TRACKING

Drift tracking from inception to current production state has been a goal in the enterprise for decades. AIOps can give operations a view into drift: how the environment has changed over time from what was initially deployed. Understanding drift is critical to reducing tech debt, incidents, and problems from client to cloud.
Jeanne Morain
Author, Strategist and Transformation Pioneer, iSpeak Cloud
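
At its simplest, drift is the difference between what was deployed and what is running now. The sketch below is only an illustration under stated assumptions: it presumes that both states have already been captured as flat key-value settings, and the function name and the sample settings are hypothetical, not taken from any tool.

```python
def config_drift(baseline, current):
    """Compare two flat configuration dicts and report what drifted."""
    baseline_keys = baseline.keys()
    current_keys = current.keys()
    return {
        # Settings present now that were not in the original deployment.
        "added": {k: current[k] for k in current_keys - baseline_keys},
        # Settings from the original deployment that have disappeared.
        "removed": {k: baseline[k] for k in baseline_keys - current_keys},
        # Settings present in both whose values differ: (old, new).
        "changed": {
            k: (baseline[k], current[k])
            for k in baseline_keys & current_keys
            if baseline[k] != current[k]
        },
    }


# Hypothetical snapshots of one service at deploy time and today.
deployed = {"image": "web:1.0", "replicas": 2, "port": 8080}
running = {"image": "web:1.2", "replicas": 2, "debug": True}
print(config_drift(deployed, running))
```

In practice the harder part is collecting and normalizing these snapshots across many systems over time, which is where the quote sees a role for AIOps.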

DE-RISK ROLLOUT OF NEW INITIATIVES

AIOps can be the "extra pair of hands" that helps identify problems and issues before they happen, drawing on complex and varied data sets that would be difficult for a human to comprehend. This helps de-risk the rollout of new initiatives, as issues are quickly identified and, if necessary, remediated or rolled back, all more quickly than a human can react.
Gareth Smith
GM of Eggplant, part of Keysight Technologies

FOCUS ON MORE STRATEGIC INITIATIVES

AIOps can also classify common issues, allowing the Ops team to focus its time and effort on more strategic initiatives for greater efficiencies and benefits.
Gareth Smith
GM of Eggplant, part of Keysight Technologies

LABOR SAVINGS

With AIOps, when you multiply that reduction in fruitless labor cost by the number of applications and infrastructure assets that could generate alerts, and then by the number of times groups in DevOps and IT were handing off issues to each other, significant labor savings are at stake, as well as a higher rate of employee retention.
Jason English
Principal Analyst, Intellyx
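
The multiplication described above can be written out as a back-of-the-envelope calculation. This is one possible reading of the quote, not a formula from its author: the function and parameter names are hypothetical, and the inputs in the example are arbitrary placeholders rather than measured figures.

```python
def estimated_labor_savings(cost_saved_per_handoff, alerting_assets, handoffs_per_asset):
    """Multiply the per-handoff saving by assets and by handoffs per asset."""
    return cost_saved_per_handoff * alerting_assets * handoffs_per_asset


# Placeholder inputs only: 10 currency units saved per avoided handoff,
# 200 alert-generating assets, 3 handoffs per asset in the period.
print(estimated_labor_savings(10, 200, 3))  # → 6000
```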

COST EFFICIENCY

AIOps can allow IT organizations to operate efficiently and provide more reliable, scalable infrastructure for their users. With the vast amount of data available today, AIOps allows IT organizations to easily understand things like resource constraints and traffic patterns, and to automate and scale infrastructure more efficiently. These are things that would take a human a lot of time to automate.
Saro Subbiah
VP of Engineering and Technology for Monitor & Platform, Sysdig
