
What Can AIOps Do For IT Ops? - Part 6

APMdigest asked the top minds in the industry what they think AIOps can do for IT Operations. Part 6 is the final installment in the series.

Start with What Can AIOps Do For IT Ops? - Part 1

Start with What Can AIOps Do For IT Ops? - Part 2

Start with What Can AIOps Do For IT Ops? - Part 3

Start with What Can AIOps Do For IT Ops? - Part 4

Start with What Can AIOps Do For IT Ops? - Part 5

SCALABILITY

The advantages of AIOps can be summed up in one word: scalability. A main advantage of AIOps within DevOps teams is the ability to scale a business with new technology without having to scale the operations of new services in kind. AIOps allows DevOps teams to focus on innovating and improving the customer experience, the driving force of profitability, rather than on the constant pressure of monitoring and operating these services. Forward-thinking DevOps teams need to be looking at AIOps and machine learning as mission critical to delivering higher availability of services.
Sean McDermott
CEO, Windward Consulting Group

Over the years, there's been a change in the ratio of people managing computers to the number of computers. In the 60s and 70s, there were many operators per machine. With the cloud, one admin manages thousands, possibly hundreds of thousands, of computers. The only way that's been managed has been through improvements in tooling. AIOps is the latest improvement in tooling and enables IT staff to work effectively with huge clusters that dynamically change. No human could possibly watch all the log files looking for anomalies and no simple set of Perl or Python scripts could automate that process. The only way to do this is to use AI to analyze the data being thrown off by huge clusters of computing resources, look for anomalies, and if possible, correct problems without requiring human involvement. For example, AI could detect signatures of failing devices, like disk drives, then move the data from the failing drive to a spare and notify a human to swap in a replacement. An AI system coupled with load balancing hardware could also make predictions about what your traffic will be and allocate resources accordingly. This is especially valuable in the cloud, where admins can allocate and release computing power as needed.
Mike Loukides
VP of Emerging Tech Content, O'Reilly Media
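
To make the idea of looking for anomalies in operational data concrete, here is a minimal sketch in Python. It is a toy illustration only, far simpler than what production AIOps tooling does, and it is not taken from the article: the function name `find_anomalies`, the `window` and `threshold` parameters, and the sample error counts are all hypothetical. It flags any data point that deviates sharply from the trailing window of recent values.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indices of values that deviate from the mean of the
    preceding `window` values by more than `threshold` standard deviations.

    Hypothetical helper for illustration; real systems use richer models.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: flag any change at all.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up per-minute error counts parsed from a log file.
error_counts = [4, 5, 3, 4, 5, 4, 40, 5, 4, 3]
print(find_anomalies(error_counts))  # prints [6]
```

A fixed threshold like this one breaks down quickly on large, dynamically changing clusters, which is the gap the quote above argues AI-based analysis is meant to fill.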

OPTIMIZING VALUE STREAMS

AIOps allows IT Operations to focus more on creating value stream optimization.
Muraleedharan Vijayakumar
Senior Technical Manager, GAVS Technologies

The conversation on domain-agnostic versus domain-specific does not really matter. In the past, domain-agnostic AIOps tools relied heavily on integrations with many different sources to collect data. Domain-centric AIOps tools typically collect most of the required data themselves and can sometimes be more specific to special domains, such as log management, or to specific application topics, such as ERP. What this means: I believe Artificial Intelligence will and should be used across many domains, and the current task for IT enterprises is to determine where they want to leverage AI capabilities to gain insights and reduce waste and toil. When analyzing the vendors in this space, I found that some vendors tout their AI capabilities specifically for IT operations, while others have added, and are still adding, data analytics and intelligent integrations to support evolving operating models. I think the next normal will require leveraging AI across the value streams to successfully execute and deliver quality digital services and applications to customers.
Eveline Oehrlich
Chief Research Officer, DevOps Institute

ENABLING SMALLER TEAMS TO BE MORE EFFECTIVE

AIOps enables a small traditional IT Ops team to be much more effective and expand its reach. It can cover a much wider remit, including more systems to deploy, more geographies, and more variants (to support A/B testing).
Gareth Smith
GM of Eggplant, part of Keysight Technologies

DRIFT TRACKING

Drift tracking from inception to current production state has been a desired capability in the enterprise for decades. AIOps can provide operations with a view into the drift of changes, from what was initially deployed to how the environment has changed over time. Understanding drift is critical to reducing tech debt, incidents and problems from client to cloud.
Jeanne Morain
Author, Strategist and Transformation Pioneer, iSpeak Cloud
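
As a rough illustration of what a drift view could report, the Python sketch below compares a snapshot of what was initially deployed with the current state. It is only an assumed minimal example, not a description of any AIOps product: the function name `config_drift` and the sample configuration keys and values are hypothetical.

```python
def config_drift(baseline, current):
    """Compare two configuration snapshots (plain dicts) and report
    which keys were added, removed, or changed since the baseline.

    Hypothetical helper for illustration only.
    """
    drift = {"added": {}, "removed": {}, "changed": {}}
    for key in current.keys() - baseline.keys():
        drift["added"][key] = current[key]
    for key in baseline.keys() - current.keys():
        drift["removed"][key] = baseline[key]
    for key in baseline.keys() & current.keys():
        if baseline[key] != current[key]:
            # Record the (old, new) pair for each changed setting.
            drift["changed"][key] = (baseline[key], current[key])
    return drift


# Made-up snapshots: what was first deployed vs. what runs today.
deployed = {"nginx": "1.18", "replicas": 3, "tls": True}
running = {"nginx": "1.24", "replicas": 3, "debug": True}

report = config_drift(deployed, running)
print(report["added"])    # {'debug': True}
print(report["removed"])  # {'tls': True}
print(report["changed"])  # {'nginx': ('1.18', '1.24')}
```

Tracking such differences over time, rather than between just two snapshots, is what turns a simple diff into a drift history.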

DE-RISK ROLLOUT OF NEW INITIATIVES

AIOps can be the "extra pair of hands" that helps identify problems and issues before they happen, drawing on complex and varied data sets that would be difficult for a human to comprehend. This helps de-risk the rollout of new initiatives, as issues are quickly identified and, if necessary, remediated or rolled back, all more quickly than a human can react.
Gareth Smith
GM of Eggplant, part of Keysight Technologies

FOCUS ON MORE STRATEGIC INITIATIVES

AIOps can also classify common issues, allowing the Ops team to focus its time and effort on more strategic initiatives for greater efficiencies and benefits.
Gareth Smith
GM of Eggplant, part of Keysight Technologies

LABOR SAVINGS

With AIOps, when you multiply that reduction in fruitless labor cost by the number of applications and infrastructure assets that could generate alerts, and then by the number of times groups in DevOps and IT were handing off issues to each other, significant labor savings are at stake, as well as a higher rate of employee retention.
Jason English
Principal Analyst, Intellyx

COST EFFICIENCY

AIOps can allow IT organizations to operate efficiently and provide more reliable, scalable infrastructure for their users. With the vast amount of data available today, AIOps allows IT organizations to easily understand things like resource constraints and traffic patterns, and to automate and scale infrastructure more efficiently. These are things that would take a human a lot of time to automate.
Saro Subbiah
VP of Engineering and Technology for Monitor & Platform, Sysdig

