AI in IT Ops is No Longer Science Fiction
March 13, 2019

Mohan Kompella


If you’ve been working in the high-stress world of IT Ops and NOCs, even if it’s just for a year or two, you just know that some vendors’ claims are too good to be true, such as:

■ A 95% reduction in the amount of IT noise

■ A 99% reduction in ticket volume

■ A 99% L1 resolution rate

Now what if I said that those things are not only possible, but that some of the largest, most complex enterprises in the world see these metrics in their environments every day, thanks to Artificial Intelligence (AI) and Machine Learning (ML)? Would you dismiss that as belonging to the realm of science fiction? If you do, hold that thought while we dive into the world of IT Ops for a few minutes.

IT Ops and NOC Teams Are in a World of Pain

It’s a fact that today’s highly complex IT stack, coupled with legacy IT Ops tools built for a bygone era, is creating a perfect storm for IT Ops and NOC teams. These teams are overwhelmed by IT noise and can’t reliably detect, investigate and resolve incidents and outages.

What’s the result? Painful and prolonged outages; unhappy customers and angry service owners; and burnt-out IT Ops and NOC teams that are constantly putting out fire after fire.

The consequences on the business side are equally bad: escalating operating costs and profitability issues; poor performance and availability leading to customer satisfaction and retention problems; and finally the risk to critical digital initiatives and projects.

AI and ML Are No Longer Science Fiction … They Can Help

Given these problems, AI and ML can make a dramatic difference.


Consider, for a moment, the IT Ops problem as a data problem: There’s an overwhelming amount of data that needs to be processed in real-time, there are a handful of critical insights deep inside those very large, noisy datasets, and enterprises aren’t able to hire enough IT Ops and NOC personnel to scalably and cost-effectively deal with this problem.

Now, consider what makes AI and ML very good at tackling the “data problem”:

1. AI and ML are very good at parsing very large datasets in real-time.

2. AI and ML are very good at extracting insights from these large, noisy datasets.

3. AI and ML are very good at learning from the data that they process so that they become better at extracting insights from future datasets.

It’s easy to see why AI and ML are a godsend for the world of IT Ops.
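To make the "data problem" tangible, here is a minimal sketch of the kind of noise reduction discussed above. It is not machine learning and not any vendor's algorithm: it is a simple rule-based stand-in that groups alerts from the same host when they arrive close together in time. The `Alert` fields, the 300-second window and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    host: str        # hypothetical field: where the alert came from
    check: str       # hypothetical field: what fired (cpu, disk, ping, ...)
    timestamp: int   # seconds since an arbitrary reference point


def group_alerts(alerts, window_seconds=300):
    """Collapse alerts from the same host that arrive close together.

    An alert joins its host's latest group if it arrives within
    `window_seconds` of that group's most recent alert; otherwise it
    starts a new group.
    """
    groups = []           # each group is a list of Alert objects
    latest_by_host = {}   # host -> index of that host's latest group
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        idx = latest_by_host.get(alert.host)
        if idx is not None and alert.timestamp - groups[idx][-1].timestamp <= window_seconds:
            groups[idx].append(alert)
        else:
            groups.append([alert])
            latest_by_host[alert.host] = len(groups) - 1
    return groups


def noise_reduction(alert_count, group_count):
    """Fraction of raw alerts that no longer need individual attention."""
    if alert_count == 0:
        return 0.0
    return 1 - group_count / alert_count


# Made-up sample data: five raw alerts across two hosts.
alerts = [
    Alert("db-1", "cpu", 0),
    Alert("web-1", "cpu", 30),
    Alert("db-1", "disk", 60),
    Alert("db-1", "ping", 120),
    Alert("db-1", "cpu", 1000),
]
groups = group_alerts(alerts)
reduction = noise_reduction(len(alerts), len(groups))
```

With this sample data, the five alerts collapse into three groups. A real ML-based tool would learn which alerts belong together rather than rely on a fixed window, but the goal is the same: fewer, more meaningful items for the team to look at.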

In more concrete terms, today, AI and ML-powered IT Ops tools can take their place right alongside human IT Ops and NOC teams and augment their capabilities. Together, they can face their daily deluges of noisy data and emerge victorious!

But businesses are understandably risk-averse and want to see the ROI when they invest in new tools and technologies. They want to know what’s different this time around, especially because AI and ML have been around in some form or another since the 1960s.

This time around, it’s different because of a confluence of factors:

1. The cost of compute and storage has dropped dramatically in the last couple of years.

2. Computing power has rapidly increased.

3. It’s become easier to access and harness that power today.

Taken together, that means businesses, and consumers too, have access to AI and ML anywhere and anytime. In other words, AI and ML have been democratized.

So if you’ve been on the fence about adopting AI and ML in your IT Ops toolstack because of the hype and because of "too good to be true" metrics and results, it may be time to jump into the brave new world of AI and ML-powered IT Ops tools, and help your team experience some of those benefits for themselves.

A Note of Caution

But before you jump in, you, your team and your organization should keep three things in mind.

1. Time to value: Pragmatic IT and IT Ops teams understandably question the amount of time required to deploy AI and ML-based solutions. So the right tool should be deployable easily and quickly, without needing to be trained for months or years.

2. Building trust: Because machine learning has a “black box” reputation, enterprises in general, and IT Ops and NOC teams in particular, have a hard time trusting the decisions it makes.

“It is a problem that is already relevant, and it’s going to be much more relevant in the future,” says Tommi Jaakkola, a professor at MIT who works on applications of machine learning. “Whether it’s an investment decision, a medical decision, or maybe a military decision, you don’t want to just rely on a ‘black box’ method.”

That’s why it’s important that enterprises choose tools whose AI and ML is an "Open Box." In other words, the AI and ML used in these tools must create logic that can be understood by human teams; this logic must be editable by human teams so they can incorporate their hard-won, real-world business and tribal knowledge; and those teams must be able to test this logic before deploying it into production.
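As a rough illustration of those three "Open Box" properties, the sketch below keeps correlation logic as plain data that a person can read, change and replay against past alerts before it goes live. Everything here is hypothetical: `CorrelationRule`, `dry_run`, the tag names and the sample history are invented for the example and do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass
class CorrelationRule:
    """A correlation rule kept as plain, readable data (all fields hypothetical)."""
    name: str
    match_tags: tuple      # alerts sharing these tag values are grouped together
    window_seconds: int    # how close in time the alerts must be

    def describe(self):
        # Understandable: the rule can be stated in one sentence.
        tags = ", ".join(self.match_tags)
        return f"{self.name}: group alerts with the same {tags} within {self.window_seconds}s"

    def key(self, alert):
        return tuple(alert.get(tag) for tag in self.match_tags)


def dry_run(rule, historical_alerts):
    """Testable: replay past alerts and count the incidents the rule would create."""
    incidents = 0
    last_seen = {}  # group key -> timestamp of the latest alert with that key
    for alert in sorted(historical_alerts, key=lambda a: a["timestamp"]):
        k = rule.key(alert)
        previous = last_seen.get(k)
        if previous is None or alert["timestamp"] - previous > rule.window_seconds:
            incidents += 1
        last_seen[k] = alert["timestamp"]
    return incidents


# Made-up alert history used only for the dry run.
history = [
    {"host": "db-1", "cluster": "payments", "timestamp": 0},
    {"host": "db-2", "cluster": "payments", "timestamp": 45},
    {"host": "web-1", "cluster": "storefront", "timestamp": 50},
    {"host": "db-1", "cluster": "payments", "timestamp": 90},
]

# A rule as the tool might suggest it.
suggested = CorrelationRule("same-host", ("host",), 300)
# Editable: an operator who knows the payments hosts fail together widens the rule.
edited = CorrelationRule("same-cluster", ("cluster",), 300)

suggested_incidents = dry_run(suggested, history)
edited_incidents = dry_run(edited, history)
```

On this sample history the suggested rule yields three incidents and the edited rule yields two, so the team can see the effect of its change before anything reaches production.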

3. Adoption and use: Just because your IT Ops tool is powered by AI and ML doesn’t mean that it should come with a steep learning curve. Au contraire, because your new tool uses technology that used to belong to the realm of science fiction, it should be as simple, non-intimidating and easy to learn and use as possible.

After all, at the end of the day, the more enthusiastically your IT Ops and NOC teams embrace your new tool, the more successful they will be, and the more successful your new investment will be.  

I hope this blog post was interesting and, more importantly, useful. I hope that you’re able to help your IT Ops team and your organization realize the very real and very transformational benefits of AI and ML in the near future.

Mohan Kompella is VP of Product Marketing at BigPanda