Bringing Alert Management into the Present with Advanced Analytics
March 25, 2015

Kevin Conklin
Ipswitch

Smart cars that navigate themselves are on the horizon. Mobile apps have made communication, navigation and entertainment an integral part of our daily lives. Your insurance pricing may soon be affected by whether or not you wear a personal health monitoring device. Everywhere you turn, the very latest technologies are being leveraged to provide advanced services that were unimaginable even ten years ago. So why is it that the IT environments that provide these services are managed using an analytics technology designed for the 1970s?

The IT landscape has evolved significantly over the past few decades. IT management simply has not kept pace. IT operations teams are anxious that too many problems are reported first by end users. Support teams worry that too many people spend too much time troubleshooting. Over 70 percent of troubleshooting time is actually wasted following false hunches because alerts provide no value to the diagnostic process. Enterprises that are still reliant on yesterday’s management strategies will find it increasingly difficult to solve today’s operations and performance management challenges.

This is not just an issue of falling behind a technology curve. There is a real business impact: incident rates rise, potentially disastrous outages go undetected and staff waste valuable time. An increasing number of IT shops are anxiously searching for alternatives.

This is where advanced machine learning analytics can help.

Too often operations teams become engulfed by alerts, receiving tens of thousands a day without knowing which to deal with and when. That makes it quite possible that something important is ignored while time is wasted on something trivial. Through a powerful combination of machine learning and anomaly detection, advanced analytics can reduce the alarms to a prioritized set of those that have the largest impact on the environment. By learning which alerts are “normal”, these systems define an operable status quo. In essence, machine learning filters out the “background noise” of alerts that, based on their persistence, have no effect on normal operations. From there, statistical algorithms identify and rank “abnormal” outliers on a scale measuring severity (the size of a spike or drop), rarity (the number of previous instances) or impact (the quantity of related anomalies). The result is a reduction from hundreds of thousands of noisy alerts a week to a few dozen notifications of real problems.
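To make the idea concrete, here is a minimal sketch of the "learn what is normal, then rank the outliers" step. It assumes severity can be approximated by a z-score against a learned baseline of hourly alert counts; the alert names, the sample counts and the threshold of 3.0 are hypothetical, and real products use far richer models than this.

```python
from statistics import mean, pstdev

def severity(history, current):
    """Distance of `current` from the learned baseline, in standard deviations."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is maximally unusual.
        return 0.0 if current == mu else float("inf")
    return abs(current - mu) / sigma

def rank_anomalies(baselines, latest, threshold=3.0):
    """Drop alerts that look like background noise; rank the rest by severity."""
    scored = []
    for source, current in latest.items():
        score = severity(baselines[source], current)
        if score >= threshold:
            scored.append((source, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical hourly alert counts observed during a learning period.
baselines = {
    "disk_warning": [980, 1010, 995, 1005, 1000, 1010],  # noisy but steady
    "login_failure": [4, 6, 5, 5, 4, 6],
    "db_timeout": [0, 1, 0, 0, 1, 0],
}
latest = {"disk_warning": 1003, "login_failure": 40, "db_timeout": 9}

ranked = rank_anomalies(baselines, latest)
for source, score in ranked:
    print(f"{source}: {score:.1f} standard deviations from normal")
```

Note that the loudest source, `disk_warning`, is filtered out entirely because roughly a thousand alerts an hour is its normal state, while the two quiet sources that suddenly changed rise to the top.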

Despite producing huge volumes of alerts, rule-and-threshold implementations often miss problems or report them long after the customer has experienced the impact. The fear of generating even more alerts forces monitoring teams to select fewer KPIs, thus decreasing the likelihood of detection. Problems that slowly approach thresholds go unnoticed until user experience is already impacted. Adopting this advanced analytics approach empowers enterprises not only to identify problems that rules and thresholds miss or catch too late, but also to provide their troubleshooting teams with pre-correlated causal data.
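The slow-drift problem can be illustrated with a small sketch. It assumes a response-time metric in milliseconds, a static limit of 500 ms and a learned upper bound of the baseline mean plus three standard deviations; all of the numbers and function names are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev

def first_threshold_breach(series, limit):
    """Index of the first sample above a fixed limit, or None."""
    for index, value in enumerate(series):
        if value > limit:
            return index
    return None

def first_baseline_breach(series, baseline, k=3.0):
    """Index of the first sample above mean + k * stddev of the baseline, or None."""
    upper = mean(baseline) + k * pstdev(baseline)
    for index, value in enumerate(series):
        if value > upper:
            return index
    return None

# Hypothetical response times (ms): a normal period, then a slow degradation.
baseline = [200, 205, 195, 210, 190, 200]
drifting = [205, 230, 260, 300, 350, 410, 480, 560]

static_hit = first_threshold_breach(drifting, limit=500)
learned_hit = first_baseline_breach(drifting, baseline)
print(f"static threshold fires at sample {static_hit}")
print(f"learned baseline fires at sample {learned_hit}")
```

In this made-up series the fixed limit fires only on the last sample, after response time has nearly tripled, while the learned bound flags the second sample.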

By replacing legacy rules and thresholds with machine learning anomaly detection, IT teams can monitor larger sets of performance data in real time. Monitoring more KPIs enables a higher percentage of issues to be detected before users report them. Through real-time cross-correlation, related anomalies are detected together and alerts become more actionable. Early adopters report that they are able to reduce troubleshooting time by 75 percent, with commensurate reductions in the number of people involved by as much as 85 percent.
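As a rough illustration of how related anomalies might be presented together, the sketch below groups anomalies that occur close to one another in time. Time proximity is only a simple stand-in for the cross-correlation described above, whose actual method is not specified here; the metric names, timestamps and 60-second window are hypothetical.

```python
def group_related(anomalies, window=60):
    """Group (timestamp, metric) pairs whose gap to the previous anomaly is <= window seconds."""
    groups = []
    for timestamp, metric in sorted(anomalies):
        if groups and timestamp - groups[-1][-1][0] <= window:
            groups[-1].append((timestamp, metric))  # extend the current incident
        else:
            groups.append([(timestamp, metric)])    # start a new incident
    return groups

# Hypothetical anomalies: three within seconds of each other, one much later.
anomalies = [
    (1020, "api_errors"),
    (1000, "db_latency"),
    (5000, "disk_io"),
    (1045, "queue_depth"),
]

incidents = group_related(anomalies)
for number, incident in enumerate(incidents, start=1):
    metrics = ", ".join(metric for _, metric in incident)
    print(f"incident {number}: {metrics}")
```

Four separate anomalies become two notifications, and the first one already shows the troubleshooting team which metrics moved together and in what order.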

Advanced machine learning systems will fundamentally change the way data is converted into information over the next few years. If your business is leveraging information to provide competitive services, you can’t afford to be the laggard.

Kevin Conklin is VP of Product Marketing at Ipswitch