Using Analytics to Detect Application Performance Anomalies
March 04, 2014
Charley Rich

IT organizations are under more pressure than ever to deliver exceptional business performance. Further complicating the challenge is the evolving nature of Information Technology (IT). The rise of Big Data, mobile, cloud, and BYOD has added complexity, making it ever more challenging for IT to acquire the visibility it needs to detect anomalies.

Today, an organization’s application infrastructure typically includes Web components, messaging middleware and mainframes. Application performance is impacted by many factors coming from multiple sources—application servers, messaging protocols, virtualized systems, capacity issues and many more. Inevitably, failures in one or more of these systems occur — and IT is left to deal with the result.

Such situations are why Application Performance Management (APM) solutions exist. To be effective, APM must deliver three major benefits:

- Enough visibility to see an entire system

- Tracking of activities through the infrastructure chain as they occur

- Correlation of events, many of which might seem unrelated, in order to spot developing trends before users are impacted

Surprisingly, a number of APM platforms fall short on one or more of these key functions.

Monitoring is Not Enough

To be sure, most APM solutions do a good job of monitoring individual applications. But monitoring is not enough. When problems arise, especially in today's complex topologies, the failure of a single application is rarely the culprit. Performance threats are usually the result of multiple issues, and catching many of these early with real-time analytics could prevent much larger failures from occurring. Avoiding cascading failures is essential. Ideally, IT specialists should not be in the position of putting out fires; they should be able to make sure the fire never starts. But without the necessary visibility, this is no simple task.

To properly manage today's application environment, organizations must be able to analyze the entire application chain from end to end, understanding the dependencies between the links in the chain. They must also be able to focus on early detection of abnormalities, differentiating symptom from cause rather than simply reacting to an outage. The combination of these two capabilities provides the level of assurance IT needs in its key mission: to reduce the frequency and duration of outages.
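One simple way to picture how dependency knowledge separates symptom from cause is sketched below. It is only an illustration under stated assumptions: the component names, the `depends_on` map and the `probable_causes` function are hypothetical and do not describe any particular APM product. The idea is that an unhealthy component whose direct dependencies are all healthy is a likely cause, while an unhealthy component sitting downstream of another unhealthy one is more likely a symptom.

```python
def probable_causes(depends_on, unhealthy):
    """Return unhealthy components that have no unhealthy direct dependency.

    depends_on: hypothetical map of component -> components it relies on.
    unhealthy:  set of components currently showing abnormal behavior.
    """
    causes = set()
    for component in unhealthy:
        upstream = depends_on.get(component, ())
        # If nothing this component relies on is unhealthy, the problem
        # most likely originates here rather than being inherited.
        if not any(dep in unhealthy for dep in upstream):
            causes.add(component)
    return causes


# Hypothetical application chain: web -> app server -> queue and database.
depends_on = {
    "web": ["app_server"],
    "app_server": ["message_queue", "database"],
    "message_queue": [],
    "database": [],
}
unhealthy = {"web", "app_server", "message_queue"}

print(probable_causes(depends_on, unhealthy))  # {'message_queue'}
```

The sketch checks direct dependencies only; a real topology would also need to handle transitive and circular dependencies.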

End-to-end performance monitoring and analysis must embrace the entire IT environment, from .NET to mainframes. It must cover a wide range of components, from J2EE application servers and Web services to messaging middleware, brokers and even legacy applications. It must also be elastic, able to scale transparently to meet unexpected surges in demand.

Analyzing Situations with Complex Event Processing

Meeting the second requirement, proactive analytics rather than reactive response, calls for sophisticated technology, one example being Complex Event Processing (CEP). CEP engines, along with business policies, analyze situations or "business views" composed of multiple events and key performance indicators.
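To make the idea of a "situation" more concrete, here is a minimal sketch of a rule that fires only when several different event types occur close together in time. It is not a real CEP engine: the `SituationDetector` class, the event names and the 60-second window are all assumptions chosen for illustration, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class SituationDetector:
    """Fires when every required event type was seen within window_s seconds."""

    def __init__(self, required, window_s=60):
        self.required = set(required)
        self.window_s = window_s
        self.recent = deque()  # (timestamp, event_type), oldest first

    def observe(self, timestamp, event_type):
        """Record one event and report whether the situation now holds."""
        self.recent.append((timestamp, event_type))
        # Drop events that have fallen out of the time window.
        while self.recent and timestamp - self.recent[0][0] > self.window_s:
            self.recent.popleft()
        seen = {etype for _, etype in self.recent}
        return self.required <= seen


# Hypothetical policy: a deep queue AND rising response time within 60 seconds.
detector = SituationDetector({"queue_depth_high", "response_time_rising"})
events = [
    (0, "queue_depth_high"),
    (100, "response_time_rising"),  # too late to pair with the event at t=0
    (130, "queue_depth_high"),      # within 60 s of the event at t=100
]
results = [detector.observe(t, e) for t, e in events]
print(results)  # [False, False, True]
```

Neither event alone raises an alert here; only the combination inside the window does, which is the distinction between a situation and a single threshold crossing.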

Instead of raising alerts when an individual event passes a threshold, the analytical approach analyzes situations. It compares application behavior against established norms, looking for anomalies that indicate potential problems. Norms are established dynamically using statistical functions such as Bollinger bands, momentum oscillators, standard deviation, velocity, fluctuation and rates of change.
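As a rough illustration of a dynamically established norm, the sketch below applies a Bollinger-band style check: each sample is compared against the mean of the preceding window, plus or minus a multiple of that window's standard deviation. The function name, the window size of 5, the multiplier of 2 and the sample response times are assumptions made for the example, not values taken from any product.

```python
from statistics import mean, pstdev


def bollinger_anomalies(samples, window=5, k=2.0):
    """Return indices of samples outside mean +/- k * stddev of the prior window."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]   # the norm is recomputed for each sample
        middle = mean(history)
        band = k * pstdev(history)
        if abs(samples[i] - middle) > band:
            anomalies.append(i)
    return anomalies


# Hypothetical response times in milliseconds, with one spike at index 6.
response_ms = [100, 102, 98, 101, 99, 100, 180, 101, 99, 100]
print(bollinger_anomalies(response_ms))  # [6]
```

Because the band is derived from recent behavior rather than a fixed threshold, the same code adapts to a service that normally answers in 100 ms and one that normally answers in 2 seconds.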

This approach ensures that real problems, not just transient variations (a.k.a. "false alarms"), are identified, and that readings of real-time performance are accurate.
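One common way to keep transient variations from becoming false alarms is to require that an abnormal reading persist before alerting. The article does not say which technique a given product uses, so the following is only a sketch: the `sustained_breaches` function and the three-sample minimum are assumptions.

```python
def sustained_breaches(flags, min_run=3):
    """Return start indices of runs of at least min_run consecutive flagged samples."""
    starts = []
    run = 0
    for i, flagged in enumerate(flags):
        run = run + 1 if flagged else 0
        # Report each run once, at the moment it reaches the minimum length.
        if run == min_run:
            starts.append(i - min_run + 1)
    return starts


# Hypothetical per-sample results: True means "outside the norm".
flags = [False, True, False, False, True, True, True, True, False]
print(sustained_breaches(flags))  # [4]
```

The isolated blip at index 1 is ignored, while the sustained deviation starting at index 4 is reported once.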

With CEP-based analytics, IT Specialists are assisted in quickly identifying root causes, instead of merely chasing symptoms. By dynamically analyzing event streams, the CEP approach can differentiate symptoms from cause — even inferring an explanation where there is signal loss.

APM solutions using real-time anomaly detection can maintain SLAs in the most demanding deployments, including payments, EFT, trading, settlement, compliance, patient data, claims processing and retail order management. They not only bring developing situations to the attention of IT staff before users are aware, but also assist in diagnosing and correcting the underlying causes quickly and efficiently.

In an era when business functions are more sophisticated, diverse, integrated and immediate than ever, analytical Application Performance Management plays an essential role for IT professionals and their customers.

Charley Rich is VP Product Management and Marketing at Nastel Technologies.
