Event Management: Reactive, Proactive or Predictive?
August 01, 2012
Larry Dragich

Can event management help foster a curiosity for innovative possibilities to make application performance better? Blue-sky thinkers may not want to deal with the myriad of details on how to manage the events being generated operationally, but could learn something from this exercise.

Consider the major system failures in your organization over the last 12 to 18 months. What if you had a system or process in place to capture those failures and proactively prevent them from recurring? How much better off would you be if you could avoid the proverbial “Groundhog Day” with system outages? The argument that system monitoring is just a nice-to-have, and not really a core requirement for operational readiness, dissipates quickly when a critical application goes down with no warning.

Starting with the Event Management and Incident Management processes may seem like a reactive approach when implementing an Application Performance Management (APM) solution, but is it really? If “Rome is burning”, wouldn’t the most prudent action be to extinguish the fire first, and then come up with a proactive approach for prevention? Managing the operational noise can calm the environment, allowing you to focus on APM strategy more effectively.

Asking the right questions during a post-mortem review will help generate dialog, outlining options for alerting and prevention. This will direct your thinking towards a new horizon of continual improvement that will help galvanize proactive monitoring as an operational requirement.

Here are three questions that build on each other as you work to mature your solution:

1. Did we alert on it when it went down, or did the user community call us?

2. Can we get a proactive alert on it before it goes down (e.g. dual power supply failure in a server)?

3. Can we trend on the event to create a predictive alert before it escalates (e.g. disk space utilization triggering a minor alert at 90%, a major alert at 95%, and a critical alert at 98%)?

The preceding questions are directly related to the following categories respectively: Reactive, Proactive, and Predictive.

Reactive – Alerts that Occur at Failure

Multiple events can occur before a system failure; eventually an alert will come in notifying you that an application is down. It will come either from users calling the Service Desk to report an issue or from a system-generated event that corresponds with the application failure.

Proactive – Alerts that Occur Before Failure

These alerts will most likely come from proactive monitoring, telling you that there are component failures that need attention but have not yet affected overall application availability (e.g. dual power supply failure in a server).

Predictive – Alerts that Trend on a Possible Failure

These alerts are usually set up in parallel with trending reports that help predict subtle changes in the environment (e.g. trending on memory usage or disk utilization before running out of resources).
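
As a rough illustration of the predictive category, the sketch below maps a disk utilization reading to the minor/major/critical levels from the example above and estimates how long until the next level is reached. It is only a sketch under stated assumptions: the function names, the evenly spaced sampling interval, and the simple least-squares straight-line trend are all hypothetical choices made for illustration, not part of any particular APM product.

```python
from typing import Optional, Sequence

# Severity levels from the disk space example (percent used), highest first.
SEVERITY_THRESHOLDS = [(98.0, "critical"), (95.0, "major"), (90.0, "minor")]


def classify_utilization(percent_used: float) -> Optional[str]:
    """Return the severity for a reading, or None if it is below 90%."""
    for threshold, severity in SEVERITY_THRESHOLDS:
        if percent_used >= threshold:
            return severity
    return None


def hours_until(samples: Sequence[float], target: float,
                interval_hours: float = 1.0) -> Optional[float]:
    """Estimate hours until utilization reaches `target`.

    Assumes `samples` are evenly spaced readings, oldest first, and fits a
    straight line through them (ordinary least squares). Returns None when
    there are fewer than two samples or the trend is flat or falling, and
    0.0 when the latest reading is already at or above the target.
    """
    n = len(samples)
    if n < 2:
        return None
    if samples[-1] >= target:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Sample index at which the fitted line crosses the target,
    # measured from the most recent sample.
    steps_ahead = (target - intercept) / slope - (n - 1)
    return max(0.0, steps_ahead * interval_hours)


if __name__ == "__main__":
    readings = [80.0, 82.0, 84.0, 86.0]  # hypothetical hourly readings
    print(classify_utilization(readings[-1]))
    print(hours_until(readings, target=90.0))
```

With readings like these, no threshold has been crossed yet, so a purely reactive or threshold-based check stays silent, while the trend estimate gives an early warning that the minor level is approaching. In practice the fit window and the lead time that justifies an alert would need tuning for each environment.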


Conclusion

Once you build awareness in the organization that you have a bird’s-eye view of the technical landscape and the ability to monitor the ecosystem of each application (as an ecologist would), people become more meticulous when introducing new elements into the environment. They know that you are watching, taking samples, and trending on overall health and stability, which leaves you free to focus on the strategic side of APM without distraction.

ABOUT Larry Dragich

Larry Dragich, a regular blogger and contributor on APMdigest, has 23 years of IT experience, and has been in an IT leadership role at the Auto Club Group (ACG) for the past ten years. He serves as Director of Enterprise Application Services (EAS) at the Auto Club Group, with overall accountability for optimizing the capability of the IT infrastructure to deliver high availability and optimal performance. Dragich is actively involved with industry leaders, sharing knowledge of APM technologies ranging from best practices and technical workflows to resource allocation and approaches for implementing APM strategies.

You can contact Larry on LinkedIn

Related Links:

For a high-level view of a much broader technology space, refer to the slide show on BrightTALK.com, which presents “The Anatomy of APM - webcast” in more context.

For more information on the critical success factors in APM adoption and how this centers around the End-User-Experience (EUE), read The Anatomy of APM and the corresponding blog APM’s DNA – Event to Incident Flow.

Prioritizing Gartner's APM Model

APM and MoM – Symbiotic Solution Sets
