Event Management: Reactive, Proactive or Predictive?
August 01, 2012
Larry Dragich

Can event management help foster curiosity about innovative ways to improve application performance? Blue-sky thinkers may not want to deal with the myriad details of managing the events generated in day-to-day operations, but they could learn something from the exercise.

Consider the major system failures in your organization over the last 12 to 18 months. What if you had a system or process in place to capture those failures and mitigate them proactively, preventing them from recurring? How much better off would you be if you could avoid the proverbial “Groundhog Day” with system outages? The argument that system monitoring is just a nice-to-have, and not really a core requirement for operational readiness, dissipates quickly when a critical application goes down with no warning.

Starting with the Event Management and Incident Management processes may seem like a reactive approach when implementing an Application Performance Management (APM) solution, but is it really? If “Rome is burning”, wouldn’t the most prudent action be to extinguish the fire first, and then come up with a proactive approach for prevention? Managing the operational noise can calm the environment, allowing you to focus on APM strategy more effectively.

Asking the right questions during a post-mortem review will help generate dialog, outlining options for alerting and prevention. This will direct your thinking towards a new horizon of continual improvement that will help galvanize proactive monitoring as an operational requirement.

Here are three questions that build on each other as you work to mature your solution:

1. Did we alert on it when it went down, or did the user community call us?

2. Can we get a proactive alert on it before it goes down (e.g., a dual power supply failure in a server)?

3. Can we trend on the event to create a predictive alert before it escalates (e.g., disk space utilization triggering a minor alert at 90%, a major alert at 95%, and a critical alert at 98%)?

These three questions map, respectively, to the following categories: Reactive, Proactive, and Predictive.
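The tiered disk-space thresholds mentioned in question 3 can be sketched in a few lines of Python. This is only an illustration, not the configuration of any particular monitoring product: the 90/95/98 percent limits come from the example above, while the function name `classify_disk_usage` and the `THRESHOLDS` table are assumptions made for this sketch.

```python
# Severity limits taken from the example in question 3 (minor@90%, major@95%,
# critical@98%). The names below are hypothetical, chosen for illustration.
THRESHOLDS = [
    (98.0, "critical"),
    (95.0, "major"),
    (90.0, "minor"),
]


def classify_disk_usage(percent_used):
    """Return the alert severity for a disk utilization percentage.

    Returns None when utilization is below the lowest threshold,
    meaning no alert should be raised.
    """
    # Thresholds are ordered from highest to lowest, so the first
    # match is always the most severe level that applies.
    for limit, severity in THRESHOLDS:
        if percent_used >= limit:
            return severity
    return None


if __name__ == "__main__":
    for reading in (72.5, 91.0, 96.3, 99.1):
        print(reading, classify_disk_usage(reading))
```

Keeping the limits in a single table makes it easy to review and adjust the levels after each post-mortem, without touching the alerting logic itself.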

Reactive – Alerts that Occur at Failure

Multiple events can occur before a system failure; eventually an alert will come in notifying you that an application is down. The alert will either come from users calling the Service Desk to report an issue, or it will be system-generated in response to the application failure.

Proactive – Alerts that Occur Before Failure

These alerts will most likely come from proactive monitoring, telling you that there are component failures that need attention but have not yet affected overall application availability (e.g., a dual power supply failure in a server).

Predictive – Alerts that Trend on a Possible Failure

These alerts are usually set up in parallel with trending reports that help reveal subtle changes in the environment (e.g., trending on memory usage or disk utilization before running out of resources).
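As a rough illustration of trending on disk utilization, the sketch below fits a straight line to recent samples and estimates how long until the disk fills. It assumes utilization grows roughly linearly and that samples arrive as (hour, percent used) pairs; the function name `hours_until_full` and the sample data are hypothetical, and a real trending report would likely use a more robust model.

```python
def hours_until_full(samples, capacity_pct=100.0):
    """Estimate hours from the last sample until utilization reaches capacity.

    samples: list of (hour, percent_used) tuples, oldest first.
    Returns None when there is too little data or usage is not growing.
    Assumption: growth is approximately linear over the sampled window.
    """
    n = len(samples)
    if n < 2:
        return None

    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n

    # Ordinary least-squares slope: covariance(t, u) / variance(t).
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None  # flat or shrinking usage: nothing to predict

    intercept = mean_u - slope * mean_t
    t_full = (capacity_pct - intercept) / slope
    last_t = samples[-1][0]
    return max(0.0, t_full - last_t)


if __name__ == "__main__":
    # Hypothetical hourly readings growing by 2 percentage points per hour.
    readings = [(0, 80.0), (1, 82.0), (2, 84.0), (3, 86.0)]
    print(hours_until_full(readings))
```

A predictive alert could then fire when the estimate drops below the time the team needs to respond, rather than waiting for a fixed utilization threshold to be crossed.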


Conclusion

Once you build awareness in the organization that you have a bird’s-eye view of the technical landscape and the ability to monitor the ecosystem of each application (as an ecologist would), people become more meticulous when introducing new elements into the environment. They know that you are watching, taking samples, and trending on overall health and stability, which leaves you free to focus on the strategic side of APM without distraction.

ABOUT Larry Dragich

Larry Dragich, a regular blogger and contributor on APMdigest, has 23 years of IT experience and has been in an IT leadership role at the Auto Club Group (ACG) for the past ten years. He serves as Director of Enterprise Application Services (EAS) at the Auto Club Group, with overall accountability for optimizing the capability of the IT infrastructure to deliver high availability and optimal performance. Dragich is actively involved with industry leaders, sharing knowledge of APM technologies that ranges from best practices and technical workflows to resource allocation and approaches for implementing APM strategies.

You can contact Larry on LinkedIn

Related Links:

For a high-level view of a much broader technology space, refer to the slide show on BrightTALK.com, which describes “The Anatomy of APM - webcast” in more context.

For more information on the critical success factors in APM adoption and how this centers around the End-User-Experience (EUE), read The Anatomy of APM and the corresponding blog APM’s DNA – Event to Incident Flow.

Prioritizing Gartner's APM Model

APM and MoM – Symbiotic Solution Sets
