Event Management: Reactive, Proactive or Predictive?
August 01, 2012
Larry Dragich

Can event management help foster a curiosity for innovative possibilities to make application performance better? Blue-sky thinkers may not want to deal with the myriad of details on how to manage the events being generated operationally, but could learn something from this exercise.

Consider the major system failures in your organization over the last 12 to 18 months. What if you had a system or process in place to capture those failures and mitigate them proactively, preventing them from recurring? How much better off would you be if you could avoid the proverbial “Groundhog Day” of repeated system outages? The argument that system monitoring is just a nice-to-have, and not really a core requirement for operational readiness, dissipates quickly when a critical application goes down with no warning.

Starting with the Event Management and Incident Management processes may seem like a reactive approach when implementing an Application Performance Management (APM) solution, but is it really? If “Rome is burning”, wouldn’t the most prudent action be to extinguish the fire first, and then come up with a proactive approach for prevention? Managing the operational noise can calm the environment, allowing you to focus on APM strategy more effectively.

Asking the right questions during a post-mortem review will help generate dialog, outlining options for alerting and prevention. This will direct your thinking towards a new horizon of continual improvement that will help galvanize proactive monitoring as an operational requirement.

Here are three questions that build on each other as you work to mature your solution:

1. Did we alert on it when it went down, or did the user community call us?

2. Can we get a proactive alert on it before it goes down (e.g., a power supply failure in a server with dual power supplies)?

3. Can we trend on the event, creating a predictive alert before it is escalated (e.g., disk space utilization triggering a minor alert at 90%, a major alert at 95%, and a critical alert at 98%)?

The preceding questions are directly related to the following categories respectively: Reactive, Proactive, and Predictive.
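
The tiered disk space example in question 3 can be sketched in a few lines of Python. This is only an illustration: the 90/95/98 percent thresholds come from the question itself, while the function name `disk_severity` and the table `DISK_THRESHOLDS` are hypothetical and not tied to any particular monitoring product.

```python
from typing import Optional

# Thresholds from question 3 (minor@90%, major@95%, critical@98%).
# The names below are hypothetical, chosen only for this sketch.
# Ordered from most to least severe so the first match wins.
DISK_THRESHOLDS = [
    (98.0, "critical"),
    (95.0, "major"),
    (90.0, "minor"),
]


def disk_severity(used_percent: float) -> Optional[str]:
    """Return the alert severity for a disk utilization reading.

    Returns None when utilization is below every threshold,
    meaning no alert should be raised.
    """
    for limit, severity in DISK_THRESHOLDS:
        if used_percent >= limit:
            return severity
    return None


if __name__ == "__main__":
    for reading in (72.0, 91.5, 96.0, 99.2):
        print(reading, disk_severity(reading))
```

Keeping the thresholds in one ordered table means a post-mortem review can adjust the levels without touching the alerting logic.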

Reactive – Alerts that Occur at Failure

Multiple events can occur before a system failure; eventually an alert will come in notifying you that an application is down. This will come from either the users calling the Service Desk to report an issue or it will be system generated corresponding with an application failure.

Proactive – Alerts that Occur Before Failure

These alerts will most likely come from proactive monitoring, telling you that there are component failures that need attention but have not yet affected overall application availability (e.g., a power supply failure in a server with dual power supplies).
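
As a rough sketch of the power supply example, the check below raises a proactive alert when a redundant component has failed but the server is still running on the remaining one. It assumes the example refers to losing one of two redundant supplies; the `PowerSupply` class and `redundancy_alert` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PowerSupply:
    """Hypothetical health record for one power supply in a server."""
    name: str
    healthy: bool


def redundancy_alert(supplies: List[PowerSupply]) -> Optional[str]:
    """Classify the state of a server's redundant power supplies.

    Returns None when every supply is healthy, a proactive message
    when some redundancy is lost but the server is still powered,
    and a reactive message when no healthy supply remains.
    """
    healthy = [s for s in supplies if s.healthy]
    failed = [s.name for s in supplies if not s.healthy]
    if not failed:
        return None
    if not healthy:
        return "reactive: server has lost all power supplies"
    return (
        f"proactive: {', '.join(failed)} failed, server running on "
        f"{len(healthy)} of {len(supplies)} supplies"
    )


if __name__ == "__main__":
    server = [PowerSupply("psu1", True), PowerSupply("psu2", False)]
    print(redundancy_alert(server))
```

The same pattern applies to any redundant component: the alert fires on the loss of redundancy, not on the loss of service.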

Predictive – Alerts that Trend on a Possible Failure

These alerts are usually set up in parallel with trending reports that help predict subtle changes in the environment (e.g., trending on memory usage or disk utilization before running out of resources).
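
One simple way to trend on disk utilization is a straight-line fit over recent samples, projected forward to an alert threshold. The sketch below is a minimal example under that assumption; real usage patterns are often not linear, and the function name `hours_until_threshold`, the sample format, and the 98 percent default are illustrative choices rather than part of any specific tool.

```python
from typing import List, Optional, Tuple


def hours_until_threshold(
    samples: List[Tuple[float, float]], threshold: float = 98.0
) -> Optional[float]:
    """Estimate hours from the last sample until utilization hits threshold.

    samples is a list of (hour, used_percent) pairs in time order.
    Uses an ordinary least-squares line through the samples.
    Returns None if there are too few samples or usage is not growing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    denom = sum((t - mean_t) ** 2 for t, _ in samples)
    if denom == 0:
        return None  # all samples share one timestamp
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / denom
    if slope <= 0:
        return None  # flat or shrinking usage, nothing to predict
    intercept = mean_u - slope * mean_t
    crossing_hour = (threshold - intercept) / slope
    last_hour = samples[-1][0]
    # Clamp at zero when the threshold has already been reached.
    return max(0.0, crossing_hour - last_hour)


if __name__ == "__main__":
    # Hypothetical daily readings: usage grows two points per day.
    readings = [(0.0, 80.0), (24.0, 82.0), (48.0, 84.0)]
    print(hours_until_threshold(readings))
```

With the hypothetical readings above, the fitted line reaches 98 percent roughly 168 hours (seven days) after the last sample, which gives time to add capacity before any alert escalates.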


Conclusion

Once you build awareness in the organization that you have a bird’s-eye view of the technical landscape and the ability to monitor the ecosystem of each application (as an ecologist would), people become more meticulous when introducing new elements into the environment. They know that you are watching, taking samples, and trending on overall health and stability, which leaves you free to focus on the strategic side of APM without distraction.

ABOUT Larry Dragich

Larry Dragich, a regular blogger and contributor on APMdigest, has 23 years of IT experience and has been in an IT leadership role at the Auto Club Group (ACG) for the past ten years. He serves as Director of Enterprise Application Services (EAS) at the Auto Club Group, with overall accountability for optimizing the capability of the IT infrastructure to deliver high availability and optimal performance. Dragich is actively involved with industry leaders, sharing knowledge of APM technologies ranging from best practices and technical workflows to resource allocation and approaches for implementing APM strategies.

You can contact Larry on LinkedIn

Related Links:

For a high-level view of a much broader technology space, refer to the slide show on BrightTALK.com, which describes “The Anatomy of APM - webcast” in more context.

For more information on the critical success factors in APM adoption and how this centers around the End-User-Experience (EUE), read The Anatomy of APM and the corresponding blog APM’s DNA – Event to Incident Flow.

Prioritizing Gartner's APM Model

APM and MoM – Symbiotic Solution Sets
