Event Management: Reactive, Proactive or Predictive?
August 01, 2012

Larry Dragich
Auto Club Group

Can event management help foster a curiosity for innovative possibilities to make application performance better? Blue-sky thinkers may not want to deal with the myriad details of managing the events generated in day-to-day operations, but they could learn something from the exercise.

Consider the major system failures in your organization over the last 12 to 18 months. What if you had a system or process in place to capture those failures and proactively prevent them from recurring? How much better off would you be if you could avoid the proverbial “Groundhog Day” of system outages? The argument that system monitoring is just a nice-to-have, and not really a core requirement for operational readiness, dissipates quickly when a critical application goes down with no warning.

Starting with the Event Management and Incident Management processes may seem like a reactive approach when implementing an Application Performance Management (APM) solution, but is it really? If “Rome is burning,” wouldn’t the most prudent action be to extinguish the fire first and then come up with a proactive approach to prevention? Managing the operational noise can calm the environment, allowing you to focus on APM strategy more effectively.

Asking the right questions during a post-mortem review will help generate dialog and outline options for alerting and prevention. It will direct your thinking toward continual improvement and help galvanize proactive monitoring as an operational requirement.

Here are three questions that build on each other as you work to mature your solution:

1. Did we alert on it when it went down, or did the user community call us?

2. Can we get a proactive alert on it before it goes down (e.g. a failed power supply in a dual-supply server)?

3. Can we trend on the event to create a predictive alert before it is escalated (e.g. disk space utilization triggering a minor alert at 90%, major at 95%, and critical at 98%)?

The preceding questions correspond, respectively, to the following categories: Reactive, Proactive, and Predictive.

Reactive – Alerts that Occur at Failure

Multiple events can occur before a system failure; eventually an alert will come in notifying you that an application is down. The alert will come either from users calling the Service Desk to report an issue or from the system itself, generated when the application fails.

Proactive – Alerts that Occur Before Failure

These alerts will most likely come from proactive monitoring and tell you that there are component failures that need attention but have not yet affected overall application availability (e.g. a failed power supply in a dual-supply server).
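As a rough illustration of the proactive category, the sketch below classifies the state of a server's redundant power supplies. It assumes the power-supply example means one of two redundant supplies has failed while the server keeps running. The `Alert` class, the `check_power_supplies` function, and the host and supply names are hypothetical, chosen only for this example, and do not come from any particular monitoring product.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Alert:
    category: str  # "proactive" or "reactive"
    message: str


def check_power_supplies(host: str, supplies: Dict[str, bool]) -> List[Alert]:
    """Classify power-supply health for one host.

    `supplies` maps a supply name to True (healthy) or False (failed).
    This is a hypothetical data shape, not a real monitoring API.
    """
    failed = sorted(name for name, healthy in supplies.items() if not healthy)
    if not failed:
        return []  # nothing to report
    if len(failed) == len(supplies):
        # No redundancy left at all: the host is down, so this is reactive.
        return [Alert("reactive", f"{host}: all power supplies failed")]
    # Some supplies still work: the service is up, but redundancy is lost.
    return [Alert("proactive",
                  f"{host}: power supply {', '.join(failed)} failed, redundancy lost")]
```

Calling `check_power_supplies("app01", {"PSU1": True, "PSU2": False})` yields a single proactive alert, because the application is still available and there is time to replace the failed component before an outage.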

Predictive – Alerts that Trend on a Possible Failure

These alerts are usually set up in parallel with trending reports that help predict subtle changes in the environment (e.g. trending on memory usage or disk utilization before running out of resources).
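The disk-utilization example from question 3 can be sketched in a few lines. The severity levels (minor at 90%, major at 95%, critical at 98%) come from that example; everything else is an assumption made for illustration: the function names, the use of one sample per day, and the choice of a simple least-squares straight line as the trend model. Real utilization rarely grows in a perfectly linear way, so treat the projection as a rough estimate.

```python
from typing import Optional, Sequence

# Severity thresholds from the disk-utilization example, highest first.
THRESHOLDS = [(98.0, "critical"), (95.0, "major"), (90.0, "minor")]


def severity(utilization_pct: float) -> Optional[str]:
    """Return the severity for a utilization reading, or None if below 90%."""
    for limit, name in THRESHOLDS:
        if utilization_pct >= limit:
            return name
    return None


def days_until(samples: Sequence[float], limit_pct: float) -> Optional[float]:
    """Estimate days from the latest sample until `limit_pct` is reached.

    Assumes `samples` holds one utilization reading per day, oldest first,
    and fits a least-squares line through them. Returns None when there are
    too few samples or utilization is flat or falling.
    """
    n = len(samples)
    if n < 2:
        return None
    if samples[-1] >= limit_pct:
        return 0.0  # already at or past the limit
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None  # no upward trend to project
    intercept = mean_y - slope * mean_x
    crossing_day = (limit_pct - intercept) / slope
    return max(0.0, crossing_day - (n - 1))
```

With daily readings of 80, 82, 84, 86 and 88 percent, `severity(88)` returns `None` because no threshold has been crossed yet, while `days_until(samples, 90.0)` projects that the minor threshold is about a day away. That lead time is what separates a predictive alert from a reactive one.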


Conclusion

Once you build awareness in the organization that you have a bird’s-eye view of the technical landscape and the ability to monitor the ecosystem of each application (as an ecologist would), people become more meticulous when introducing new elements into the environment. They know that you are watching, taking samples, and trending on overall health and stability, which leaves you free to focus on the strategic side of APM without distraction.

ABOUT Larry Dragich

Larry Dragich, a regular blogger and contributor on APMdigest, has 23 years of IT experience and has been in an IT leadership role at the Auto Club Group (ACG) for the past ten years. He serves as Director of Enterprise Application Services (EAS) at the Auto Club Group, with overall accountability for optimizing the capability of the IT infrastructure to deliver high availability and optimal performance. Dragich is actively involved with industry leaders, sharing knowledge of APM technologies ranging from best practices and technical workflows to resource allocation and approaches for implementing APM strategies.

You can contact Larry on LinkedIn

Related Links:

For a high-level view of a much broader technology space, refer to the slide show on BrightTALK.com, “The Anatomy of APM - webcast”, which describes the topic in more context.

For more information on the critical success factors in APM adoption and how they center on the End-User-Experience (EUE), read The Anatomy of APM and the corresponding blog APM’s DNA – Event to Incident Flow.

Prioritizing Gartner's APM Model

APM and MoM – Symbiotic Solution Sets
