Event Management: Reactive, Proactive or Predictive?
August 01, 2012

Larry Dragich
Technology Executive

Share this

Can event management help foster a curiosity for innovative possibilities to make application performance better? Blue-sky thinkers may not want to deal with the myriad of details on how to manage the events being generated operationally, but could learn something from this exercise.

Consider the major system failures in your organization over the last 12 to 18 months. What if you had a system or process in place to capture those failures and mitigate them from a proactive standpoint preventing them from reoccurring? How much better off would you be if you could avoid the proverbial “Groundhog Day” with system outages? The argument that system monitoring is just a nice to have, and not really a core requirement for operational readiness, dissipates quickly when a critical application goes down with no warning.

Starting with the Event management and Incident management processes may seem like a reactive approach when implementing an Application Performance Management (APM) solution, but is it really? If “Rome is burning”, wouldn’t the most prudent action be to extinguish the fire, then come up with a proactive approach for prevention? Managing the operational noise can calm the environment allowing you to focus on APM strategy more effectively.

Asking the right questions during a post-mortem review will help generate dialog, outlining options for alerting and prevention. This will direct your thinking towards a new horizon of continual improvement that will help galvanize proactive monitoring as an operational requirement.

Here are three questions that build on each other as you work to mature your solution:

1. Did we alert on it when it went down, or did the user community call us?

2. Can we get a proactive alert on it before it goes down, (e.g. dual power supply failure in server)?

3. Can we trend on the event creating a predictive alert before it is escalated, (e.g. disk space utilization to trigger a minor@90%, major@95%, critical@98%)?

The preceding questions are directly related to the following categories respectively: Reactive, Proactive, and Predictive.

Reactive – Alerts that Occur at Failure

Multiple events can occur before a system failure; eventually an alert will come in notifying you that an application is down. This will come from either the users calling the Service Desk to report an issue or it will be system generated corresponding with an application failure.

Proactive – Alerts that Occur Before Failure

These alerts will most likely come from proactive monitoring to tell you there are component failures that need attention but have not yet affected overall application availability, (e.g. dual power supply failure in server).

Predictive – Alerts that Trend on a Possible Failure

These alerts are usually set up in parallel with trending reports that will help predict subtle changes in the environment, (e.g. trending on memory usage or disk utilization before running out of resources).


Conclusion

Once you build awareness in the organization that you have a bird’s eye view of the technical landscape and have the ability to monitor the ecosystem of each application (as an ecologist), people become more meticulous when introducing new elements into the environment. They know that you are watching, taking samples, and trending on the overall health and stability leaving you free to focus on the strategic side of APM without distraction.

You can contact Larry on LinkedIn

Related Links:

For a high-level view of a much broader technology space refer to the slide show on BrightTALK.com which describes the “The Anatomy of APM - webcast” in more context.

For more information on the critical success factors in APM adoption and how this centers around the End-User-Experience (EUE), read The Anatomy of APM and the corresponding blog APM’s DNA – Event to Incident Flow.

Prioritizing Gartner's APM Model

APM and MoM – Symbiotic Solution Sets

Larry Dragich is a Technology Executive and Founder of the APM Strategies Group on LinkedIn
Share this

The Latest

September 19, 2019

You must dive into various aspects or themes of services so that you can gauge authentic user experience. There are usually five main themes that the customer thinks of when experiencing a service ...

September 18, 2019

Service desks teams use internally focused performance-based metrics more than many might think. These metrics are essential and remain relevant, but they do not provide any insight into the user experience. To gain actual insight into user satisfaction, you need to change your metrics. The question becomes: How do I efficiently change my metrics? Then, how do you best go about it? ...

September 17, 2019

The skills gap is a very real issue impacting today's IT professionals. In preparation for IT Pro Day 2019, celebrated on September 17, 2019, SolarWinds explored this skills gap by surveying technology professionals around the world to understand their needs and how organizations are addressing these needs ...

September 16, 2019

Top performing organizations (TPOs) in managing IT Operations are experiencing significant operational and business benefits such as 5.9x shorter average Mean Time to Resolution (MTTR) per incident as compared to all other organizations, according to a new market study from Digital Enterprise Journal ...

September 12, 2019

Multichannel marketers report that mobile-friendly websites have emerged as a dominant engagement channel for their brands, according to Gartner. However, Gartner research has found that too many organizations build their mobile websites without accurate knowledge about, or regard for, their customer's mobile preferences ...

September 11, 2019

Do you get excited when you discover a new service from one of the top three public clouds or a new public cloud provider? I do. But every time you feel excited about new cloud offerings, you should also feel a twinge of fear. Because in the tech world, each time we introduce something new we also add a new point of failure for our application and potentially a service we are stuck with. This is why thinking about the long-tail cloud for your organization is important ...

September 10, 2019

A solid start to migration can be approached three ways — all of which are ladder up to adopting a Software Intelligence strategy ...

September 09, 2019

Many aren't doing the due diligence needed to properly assess and facilitate a move of applications to the cloud. This is according to the recent 2019 Cloud Migration Report which revealed half of IT leaders at banks, insurance and telecommunications companies do not conduct adequate risk assessments prior to moving apps over to the cloud. Essentially, they are going in blind and expecting everything to turn out ok. Spoiler alert: It doesn't ...

September 05, 2019

Research conducted by Aite Group uncovered more than 80 global eCommerce sites that were actively being compromised by Magecart groups, according to a new report, In Plain Sight II: On the Trail of Magecart ...

September 04, 2019

In this blog, I'd like to expand beyond the TAP and look at the role Packet Brokers play in an organization's visibility architecture. Here are 5 common mistakes that are made when deploying Packet Brokers, and how to avoid them ...