Event Management: Reactive, Proactive or Predictive?
August 01, 2012

Larry Dragich
Technology Executive

Can event management help foster a curiosity for innovative possibilities to make application performance better? Blue-sky thinkers may not want to deal with the myriad of details on how to manage the events being generated operationally, but could learn something from this exercise.

Consider the major system failures in your organization over the last 12 to 18 months. What if you had a system or process in place to capture those failures and mitigate them proactively, preventing them from recurring? How much better off would you be if you could avoid the proverbial “Groundhog Day” of system outages? The argument that system monitoring is just a nice-to-have, and not really a core requirement for operational readiness, dissipates quickly when a critical application goes down with no warning.

Starting with the Event Management and Incident Management processes may seem like a reactive approach when implementing an Application Performance Management (APM) solution, but is it really? If “Rome is burning”, wouldn’t the most prudent action be to extinguish the fire and then come up with a proactive approach for prevention? Managing the operational noise can calm the environment, allowing you to focus on APM strategy more effectively.

Asking the right questions during a post-mortem review will help generate dialog, outlining options for alerting and prevention. This will direct your thinking towards a new horizon of continual improvement that will help galvanize proactive monitoring as an operational requirement.

Here are three questions that build on each other as you work to mature your solution:

1. Did we alert on it when it went down, or did the user community call us?

2. Can we get a proactive alert on it before it goes down (e.g. a power supply failure in a server with dual power supplies)?

3. Can we trend on the event, creating a predictive alert before it is escalated (e.g. disk space utilization triggering a minor alert at 90%, major at 95%, and critical at 98%)?

The preceding questions are directly related to the following categories respectively: Reactive, Proactive, and Predictive.
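
The disk space thresholds in question 3 can be expressed as a small lookup. The following is a minimal sketch in Python, not taken from any particular monitoring product; the function name `disk_severity` and the `THRESHOLDS` table are hypothetical, and only the 90/95/98 percent levels come from the example above.

```python
from typing import Optional

# Thresholds from the example in question 3 (minor 90%, major 95%, critical 98%).
# Ordered from most to least severe so the first match wins.
THRESHOLDS = [
    (98.0, "critical"),
    (95.0, "major"),
    (90.0, "minor"),
]


def disk_severity(used_percent: float) -> Optional[str]:
    """Return the alert severity for a utilization reading, or None below all thresholds."""
    for limit, severity in THRESHOLDS:
        if used_percent >= limit:
            return severity
    return None
```

Keeping the levels in one table means an escalation policy can be adjusted without touching the logic that evaluates each reading.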

Reactive – Alerts that Occur at Failure

Multiple events can occur before a system failure; eventually an alert will come in notifying you that an application is down. This will come either from users calling the Service Desk to report an issue or from a system-generated alert corresponding to the application failure.

Proactive – Alerts that Occur Before Failure

These alerts will most likely come from proactive monitoring, telling you that there are component failures that need attention but have not yet affected overall application availability (e.g. a power supply failure in a server with dual power supplies).

Predictive – Alerts that Trend on a Possible Failure

These alerts are usually set up in parallel with trending reports that help predict subtle changes in the environment (e.g. trending on memory usage or disk utilization before running out of resources).
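
One simple way to trend on disk utilization is to fit a straight line through recent samples and project when it will cross a threshold. The sketch below assumes a linear growth pattern, which real workloads may not follow; the function `hours_until_limit`, its parameters, and the sample data are all hypothetical, and the default limit of 98% is borrowed from the critical threshold mentioned earlier.

```python
from typing import List, Optional, Tuple


def hours_until_limit(
    samples: List[Tuple[float, float]], limit: float = 98.0
) -> Optional[float]:
    """Estimate hours until utilization reaches `limit` using a least-squares line.

    samples: (hours_since_start, used_percent) pairs from monitoring (hypothetical data).
    Returns None when there is too little data or no upward trend.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp, so no slope can be fitted
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None  # flat or shrinking usage: nothing to predict
    intercept = mean_u - slope * mean_t
    last_t = max(t for t, _ in samples)
    fitted_now = intercept + slope * last_t  # fitted value at the latest sample time
    if fitted_now >= limit:
        return 0.0
    return (limit - fitted_now) / slope


# Usage: three daily readings growing by 2 points per day.
readings = [(0.0, 80.0), (24.0, 82.0), (48.0, 84.0)]
remaining = hours_until_limit(readings)
```

A predictive alert could then fire when the projected time remaining drops below a lead time the operations team considers sufficient to respond.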


Conclusion

Once you build awareness in the organization that you have a bird’s-eye view of the technical landscape and the ability to monitor the ecosystem of each application (as an ecologist would), people become more meticulous when introducing new elements into the environment. They know that you are watching, taking samples, and trending on the overall health and stability, leaving you free to focus on the strategic side of APM without distraction.

You can contact Larry on LinkedIn

Related Links:

For a high-level view of a much broader technology space, refer to the slide show on BrightTALK.com, which describes “The Anatomy of APM - webcast” in more context.

For more information on the critical success factors in APM adoption and how this centers around the End-User-Experience (EUE), read The Anatomy of APM and the corresponding blog APM’s DNA – Event to Incident Flow.

Prioritizing Gartner's APM Model

APM and MoM – Symbiotic Solution Sets

Larry Dragich is a Technology Executive and Founder of the APM Strategies Group on LinkedIn