Event Management: Reactive, Proactive or Predictive?
August 01, 2012
Larry Dragich

Can event management help foster a curiosity for innovative possibilities to make application performance better? Blue-sky thinkers may not want to deal with the myriad details of managing the events generated by operations, but they could learn something from the exercise.

Consider the major system failures in your organization over the last 12 to 18 months. What if you had a system or process in place to capture those failures and mitigate them proactively, preventing them from recurring? How much better off would you be if you could avoid the proverbial “Groundhog Day” of repeated system outages? The argument that system monitoring is just a nice-to-have, and not really a core requirement for operational readiness, dissipates quickly when a critical application goes down with no warning.

Starting with the Event Management and Incident Management processes may seem like a reactive approach when implementing an Application Performance Management (APM) solution, but is it really? If “Rome is burning,” wouldn’t the most prudent action be to extinguish the fire first and then come up with a proactive approach for prevention? Managing the operational noise calms the environment, allowing you to focus on APM strategy more effectively.

Asking the right questions during a post-mortem review will help generate dialog and outline options for alerting and prevention. This directs your thinking toward continual improvement and helps galvanize proactive monitoring as an operational requirement.

Here are three questions that build on each other as you work to mature your solution:

1. Did we alert on it when it went down, or did the user community call us?

2. Can we get a proactive alert on it before it goes down (e.g., failure of one power supply in a dual-supply server)?

3. Can we trend on the event to create a predictive alert before it escalates (e.g., disk space utilization triggering a minor alert at 90%, major at 95%, and critical at 98%)?

These questions map, respectively, to three categories: Reactive, Proactive, and Predictive.
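The disk space example in the third question can be sketched as a simple severity lookup. This is a minimal illustration, not the configuration of any particular monitoring product: the thresholds come from the example above, while the function name `classify_disk_alert` and the `DISK_THRESHOLDS` table are assumptions made for the sketch.

```python
from typing import Optional

# Thresholds from the disk utilization example (minor@90%, major@95%,
# critical@98%). The table layout and names are hypothetical.
DISK_THRESHOLDS = [
    (98.0, "critical"),
    (95.0, "major"),
    (90.0, "minor"),
]


def classify_disk_alert(used_pct: float) -> Optional[str]:
    """Return the highest severity whose threshold is met, or None."""
    # Thresholds are ordered from highest to lowest, so the first match
    # is the most severe one that applies.
    for threshold, severity in DISK_THRESHOLDS:
        if used_pct >= threshold:
            return severity
    return None
```

Ordering the table from most to least severe keeps the escalation logic in one place, so adding or adjusting a tier means changing only the table.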

Reactive – Alerts that Occur at Failure

Multiple events can occur before a system fails; eventually an alert arrives notifying you that an application is down. It comes either from users calling the Service Desk to report an issue or from a system-generated event that corresponds to the application failure.

Proactive – Alerts that Occur Before Failure

These alerts most likely come from proactive monitoring and tell you that components have failed and need attention but have not yet affected overall application availability (e.g., failure of one power supply in a dual-supply server).
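As a rough sketch of the idea, the check below separates a degraded-but-running server from one that is already down. It assumes the power supply example refers to one unit of a redundant pair failing; the `PowerSupply` type, the function name, and the returned labels are all hypothetical and not taken from any specific tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PowerSupply:
    name: str
    healthy: bool


def check_redundant_power(supplies: List[PowerSupply]) -> str:
    """Classify the state of a server's redundant power supplies."""
    failed = [s.name for s in supplies if not s.healthy]
    if not failed:
        return "ok"
    if len(failed) < len(supplies):
        # Redundancy is lost but the server is still up: the window in
        # which a proactive alert is useful.
        return "proactive"
    # Every supply has failed, so the server is down: a reactive alert.
    return "reactive"
```

For example, `check_redundant_power([PowerSupply("psu-a", True), PowerSupply("psu-b", False)])` returns `"proactive"`, because the application is still available while one component needs attention.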

Predictive – Alerts that Trend on a Possible Failure

These alerts are usually set up in parallel with trending reports that help predict subtle changes in the environment (e.g., trending on memory usage or disk utilization before resources run out).
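One simple way to trend on a resource is to fit a straight line to recent samples and estimate when it will cross an alert threshold. The sketch below assumes one disk utilization sample per day and linear growth, which is a simplification; the function name `days_until_threshold` and the default 90% threshold are illustrative choices, not part of any specific APM product.

```python
from typing import List, Optional


def days_until_threshold(daily_used_pct: List[float],
                         threshold: float = 90.0) -> Optional[float]:
    """Estimate days from the latest sample until utilization reaches
    the threshold, using an ordinary least-squares line through the
    samples. Returns None when there is no upward trend to project."""
    n = len(daily_used_pct)
    if n < 2:
        return None
    xs = range(n)  # day index: 0 is the oldest sample
    mean_x = (n - 1) / 2
    mean_y = sum(daily_used_pct) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(xs, daily_used_pct))
    slope = sxy / sxx
    if slope <= 0:
        return None  # flat or shrinking usage: nothing to predict
    intercept = mean_y - slope * mean_x
    crossing_day = (threshold - intercept) / slope
    # Report time relative to the most recent sample; 0.0 means the
    # fitted line has already reached the threshold.
    return max(0.0, crossing_day - (n - 1))


print(days_until_threshold([70, 72, 74, 76, 78]))  # -> 6.0
```

With usage growing two points a day from 70%, the fitted line reaches 90% six days after the last sample, which leaves time to raise a predictive alert before any threshold is actually breached.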


Conclusion

Once you build awareness in the organization that you have a bird’s-eye view of the technical landscape and can monitor the ecosystem of each application, much as an ecologist would, people become more meticulous when introducing new elements into the environment. They know that you are watching, taking samples, and trending on overall health and stability, which leaves you free to focus on the strategic side of APM without distraction.

ABOUT Larry Dragich

Larry Dragich, a regular blogger and contributor on APMdigest, has 23 years of IT experience and has been in an IT leadership role at the Auto Club Group (ACG) for the past ten years. He serves as Director of Enterprise Application Services (EAS) at the Auto Club Group, with overall accountability for optimizing the capability of the IT infrastructure to deliver high availability and optimal performance. Dragich is actively involved with industry leaders, sharing knowledge of APM technologies ranging from best practices and technical workflows to resource allocation and approaches for implementing APM strategies.

You can contact Larry on LinkedIn

Related Links:

For a high-level view of a much broader technology space, refer to the slide show on BrightTALK.com, which describes “The Anatomy of APM - webcast” in more context.

For more information on the critical success factors in APM adoption and how this centers around the End-User-Experience (EUE), read The Anatomy of APM and the corresponding blog APM’s DNA – Event to Incident Flow.

Prioritizing Gartner's APM Model

APM and MoM – Symbiotic Solution Sets
