Event Management: Reactive, Proactive or Predictive?
August 01, 2012

Larry Dragich
Auto Club Group

Can event management help foster a curiosity for innovative possibilities to make application performance better? Blue-sky thinkers may not want to deal with the myriad of details on how to manage the events being generated operationally, but could learn something from this exercise.

Consider the major system failures in your organization over the last 12 to 18 months. What if you had a system or process in place to capture those failures and mitigate them proactively, preventing them from recurring? How much better off would you be if you could avoid the proverbial “Groundhog Day” with system outages? The argument that system monitoring is just a nice-to-have, and not really a core requirement for operational readiness, dissipates quickly when a critical application goes down with no warning.

Starting with the Event Management and Incident Management processes may seem like a reactive approach when implementing an Application Performance Management (APM) solution, but is it really? If “Rome is burning”, wouldn’t the most prudent action be to extinguish the fire first, then come up with a proactive approach for prevention? Managing the operational noise can calm the environment, allowing you to focus on APM strategy more effectively.

Asking the right questions during a post-mortem review will help generate dialog and outline options for alerting and prevention. This directs your thinking toward continual improvement and helps establish proactive monitoring as an operational requirement.

Here are three questions that build on each other as you work to mature your solution:

1. Did we alert on it when it went down, or did the user community call us?

2. Can we get a proactive alert on it before it goes down (e.g. a power supply failure in a server with dual power supplies)?

3. Can we trend on the event, creating a predictive alert before it is escalated (e.g. disk space utilization triggering minor@90%, major@95%, critical@98%)?

These three questions correspond, in order, to the following categories: Reactive, Proactive, and Predictive.

Reactive – Alerts that Occur at Failure

Multiple events can occur before a system failure; eventually an alert will come in notifying you that an application is down. It will come either from users calling the Service Desk to report an issue, or from a system-generated alert that corresponds to the application failure.

Proactive – Alerts that Occur Before Failure

These alerts will most likely come from proactive monitoring, telling you that there are component failures that need attention but have not yet affected overall application availability (e.g. a power supply failure in a server with dual power supplies).
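As a minimal sketch of this idea, the Python below raises a warning when a redundant component has lost some of its units but is still running. It assumes the power supply example means one supply failing in a dual-supply server. The names `RedundantComponent` and `proactive_alert` are hypothetical and do not come from any particular monitoring product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RedundantComponent:
    """Hypothetical model of a component with built-in redundancy."""
    name: str
    total_units: int    # e.g. 2 for a dual power supply
    healthy_units: int  # units currently reporting healthy


def proactive_alert(component: RedundantComponent) -> Optional[str]:
    """Return an alert message, or None when redundancy is intact."""
    if component.healthy_units <= 0:
        # Nothing left to fail over to: this is already a reactive alert.
        return f"CRITICAL: {component.name} has no healthy units"
    if component.healthy_units < component.total_units:
        # Still available, but one more failure causes an outage.
        return (
            f"WARNING: {component.name} is degraded "
            f"({component.healthy_units}/{component.total_units} healthy)"
        )
    return None


psu = RedundantComponent(name="server01 power supply", total_units=2, healthy_units=1)
print(proactive_alert(psu))
```

The point of the warning branch is that the application is still up when the alert fires, which leaves time to replace the failed unit before users notice anything.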

Predictive – Alerts that Trend on a Possible Failure

These alerts are usually set up in parallel with trending reports that help detect subtle changes in the environment (e.g. trending on memory usage or disk utilization before running out of resources).
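The disk space example can be sketched in a few lines of Python. The severity thresholds (minor at 90%, major at 95%, critical at 98%) come from the third question above. Everything else is an assumption made for illustration: the function names, the use of one utilization sample per day, and the choice of a simple least-squares line as the trend.

```python
from typing import Optional, Sequence

# Thresholds from the disk space example, checked from most to least severe.
THRESHOLDS = [(98.0, "critical"), (95.0, "major"), (90.0, "minor")]


def classify(utilization_pct: float) -> Optional[str]:
    """Map a utilization percentage to a severity, or None if below 90%."""
    for limit, severity in THRESHOLDS:
        if utilization_pct >= limit:
            return severity
    return None


def days_until(samples: Sequence[float], limit: float) -> Optional[float]:
    """Estimate days until `limit` is reached, assuming one sample per day.

    Fits a least-squares slope to the samples and projects forward from the
    most recent sample. Returns None when there is no upward trend.
    """
    n = len(samples)
    if n < 2:
        return None
    if samples[-1] >= limit:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx  # percentage points per day
    if slope <= 0:
        return None
    return (limit - samples[-1]) / slope


daily_disk_pct = [80.0, 81.0, 82.0, 83.0, 84.0]
print(classify(daily_disk_pct[-1]))      # no alert yet at 84%
print(days_until(daily_disk_pct, 90.0))  # estimated days before the minor threshold
```

A straight-line fit is only a rough estimate; it is used here because it is easy to follow, not because it is the best forecasting method for real utilization data.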


Conclusion

Once you build awareness in the organization that you have a bird’s-eye view of the technical landscape and the ability to monitor the ecosystem of each application (as an ecologist would), people become more meticulous when introducing new elements into the environment. They know that you are watching, taking samples, and trending on overall health and stability. That leaves you free to focus on the strategic side of APM without distraction.

ABOUT Larry Dragich

Larry Dragich, a regular blogger and contributor on APMdigest, has 23 years of IT experience and has been in an IT leadership role at the Auto Club Group (ACG) for the past ten years. He serves as Director of Enterprise Application Services (EAS) at the Auto Club Group, with overall accountability for optimizing the capability of the IT infrastructure to deliver high availability and optimal performance. Dragich is actively involved with industry leaders, sharing knowledge of APM technologies ranging from best practices and technical workflows to resource allocation and approaches for implementing APM strategies.

You can contact Larry on LinkedIn

Related Links:

For a high-level view of a much broader technology space, refer to the slide show on BrightTALK.com, “The Anatomy of APM - webcast”, which describes it in more context.

For more information on the critical success factors in APM adoption and how they center on the End-User-Experience (EUE), read The Anatomy of APM and the corresponding blog APM’s DNA – Event to Incident Flow.

Prioritizing Gartner's APM Model

APM and MoM – Symbiotic Solution Sets
