This blog is the corollary to The Anatomy of APM which outlines four foundational elements of a successful Application Performance Management strategy: Top Down Monitoring, Bottom Up Monitoring, Reporting, and Incident Management. Here I provide a deeper context on how the event-to-incident flow is structured.
It is the correlation of events and the amalgamation of metrics that bring value to the business by way of dashboards and trending reports, and it's the way the business interprets the accuracy of those metrics that determines the success of the implementation. If an event occurs and no one sees it, believes it, or takes action on it, APM's value can be severely diminished and you run the risk of owning “shelfware.”
Overall, as events are detected and consumed by the system, it is the automation that is the lifeblood of an APM solution, ensuring the pulse of the incident flow is a steady one. The goal is to show a conceptual view of how events flow through the environment and eventually become incidents. At a high level, the Trouble Ticket Interface (TTI) will correlate the events into alerts, and alerts into incidents which then become tickets, enabling the Operations team to begin working toward resolution.
The event flow moves from the outside in, and then from the center to the right.
Here is how it’s managed:
- The outside blue circles represent the monitoring tool sets that collect information directly from the infrastructure and the critical applications.
- The inner green (teal) circles represent the tool sets the Enterprise Systems Management (ESM) team manages, and is where most of the critical application thresholds are set.
- The dark brown circles are logical connection points depicting how the events are collected as they flow through the system – Once the events hit this connection point they go to 3 output queues.
- The red circles on the right are the Incident Output queues for each event after it has been tracked and correlated.
The transformation between event-to-incident is the critical junction where APM and ITIL come together to provide tangible value back to the business. So if you only take one thing away from this picture it would be the importance of managing the strategic intent of the output queues, because this is the key for managing action, going red to green, and trending.
I'm suggesting that it is not necessarily the number of features or technical stamina of each monitoring tool to process large volumes of data that will make an APM implementation successful - it's the choices you make in how you put them together to manage the event-to-incident flow that determines your success. Timeliness and accuracy in this area will help you gain credibility and confidence with each of your constituents and business partners you support.
You can contact Larry on LinkedIn
For a high-level view of a much broader technology space refer to slide show on BrightTALK.com which describes the “The Anatomy of APM - webcast” in more context.
For more information on the critical success factors in APM adoption and how this centers around the End-User-Experience (EUE), read The Anatomy of APM.