By embracing End-User-Experience (EUE) measurements as a key vehicle for demonstrating productivity, you build trust with your constituents in a very tangible way. The translation of IT metrics into business meaning (value) is what APM is all about.
The goal here is to simplify a complicated technology space by walking through a high-level view within each core element. I’m suggesting that the success factors in APM adoption center around the EUE and the integration touch points with the Incident Management process.
When looking at APM at 20,000 feet, four foundational elements come into view:
- Top Down Monitoring (RUM)
- Bottom Up Monitoring (Infrastructure)
- Incident Management Process (ITIL)
- Reporting (Metrics)
Top Down Monitoring
Top Down Monitoring is also referred to as Real-time Application Monitoring that focuses on the End-User-Experience. It has two has two components, Passive and Active. Passive monitoring is usually an agentless appliance which leverages network port mirroring. This low risk implementation provides one of the highest values within APM in terms of application visibility for the business.
Active monitoring, on the other hand, consists of synthetic probes and web robots which help report on system availability and predefined business transactions. This is a good complement when used with passive monitoring to help provide visibility on application health during off peak hours when transaction volume is low.
Bottom Up Monitoring
Bottom Up Monitoring is also referred to as Infrastructure Monitoring which usually ties into an operations manager tool and becomes the central collection point where event correlation happens. Minimally, at this level up/down monitoring should be in place for all nodes/servers within the environment. System automation is the key component to the timeliness and accuracy of incidents being created through the Trouble Ticket Interface.
Incident Management Process
The Incident Management Process as defined in ITIL is a foundational pillar to support Application Performance Management (APM). In our situation, Incident Management, Problem Management, and Change Management processes were already established in the culture for a year prior to us beginning to implement the APM strategies.
A look into ITIL's Continual Service Improvement (CSI) model and the benefits of Application Performance Management indicates they are both focused on improvement, with APM defining toolsets that tie together specific processes in Service Design, Service Transition, and Service Operation.
Capturing the raw data for analysis is essential for an APM strategy to be successful. It is important to arrive at a common set of metrics that you will collect and then standardize on a common view on how to present the real-time performance data.
Your best bet: Alert on the Averages and Profile with Percentiles. Use 5 minute averages for real-time performance alerting, and percentiles for overall application profiling and Service Level Management.
As you go deeper in your exploration of APM and begin sifting through the technical dogma (e.g. transaction tagging, script injection, application profiling, stitching engines, etc.) for key decision points, take a step back and ask yourself why you're doing this in the first place: To translate IT metrics into an End-User-Experience that provides value back to the business.
If you have questions on the approach and what you should focus on first with APM, see Prioritizing Gartner's APM Model for insight on some best practices from the field.
You can contact Larry on LinkedIn
For a high-level view of a much broader technology space refer to slide show on BrightTALK.com which describes “The Anatomy of APM - webcast” in more context.
The results are in from Data-Driven IT Automation: A Vision for the Modern CIO. We were overall very pleased with the data, which was consistent and sometimes even revelatory ...
The role of the CIO is evolving with more of a focus on revenue and strategy, according to the 2019 Global CIO Survey from Logicalis ...
Organizations face major infrastructure and security challenges in supporting multi-cloud and edge deployments, according to new global survey conducted by Propeller Insights for Volterra ...
Developers spend roughly 17.3 hours each week debugging, refactoring and modifying bad code — valuable time that could be spent writing more code, shipping better products and innovating. The bottom line? Nearly $300B (US) in lost developer productivity every year ...
While remote work policies have been gaining steam for the better part of the past decade across the enterprise space — driven in large part by more agile and scalable, cloud-delivered business solutions — recent events have pushed adoption into overdrive ...
Time-critical, unplanned work caused by IT disruptions continues to plague enterprises around the world, leading to lost revenue, significant employee morale problems and missed opportunities to innovate, according to the State of Unplanned Work Report 2020, conducted by Dimensional Research for PagerDuty ...
In today's iterative world, development teams care a lot more about how apps are running. There's a demand for fixing actionable items. Developers want to know exactly what's broken, what to fix right now, and what can wait. They want to know, "Do we build or fix?" This trade-off between building new features versus fixing bugs is one of the key factors behind the adoption of Application Stability management tools ...
With the rise of mobile apps and iterative development releases, Application Stability has answered the widespread need to monitor applications in a new way, shifting the focus from servers and networks to the customer experience. The emergence of Application Stability has caused some consternation for diehard APM fans. However, these two solutions embody very distinct monitoring focuses, which leads me to believe there's room for both tools, as well as different teams for both ...
The 2019 State of E-Commerce Infrastructure Report, from Webscale, analyzes findings from a comprehensive survey of more than 450 ecommerce professionals regarding how their online stores performed during the 2019 holiday season. Some key insights from the report include ...
Robinhood is a unicorn startup that has been disrupting the way by which many millennials have been investing and managing their money for the past few years. For Robinhood, the burden of proof was to show that they can provide an infrastructure that is as scalable, reliable and secure as that of major banks who have been developing their trading infrastructure for the last quarter-century. That promise fell flat last week, when the market volatility brought about a set of edge cases that brought Robinhood's trading app to its knees ...