Real-Time Monitoring Metrics - The Magical Mundane
September 12, 2012

Larry Dragich
Technology Executive


Application Performance Management (APM) has many benefits when implemented with the right support structure and sponsorship. It's the key for managing action, going red to green, and trending on performance.

As you strive to achieve new levels of sophistication when creating performance baselines, it is important to consider how you will navigate the oscillating winds of application behavior as the numbers come in from all directions. The behavioral context of the user will highlight key threshold settings to consider as you build a framework for real-time alerting into your APM solution.

This will take an understanding of the application and an analysis of the numbers as you begin looking at user patterns. Metrics play a key role in providing this value through different views across multiple comparisons. Absent the behavioral learning engines now emerging in the APM space, you can begin a high-level analysis on your own to come to a common understanding of each business application's performance.

Just as water seeks its own level, an application performance baseline will eventually emerge as you track the real-time performance metrics outlining the high and low watermarks of the application. This will include the occasional anomalous wave that comes crashing through affecting the user experience as the numbers fluctuate.


Depending on transaction volume and performance characteristics there will be a certain level of noise that you will need to squelch to a volume level that can be analyzed. When crunching the numbers and distilling patterns, it will be essential to create three baseline comparisons that you will use like a compass for navigation into what is real and what is an exception.

Real-Time vs. Yesterday

As the real-time performance metrics come in, it is important to watch application performance at five-minute intervals, at a minimum, and compare it with the same intervals from the day before to see if there are any obvious changes in performance.
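
As a rough illustration of this comparison, the sketch below groups response-time samples into five-minute intervals and reports the percent change against the same interval 24 hours earlier. The function names, the sample format (timestamp, response time in seconds), and the use of a simple mean per interval are assumptions made for the example, not features of any particular APM product.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def bucket_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its five-minute interval."""
    return ts - timedelta(
        minutes=ts.minute % 5, seconds=ts.second, microseconds=ts.microsecond
    )


def interval_averages(samples):
    """Average response time per five-minute interval.

    samples: iterable of (timestamp, response_time_seconds) pairs
    (a hypothetical format chosen for this sketch).
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[bucket_start(ts)].append(value)
    return {start: mean(values) for start, values in buckets.items()}


def change_vs_yesterday(averages, interval):
    """Percent change of an interval against the same interval one day earlier.

    Returns None when either interval has no data.
    """
    today = averages.get(interval)
    yesterday = averages.get(interval - timedelta(days=1))
    if today is None or not yesterday:
        return None
    return (today - yesterday) / yesterday * 100.0


samples = [
    (datetime(2012, 9, 11, 10, 1), 1.0),   # yesterday, 10:00 interval
    (datetime(2012, 9, 11, 10, 3), 1.0),
    (datetime(2012, 9, 12, 10, 2), 1.5),   # today, 10:00 interval
    (datetime(2012, 9, 12, 10, 4), 1.5),
]
averages = interval_averages(samples)
print(change_vs_yesterday(averages, datetime(2012, 9, 12, 10, 0)))
```

Swapping `timedelta(days=1)` for `timedelta(days=7)` would give the same comparison against the previous week.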

Real-Time vs. 7 Days Ago

Comparing Monday to Sunday may not be relevant if your core business hours are M-F; using the real-time view and comparing it to the same day as the previous week will be more useful - especially if a new release of the application was rolled out over the weekend and you want to know how it compares with the previous week.

Real-Time vs. 10 Day Rolling Average

Using a 10-, 15- or 30-day rolling average is helpful in reviewing overall application performance with the business, because everyone can easily understand averages and what they mean when compared against a real-time view.
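
One possible way to compute such a baseline is sketched below: it averages the daily figures for the days preceding the current one and expresses the current value as a percent deviation from that average. The daily-values dictionary, the function names, and the choice to skip days with no data are assumptions for illustration only.

```python
from datetime import date, timedelta
from statistics import mean


def rolling_average(daily_values, day, window=10):
    """Mean of the values for the `window` days before `day`.

    daily_values: dict mapping a date to that day's average response time
    (a hypothetical structure for this sketch). Days without data are skipped.
    """
    history = [
        daily_values[day - timedelta(days=n)]
        for n in range(1, window + 1)
        if (day - timedelta(days=n)) in daily_values
    ]
    return mean(history) if history else None


def deviation_from_baseline(current, baseline):
    """Percent difference between the current value and the baseline."""
    return (current - baseline) / baseline * 100.0


# Ten days of history (September 2 through September 11), all at 2.0 seconds.
daily_values = {date(2012, 9, d): 2.0 for d in range(2, 12)}
baseline = rolling_average(daily_values, date(2012, 9, 12), window=10)
print(baseline, deviation_from_baseline(2.5, baseline))
```

Changing `window` to 15 or 30 gives the longer rolling averages mentioned above.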

Capturing real-time performance metrics in five-minute intervals is a good place to start. Once you get a better understanding of the application behavior, you may increase or decrease the interval as needed. For real-time performance alerting, using the averages will give you a good picture when something is out of pattern, while reporting on Service Level Management with percentiles (90%, 95%, etc.) will help create an accurate view for the business. To make it simple to remember: alert on the averages and profile with percentiles.
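
The rule of thumb can be sketched in a few lines. The example below flags an interval as out of pattern when its average exceeds a baseline average by more than a tolerance, and separately computes percentiles for reporting. The 25% tolerance, the nearest-rank percentile method, and the function names are assumptions chosen for the sketch; APM tools may calculate both differently.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def out_of_pattern(interval_values, baseline_avg, tolerance=0.25):
    """True when the interval average exceeds the baseline average
    by more than `tolerance` (a fraction; 0.25 is an assumed default)."""
    return mean(interval_values) > baseline_avg * (1 + tolerance)


# Response times in seconds for one interval, including one slow outlier.
response_times = [0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 0.9, 1.1, 1.0, 6.0]

print("alert:", out_of_pattern(response_times, baseline_avg=1.0))
print("90th percentile:", percentile(response_times, 90))
print("95th percentile:", percentile(response_times, 95))
```

In this sample the single slow transaction pulls the average up enough to trigger the alert, while the 90th percentile still shows what most users experienced.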

Conclusion

Operationally there are things you may not want to think about all of the time (e.g. standard deviations, averages, percentiles, etc.), but you have to think about them long enough to create the most accurate picture possible as you begin to distill performance patterns with each business application. This can be accomplished by building meaningful performance baselines that will help feed your Service Level Management processes well into the future.

You can contact Larry on LinkedIn.

Related Links:

For more information on the critical success factors in APM adoption and how this centers around the End-User-Experience (EUE), read The Anatomy of APM and the corresponding blog APM’s DNA – Event to Incident Flow.

Prioritizing Gartner's APM Model

Event Management: Reactive, Proactive, or Predictive?

APM and MoM – Symbiotic Solution Sets

Larry Dragich is a Technology Executive and Founder of the APM Strategies Group on LinkedIn