Application Performance Management (APM) has many benefits when implemented with the right support structure and sponsorship. It's the key for managing action, going red to green, and trending on performance.
As you strive to achieve new levels of sophistication when creating performance baselines, it is important to consider how you will navigate the oscillating winds of application behavior as the numbers come in from all directions. The behavioral context of the user will highlight key threshold settings to consider as you build a framework for real-time alerting into your APM solution.
This will take an understanding of the application and an analysis of the numbers as you begin looking at user patterns. Metrics play a key role in providing this value through different views across multiple comparisons. Even without the behavioral learning engines that are now emerging in the APM space, you can begin a high-level analysis on your own to come to a common understanding of each business application's performance.
Just as water seeks its own level, an application performance baseline will eventually emerge as you track the real-time performance metrics outlining the high and low watermarks of the application. This will include the occasional anomalous wave that comes crashing through affecting the user experience as the numbers fluctuate.
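As a rough illustration of tracking high and low watermarks, here is a minimal Python sketch. The transaction names, the sample values, and the `watermarks` helper are all hypothetical; a real APM tool would supply these measurements from its own collectors.

```python
def watermarks(samples):
    """Return {transaction: (low, high)} from (transaction, response_ms) pairs."""
    marks = {}
    for name, value in samples:
        # Start from the first value seen, then widen the range as needed.
        low, high = marks.get(name, (value, value))
        marks[name] = (min(low, value), max(high, value))
    return marks


# Hypothetical response times in milliseconds.
samples = [
    ("login", 120), ("login", 340), ("login", 95),
    ("search", 800), ("search", 650),
]
marks = watermarks(samples)
for name, (low, high) in marks.items():
    print(f"{name}: low={low} ms, high={high} ms")
```

Over time, the gap between the low and high marks gives a first impression of how much the application normally fluctuates.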
Depending on transaction volume and performance characteristics there will be a certain level of noise that you will need to squelch to a volume level that can be analyzed. When crunching the numbers and distilling patterns, it will be essential to create three baseline comparisons that you will use like a compass for navigation into what is real and what is an exception.
Real-Time vs. Yesterday
As the real-time performance metrics come in, it is important to watch application performance at five-minute intervals (or finer) and compare it with the same intervals from the day before to see if there are any obvious changes in performance.
Real-Time vs. 7 days Ago
Comparing Monday to Sunday may not be relevant if your core business hours are M-F; using the real-time view and comparing it to the same day as the previous week will be more useful - especially if a new release of the application was rolled out over the weekend and you want to know how it compares with the previous week.
Real-Time vs. 10 Day Rolling Average
Using a 10, 15 or 30 day rolling average is helpful in reviewing overall application performance with the business, because everyone can easily understand averages and what they mean when compared against a real-time view.
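The three comparisons above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `history` is a hypothetical list of days, each holding the mean response time (in ms) per five-minute slot, and `baseline_comparison` is an invented helper name, not part of any APM product.

```python
from statistics import mean


def baseline_comparison(history, day, slot, window=10):
    """Compare one five-minute slot against the three baselines.

    history[d][slot] is assumed to hold the mean response time (ms)
    for that slot on day d; `window` is the rolling-average length in days.
    """
    current = history[day][slot]
    baselines = {
        "yesterday": history[day - 1][slot],
        "week_ago": history[day - 7][slot],
        # Average of the same slot over the previous `window` days.
        "rolling_avg": mean(history[d][slot] for d in range(day - window, day)),
    }
    # Positive delta means the current value is slower than the baseline.
    deltas = {name: current - base for name, base in baselines.items()}
    return current, baselines, deltas


# Hypothetical data: 11 days, one slot per day, creeping up 1 ms a day.
history = [[100.0 + d] for d in range(11)]
current, baselines, deltas = baseline_comparison(history, day=10, slot=0)
print(current, baselines, deltas)
```

Changing `window` to 15 or 30 gives the longer rolling averages mentioned above, provided that much history is available.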
Capturing real-time performance metrics in five-minute intervals is a good place to start. Once you get a better understanding of the application behavior, you may increase or decrease the interval as needed. For real-time performance alerting, using the averages will give you a good picture of when something is out of pattern. For reporting on Service Level Management, using percentiles (90%, 95%, etc.) will help create an accurate view for the business. To make it simple to remember: alert on the averages and profile with percentiles.
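To make "alert on the averages and profile with percentiles" concrete, here is a small Python sketch. The 25% tolerance, the sample data, and the helper names `is_out_of_pattern` and `percentile` are assumptions chosen for illustration; the percentile uses the simple nearest-rank method, which may differ slightly from what a given APM tool reports.

```python
import math
from statistics import mean


def is_out_of_pattern(current_avg, baseline_avg, tolerance=0.25):
    """Flag an interval whose average exceeds the baseline by more than `tolerance`."""
    return current_avg > baseline_avg * (1 + tolerance)


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical response times (ms) for one five-minute interval:
# mostly fast, with two slow outliers.
samples = [100] * 18 + [900, 1000]

interval_avg = mean(samples)
print("average:", interval_avg)
print("alert:", is_out_of_pattern(interval_avg, baseline_avg=110.0))
print("p90:", percentile(samples, 90))
print("p95:", percentile(samples, 95))
```

In this made-up interval the two outliers pull the average well above the baseline, which is what makes it useful for alerting, while the 90th percentile shows that most users still saw the normal response time.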
Conclusion
Operationally there are things you may not want to think about all of the time (e.g. standard deviations, averages, percentiles, etc.), but you have to think about them long enough to create the most accurate picture possible as you begin to distill performance patterns with each business application. This can be accomplished by building meaningful performance baselines that will help feed your Service Level Management processes well into the future.
You can contact Larry on LinkedIn.
Related Links:
For more information on the critical success factors in APM adoption and how this centers around the End-User-Experience (EUE), read The Anatomy of APM and the corresponding blog APM’s DNA – Event to Incident Flow.
Prioritizing Gartner's APM Model