Chasing a Moving Target: APM in the Cloud - Part 2

Detection, Analysis and Action

February 21, 2013

Albert Mavashev

In my last blog post, I discussed strategies for dealing with the complexities of monitoring performance in the various stacks that make up a cloud implementation. Here, we will look at ways to detect trends, analyze your data, and act on it.

The first requirement for detecting trends in application performance in the cloud is to have good information delivered in a timely manner about each stack as well as the application.

We acquire this information via data collectors that harvest all relevant indicators, such as response time, GC activity, memory usage and CPU usage, within the same clock tick. Doing this within the same clock tick is called serialization. It is of little use to know I have a failed transaction at time X, but only have CPU and memory data from X minus 10 minutes.
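As a rough illustration of collecting everything in one tick, the sketch below reads a set of probes in a single pass and stamps the results with one shared timestamp. The `Snapshot` type, the `collect_snapshot` function and the probe names are hypothetical; real probes would query the JVM, the operating system or the application itself.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Snapshot:
    timestamp: float          # one timestamp shared by every metric in the sample
    values: Dict[str, float]  # metric name -> value read during this tick


def collect_snapshot(probes: Dict[str, Callable[[], float]],
                     clock: Callable[[], float] = time.time) -> Snapshot:
    """Read every probe in one pass and stamp the results with a single time."""
    tick = clock()
    return Snapshot(timestamp=tick,
                    values={name: probe() for name, probe in probes.items()})


# Hypothetical probes returning fixed values for the sake of the example.
probes = {
    "response_time_ms": lambda: 120.0,
    "gc_pause_ms": lambda: 15.0,
    "memory_used_mb": lambda: 512.0,
    "cpu_percent": lambda: 37.5,
}
snapshot = collect_snapshot(probes, clock=lambda: 1000.0)
```

Because every value in `snapshot` carries the same timestamp, a failed transaction at time X can be lined up directly against the CPU and memory readings from time X.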

Next, we require a history for each metric. This can be maintained in memory for near real-time analysis, but we also need to use slower storage for longer-term views.
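One way to picture this two-tier history is a bounded in-memory buffer that hands its oldest samples to a slower store. The sketch below is only an assumption about how that might look: `MetricHistory` is a hypothetical class, and the `archive` list stands in for a disk file or database.

```python
from collections import deque
from typing import Deque, List, Tuple

Sample = Tuple[float, float]  # (timestamp, value)


class MetricHistory:
    """Keeps the newest samples in memory and hands older ones to an archive."""

    def __init__(self, max_recent: int) -> None:
        self.max_recent = max_recent
        self.recent: Deque[Sample] = deque()
        self.archive: List[Sample] = []  # stand-in for disk or a database

    def record(self, timestamp: float, value: float) -> None:
        self.recent.append((timestamp, value))
        # Move the oldest in-memory samples to the slower store.
        while len(self.recent) > self.max_recent:
            self.archive.append(self.recent.popleft())


history = MetricHistory(max_recent=3)
for t, v in enumerate([10.0, 11.0, 12.0, 13.0, 14.0]):
    history.record(float(t), v)
```

After the loop, `history.recent` holds the three newest samples for fast analysis and `history.archive` holds the two older ones. A real system might instead write every sample to the archive as it arrives; evicting on overflow is just the simplest variant to show.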

Finally, we apply pattern matching to the data. We might scan and match metrics such as “find all applications whose GC is above High Bollinger Band for 2+ samples.” Doing this in memory can enable very fast detection across a large number of indicators.
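The query quoted above could be sketched in memory roughly as follows. The function names, the window sizes and the sample data are assumptions made for illustration. The high band here is computed from the samples that precede the suspect run, so that a spike does not inflate its own baseline.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def high_band(baseline: Sequence[float], k: float = 2.0) -> float:
    """Mean of the baseline plus k standard deviations."""
    return mean(baseline) + k * pstdev(baseline)


def breaches_high_band(samples: Sequence[float], window: int,
                       run: int = 2, k: float = 2.0) -> bool:
    """True if the last `run` samples all exceed the band built
    from the `window` samples that come just before them."""
    if len(samples) < window + run:
        return False
    baseline = samples[-(window + run):-run]
    band = high_band(baseline, k)
    return all(value > band for value in samples[-run:])


def find_anomalous_apps(gc_by_app: Dict[str, List[float]],
                        window: int = 5, run: int = 2) -> List[str]:
    """Names of applications whose GC metric breached the high band."""
    return sorted(app for app, samples in gc_by_app.items()
                  if breaches_high_band(samples, window, run))


# Hypothetical GC activity per application, oldest sample first.
gc_by_app = {
    "orders": [10, 11, 10, 12, 11, 30, 32],   # two high samples in a row
    "billing": [10, 11, 10, 12, 11, 12, 11],  # stays near its baseline
    "search": [10, 11, 10, 12, 11, 30, 11],   # a single spike only
}
anomalous = find_anomalous_apps(gc_by_app)
```

With this data only `orders` is flagged, since `search` has one high sample rather than two in a row.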

Here are three steps you can use to detect performance trends:

1. Measure the relevant application performance indicators on the business side, such as orders filled, failed or missed. Then measure those on the IT side, such as JVM GC activity, memory and I/O rates.

2. Create a baseline for each relevant indicator. This could be a 1- to 60-second sampling for near real-time monitoring. In addition, set up 1-, 10- and 15-minute samples, or even daily, weekly or monthly samples, for longer-duration trends. You need both.

3. Apply analytics to determine trends and behavior.
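For step 2, one simple way to derive coarser baselines from fine-grained samples is to average them into fixed-width time buckets. The sketch below assumes samples arrive as (timestamp in seconds, value) pairs; `bucket_means` and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def bucket_means(samples: Sequence[Tuple[float, float]],
                 width_s: float) -> Dict[float, float]:
    """Average (timestamp, value) samples into buckets of `width_s`
    seconds, keyed by the start time of each bucket."""
    buckets: Dict[float, List[float]] = defaultdict(list)
    for timestamp, value in samples:
        start = (timestamp // width_s) * width_s
        buckets[start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical response times (ms) sampled every 30 seconds for 3 minutes.
raw = [(0.0, 100.0), (30.0, 110.0), (60.0, 120.0),
       (90.0, 140.0), (120.0, 90.0), (150.0, 110.0)]
one_minute = bucket_means(raw, 60.0)     # near real-time view
three_minute = bucket_means(raw, 180.0)  # longer view of the same data
```

The same raw stream feeds both views, which is one way to keep the short and long baselines consistent with each other.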

Keeping it Simple

Applying analytics can be easier than you expect. In fact, the simpler you keep it, the better.

The following three simple analytical techniques can be used to detect anomalies:

1. Bollinger Bands – A low band and a high band placed 2 standard deviations below and above the mean. Values that stay within the bands are considered normal.

2. Percent of Change – This means comparing sample to sample, day to day or week to week, and calculating the percentage of change.

3. Velocity – Essentially this measures how fast indicators are changing. For example, you might be measuring response time and it rises from 10 to 20 seconds over a five-second interval, giving (20-10)/5 = 2 units/sec. With this technique, we expect a certain amount of change; however, when the rate of change itself becomes abnormal, we have most likely detected an anomaly.
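The three techniques are small enough to sketch in a few lines each. The function names below are illustrative, and the population standard deviation over a fixed list of samples is an assumption; a production system would more likely use a rolling window.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def bollinger_bands(samples: Sequence[float], k: float = 2.0) -> Tuple[float, float]:
    """Return (low, high) bands k standard deviations around the mean."""
    center = mean(samples)
    spread = k * pstdev(samples)
    return center - spread, center + spread


def percent_change(previous: float, current: float) -> float:
    """Percentage change from the previous sample to the current one."""
    if previous == 0:
        raise ValueError("percent change is undefined when the previous value is 0")
    return (current - previous) / abs(previous) * 100.0


def velocity(previous: float, current: float, interval_s: float) -> float:
    """Rate of change in units per second over the interval."""
    return (current - previous) / interval_s


low, high = bollinger_bands([8.0, 10.0, 12.0])
change = percent_change(10.0, 20.0)  # 100.0: the value doubled
rate = velocity(10.0, 20.0, 5.0)     # 2.0 units/sec, as in the example above
```

A sample that falls outside `(low, high)`, a percent change far from its usual range, or a velocity well above what is normally seen would each be a candidate anomaly.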

Now That You Know ... Act On It

After the analysis, the next activity is to take action. This could be alerts, notifications or system actions such as restarting processes or even resubmitting orders. Here, we are connecting the dots between IT and the business and alerting the appropriate owners.
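One possible shape for this step is a routing table that maps each anomalous indicator to the actions that should follow. Everything in the sketch below is hypothetical: the action functions only return a description of what they would do, and the indicator names are examples.

```python
from typing import Callable, Dict, List

Action = Callable[[str], str]


def notify_owner(indicator: str) -> str:
    # Placeholder: a real action would send an alert to the owner.
    return f"notify owner of {indicator}"


def restart_process(indicator: str) -> str:
    # Placeholder: a real action would restart the affected process.
    return f"restart process behind {indicator}"


# Hypothetical routing table: which actions run for which indicator.
ACTIONS: Dict[str, List[Action]] = {
    "gc_pause_ms": [notify_owner, restart_process],
    "orders_failed": [notify_owner],
}


def act_on(indicator: str) -> List[str]:
    """Run every action registered for an anomalous indicator, in order.
    Indicators with no entry fall back to notifying the owner."""
    return [action(indicator) for action in ACTIONS.get(indicator, [notify_owner])]


results = act_on("gc_pause_ms")
```

Keeping IT indicators and business indicators in the same table is one way to route each anomaly to the appropriate owner.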

And In Conclusion

Elastic cloud-based applications can’t be monitored effectively using static models, as these models assume constancy. And the one thing constant about these applications is their volatility. In these environments, what was abnormal yesterday may well be normal today. As a result, what static models indicate may be wrong.

However, a methodology that gathers both business and IT metrics, creates automated baselines and applies analytics to them in real time can produce effective results and predict behavior.

Albert Mavashev is Chief Technology Officer at Nastel Technologies.
