
Q&A Part One: Aberdeen Talks About APM

Pete Goldin
APMdigest

In APMdigest's exclusive interview, Jim Rapoza, Aberdeen Senior Research Analyst on IT Infrastructure, talks about the need for end-to-end APM, and the organizational changes required to make it happen.

APM: What exactly is “end-to-end APM” that you talk about in your September brief?

JR: Traditional performance management tends to be done in a siloed, walled approach. The application teams have their own tools, to ensure good performance and good quality in their apps. Then they throw it over a wall, and the network teams make sure the network has proper bandwidth and the servers are set up correctly, but they have no clue what the application team did. Just as the application team has no clue what the network team did. And once the app is up and running, a lot of the day-to-day analysis and management of the application ends up with the business stakeholders who are responsible for the app.

This traditional approach is also carried through the tools. Everybody uses different tool sets. They don't talk to each other. Nobody really knows what is happening. When a problem happens, the network team's dashboards may be fine because there is a problem somewhere else. For the application team, everything looks fine in the code and testing. But there could be a user experience issue - everyone's dashboards could be looking fine, but users are finding the app unusable.

So from the end-to-end perspective, you are trying to make sure you have visibility and control and management all the way to the end-user. Then knowing what is happening in your own internal network; in your data center and servers, whether they are cloud or private or hybrid; understanding the specific applications, and services that are tied to the app, and being able to see whether there are issues there. And even being able to see back into the back end of the datacenter, understanding the databases and storage, because problems can happen anywhere. It's not just in the network or the code, it can happen anywhere in the entire application ecosystem.

So the end-to-end approach is: One, making sure you can see and understand and have visibility and control over everything that touches application performance.

Two, it is also about making sure that all the different teams and stakeholders aren't just working together but actually understand what the other groups are saying and understand the data that is coming from those other groups. If the application team sends something over to the network team, and vice versa, it might as well be in a foreign language. So you need to have an end-to-end system that can tie everyone together.

APM: In terms of the organization, what do you see as the solution? Do they create an APM group that everybody belongs to?

JR: You could do that. I have actually seen those. The problem is that every organization is different. Some organizations have large teams; some organizations have one person doing applications, one person doing the network, and one person doing business management; and some organizations have one person doing all three. So I think it is more about having the translation layers.

Everybody wants to work together. I think those old divisions are kind of going away, because things happen too fast now. In the world of cloud and agile development, you can't take weeks or months to address a problem, to upgrade, to take care of things. Everything must happen on a very quick schedule, so the solution can be any way you make all of those groups work together and understand each other. It can take different forms.

Some people would argue it needs to be a big unified platform that can see everything, and provide user configurable dashboards that take all the same data and put it in terms that each team can understand. That is definitely one type of solution.

Others would argue that you need to have better integration between the different systems, and they need to be able to talk the same language.

Obviously you can have an APM team, but that is an old approach. One of the problems is that the different teams don't talk the same language. Network teams are used to buying network tools - optimization, acceleration, cache. Application teams are used to buying testing and APM tools, and they don't ever think outside those boundaries. So you really need a translation layer. One way I put it in one of my blogs is that you need something like the translator bank at the UN. Similarly you need a translation layer so when monitoring and performance information is coming from the network or the back end, it can be put in terms that anyone who is monitoring application performance can understand.

APM: It sounds like the dashboard is key – a dashboard that would speak to all the different stakeholders?

JR: Definitely having a more powerful dashboard is important, but so is having a more configurable and more extensible dashboard. You need to have the flexibility to get the information to where it needs to be.

APM: We talked about internal silos in IT. What about the gap between IT and business? Do you see that still persisting?

JR: I think it is changing. Previously, business did not understand the technology. That gap between IT and business is now gone, because users are more sophisticated. The problem now is it has almost gone the other way. Users expect the level of functionality and application sophistication that they get in the other parts of their life.

So the disconnect now is that users have higher expectations for business applications. They use Gmail and Facebook, so they know what a good interface looks like, and if you are not providing them with that, if you are not providing the same experience, they are going to be unhappy.

What happens sometimes is that they work around your system, which leads to all kinds of problems, introducing compliance or security issues. And if people do work around your system, that means the time you've invested in developing that application is wasted.

APM: What do you see as the best way to bridge that gap?

JR: I think IT needs to focus on the application and on providing an experience comparable to consumer applications. That is part of your application design. You have to make sure that you are building the app from a high usability perspective, and a high reliability and performance perspective as well. The tolerance for apps that don't open fast is really low. The tolerance for multiple clicks is really low. The answer is better design.

Read Part Two of the interview with Aberdeen's Jim Rapoza

Related Links:

Aberdeen Conducts 2013 Performance Management Survey

Aberdeen Report: The Need for End-to-End Application Performance Management and Monitoring

Jim Rapoza's Blog

