
Q&A Part One: Insight from Forrester on APM

Pete Goldin
APMdigest

In APMdigest's exclusive interview, Jean-Pierre Garbani, Vice President, Principal Analyst at Forrester, offers guidance on selecting and managing APM solutions.

APM: For companies just starting out, what should be the first step on the road to APM?

JPG: Most of the time, the incident and problem management process starts with alerting, and then identifying where the problem might be. Traditional system management solutions are bringing you a lot of information about what is happening in your infrastructure. But there is no way to tell which application or which transaction a problem is related to. If you have a couple thousand servers, switches, databases and all that infrastructure, and someone tells you that a transaction doesn't work, and you do not know where to look to find the problem, you are not going to succeed.

At the end of the day, the end user sees a transaction, a specific transaction they are trying to perform. You need to find the relationship between the event in the infrastructure and the specific transaction that the end user wanted to perform. We recommend monitoring the end-user experience to see whether the response time is normal. If it is abnormal, then something is going on, and you can base an alert on that.
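
As an illustration of alerting on abnormal end-user response time, here is a minimal Python sketch. It assumes "normal" means within a few standard deviations of recent samples for the same transaction. The function name `is_abnormal`, the threshold of 3.0, and the sample values are hypothetical and do not come from the interview or from any particular APM product.

```python
from statistics import mean, stdev

def is_abnormal(baseline_ms, latest_ms, threshold=3.0):
    """Return True when latest_ms sits more than `threshold` standard
    deviations above the mean of the baseline samples."""
    if len(baseline_ms) < 2:
        return False  # too little history to define "normal"
    mu = mean(baseline_ms)
    sigma = stdev(baseline_ms)
    if sigma == 0:
        return latest_ms > mu  # flat baseline: any increase is unusual
    return (latest_ms - mu) / sigma > threshold

# Hypothetical response times, in milliseconds, for one transaction type.
baseline = [200, 210, 190, 205, 195]
if is_abnormal(baseline, 400):
    print("ALERT: response time is abnormal for this transaction")
```

A real deployment would probably use a rolling window and separate baselines per transaction and time of day; the point here is only that the alert is driven by what the end user experiences rather than by an infrastructure event.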

Then you need to understand where to look, to identify where the problem might be. A mechanism for transaction tracing would reduce the number of elements you are looking at. If you look at only these elements, you have a good chance to find out which element is creating the problem. Once you have identified the elements, you can go deep down into that specific component and understand where the problem is, and correct it.
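
The narrowing step described above can be sketched in Python as follows. This assumes a trace is available as a list of (component, duration in milliseconds) pairs for one transaction. The function name `suspects`, the component names, and the timings are hypothetical, and ranking by total time spent is only one possible heuristic.

```python
def suspects(trace, inventory):
    """Keep only the inventory components that the traced transaction
    touched, ordered by total time spent in each, slowest first."""
    totals = {}
    for component, duration_ms in trace:
        totals[component] = totals.get(component, 0) + duration_ms
    touched = [c for c in inventory if c in totals]
    return sorted(touched, key=lambda c: totals[c], reverse=True)

# Hypothetical inventory (in practice, thousands of elements).
inventory = ["web-01", "app-07", "db-03", "cache-02", "switch-11"]
# Hypothetical trace of one slow transaction.
trace = [("web-01", 40), ("app-07", 120), ("db-03", 900), ("app-07", 30)]

print(suspects(trace, inventory))
```

Only the components on the transaction's path remain, so the deep dive can start with the slowest of them instead of the whole infrastructure.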

APM: In your view, are the alerting and identification and deep dive capabilities in the same solution, or are they different tools?

JPG: Some companies monitor everything, and some provide a component that can be used to build your APM solution. If your most critical problem is networks, you can start by buying a network monitoring product. And then next you realize maybe it's the code, and you go to AppDynamics, for example. And then you realize now you have all this information but you need to analyze it, so you buy Netuitive, and so on.

I do not want to tell people to buy a certain software because it does everything. It could be a complete solution from one vendor, it could be a partial solution from a vendor, it could be a buildout, it could be anything you want. It is just a matter of how you strategize. If you start building an infrastructure today and you need to monitor it, then maybe it is better to buy a complete solution like Quest, rather than build up your own solution. But if you already have 10 years of accumulated monitoring solutions, maybe it is better to start adding to your existing management system, rather than ripping and replacing.

APM: So most enterprises can leverage their current, or even legacy, monitoring and management technologies to build an APM system?

JPG: Of course. The goal is to build up, not to tear down what you have built so far. A good APM solution can use whatever exists as a foundation.

APM: In your Market Overview: Application Performance Management, Q4 2011 you said the IT organization often has many, and sometimes too many, monitoring solutions in place. What is the cause?

JPG: One thing that always worries me is that we often look at the IT organization as something that is stable. IT is definitely very unstable – or very “dynamic” if you wish. What you do today is temporary. Next year or two years down the road things will change.

I see an IT organization moving along a “complexity curve” which is exponential. You start building an infrastructure, and running a few applications, and then you move to more complex applications and more infrastructure. It starts to become too complicated for human beings to understand what is going on, so you look for something to help. But in the meantime, the complexity continues to grow and what you selected is no longer the right tool.

IT people are very technical, and they react to problems. They have to build infrastructure, and the infrastructure starts to grow, and they need something to tell them how this infrastructure is performing. So they buy a network monitoring solution. And then applications start to have performance issues, and they react by buying software that monitors the application. The person in charge of the network was looking at availability, now they want to look at performance, so they buy another product. All these people are buying products reactively, without any strategy in mind.

I recommend the creation of a strategy plan. I tell people to stop buying reactively. Start thinking about what you do best, what you do badly, what type of evolution path you are on, what is coming down the pike in the next two or three years, and how you can prepare for that. Are the products you are using today the right products for tomorrow?

APM: Would you say that the people buying these point solutions, these silo tools, are not seeing the big picture?

JPG: They see the big picture. But they are in the moment, and reacting to it the best they can. The reason why they are on that path of buying tactically is because they do not have time. They are constantly pushed by the acceleration of IT. They are left only with the possibility to react, not the possibility to plan. They understand the necessity to plan, they just don't have time to do it.

APM: What is the solution?

JPG: They have to prioritize. They have to look at what is coming tomorrow, and build a scenario for tomorrow. We are trying to say to IT managers: we know you do not feel you have the time, but you have to find the time. Somehow you have to stop being in a hurry, because when you are in a hurry, you are making bad decisions.

I am a fan of Formula 1 racing. Jackie Stewart, a great Formula 1 driver, said that as long as things appear to be in slow motion, you are in control. What happens in IT is that things never appear to be in slow motion. People are always under the impression that everything must be done tomorrow, which is not true. You can take the time. And actually by taking the time, you are going to save time in the end. There is a mindset here that has to be changed.

APM: Do you see the separation of roles as another APM challenge? It seems like corporate nature and the nature of IT are opposed to each other. The corporation wants to separate roles, but IT does not work that way.

JPG: Exactly. Interestingly, we are structuring IT like a bureaucracy. And I am using that in the proper sense of the word, in terms of dividing tasks and creating a hierarchical system to manage people and their different tasks towards a common goal. That type of structure supposes stability, and IT is anything but stable.

I just talked to a customer that had to rebuild their IT organization. They outsourced everything for 12 years, and all of a sudden they decided to bring it back inside and rebuild the IT organization. It is interesting to see what issues they had. First, they had to rebuild their infrastructure. They hired engineers to build the network, data centers, etc. Now they are out of the building phase, and the roles are changing. The engineers are thrown into managing and it is a challenge for them. Companies need to stop separating infrastructure development from operations.

APM: It is one thing to talk about this conceptually, but would you say that people who get their minds around that concept, the CIOs that really understand that and break down the barriers, are the ones who will succeed?

JPG: Yes. I think a good CIO is someone who recognizes that IT has been changing very quickly these past few years. The CIO who succeeds is the one who tries to think ahead and prepare the organization to transform itself.

IT today does not look like the IT that I started with 44 years ago. And that transformation was relatively simple to manage until about 15 years ago. Even at the beginning of the 2000s it was still manageable, because we were still talking about hundreds of servers. Now we are talking about tens of thousands of servers, and in some enterprises like Google or Amazon we are talking about millions of servers. That is mind-boggling. How do you prepare for that? That is really the key to good management.

APM: In your recent TechRadar For I&O Professionals: IT Service Management Processes, Q1 2012 you referenced a disconnect between what the business considers critical and what IT considers critical. Are we making any progress?

JPG: We are making some progress, but it is a culture war. I have been in IT all my life, and I think that many of my IT colleagues believe we understand the technology while others do not. On the other hand, the business has a tendency to treat IT like janitors. We are not considered intelligent enough to understand their business, which is funny because we understand technology concepts that are very complex.

So there is a culture war between the two. It is very difficult. We need to reach a point where IT starts speaking the business language, and the business side starts to be more respectful of IT. The problem will resolve itself over time.

Click here to read Part Two of APMdigest's interview with Forrester's JP Garbani

