Q&A Part One: Insight from Forrester on APM
February 28, 2012

In APMdigest's exclusive interview, Jean-Pierre Garbani, Vice President, Principal Analyst at Forrester, offers guidance on selecting and managing APM solutions.

APM: For companies just starting out, what should be the first step on the road to APM?

JPG: Most of the time, the incident and problem management process starts with alerting, and then identifying where the problem might be. Traditional system management solutions are bringing you a lot of information about what is happening in your infrastructure. But there is no way to tell which application or which transaction a problem is related to. If you have a couple thousand servers, switches, databases and all that infrastructure, and someone tells you that a transaction doesn't work, and you do not know where to look to find the problem, you are not going to succeed.

At the end of the day the end user sees a transaction, a specific transaction they are trying to perform. You need to find the relationship between the event in the infrastructure and the specific transaction that the end user wanted to perform. We recommend monitoring the end-user experience and checking whether the response time is normal. If it is abnormal, then something is going on, and you can base an alert on that.
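
As an editorial illustration (not something Garbani describes in detail), the idea of alerting on abnormal end-user response time can be sketched in a few lines of Python. The sketch assumes that "normal" means within a few standard deviations of a recent baseline. The function name `is_abnormal`, the factor `k`, and the sample timings are all hypothetical.

```python
from statistics import mean, stdev


def is_abnormal(baseline_ms, sample_ms, k=3.0):
    """Return True if sample_ms exceeds the baseline mean by more than
    k sample standard deviations (an assumed definition of 'abnormal')."""
    if len(baseline_ms) < 2:
        raise ValueError("need at least two baseline samples")
    threshold = mean(baseline_ms) + k * stdev(baseline_ms)
    return sample_ms > threshold


# Hypothetical response times (milliseconds) for one end-user transaction.
baseline = [210, 195, 220, 205, 215, 200]

print(is_abnormal(baseline, 208))  # within the usual range
print(is_abnormal(baseline, 900))  # far above the usual range
```

A real product would likely use a richer baseline (time of day, percentiles), but the principle is the same: the alert is raised on what the end user experiences, not on an individual infrastructure event.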

Then you need to understand where to look, to identify where the problem might be. A mechanism for transaction tracing would reduce the number of elements you are looking at. If you look at only these elements, you have a good chance of finding out which element is creating the problem. Once you have identified the elements, you can go deep down into that specific component and understand where the problem is, and correct it.
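
The narrowing step described above can be illustrated with a small, hedged Python sketch. It assumes that transaction tracing has already produced a map from each transaction to the components it touches. The component names, the `trace_map` structure, and the `suspects_for` helper are hypothetical and do not come from any particular APM product.

```python
def suspects_for(transaction, trace_map, events):
    """Keep only the infrastructure events raised by components that the
    given transaction actually passes through."""
    components = trace_map.get(transaction, set())
    return [e for e in events if e["component"] in components]


# Hypothetical output of a transaction-tracing mechanism.
trace_map = {
    "checkout": {"web-01", "app-03", "db-02"},
    "search": {"web-01", "app-07", "index-01"},
}

# Hypothetical events from traditional infrastructure monitoring.
events = [
    {"component": "switch-14", "message": "port flapping"},
    {"component": "db-02", "message": "slow queries"},
    {"component": "app-07", "message": "high CPU"},
]

suspects = suspects_for("checkout", trace_map, events)
for event in suspects:
    print(event["component"], "-", event["message"])
```

Out of three infrastructure events, only the one on a component in the "checkout" path remains, which is the point of tracing: fewer elements to examine before the deep dive.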

APM: In your view, are the alerting, identification and deep-dive capabilities in the same solution, or are they different tools?

JPG: Some companies monitor everything, and some provide a component that can be used to build your APM solution. If your most critical problem is networks, you can start by buying a network monitoring product. And then next you realize maybe it's the code, and you go to AppDynamics, for example. And then you realize now you have all this information but you need to analyze it, so you buy Netuitive, and so on.

I do not want to tell people to buy a certain product because it does everything. It could be a complete solution from one vendor, it could be a partial solution from a vendor, it could be a buildout, it could be anything you want. It is just a matter of how you strategize. If you start building an infrastructure today and you need to monitor it, then maybe it is better to buy a complete solution like Quest, rather than build up your own solution. But if you already have 10 years of accumulated monitoring solutions, maybe it is better to start adding to your existing management system, rather than ripping and replacing.

APM: So most enterprises can leverage their current, or even legacy, monitoring and management technologies to build an APM system?

JPG: Of course. The goal is to build up, not to tear down what you have built so far. A good APM solution can use whatever exists as a foundation.

APM: In your Market Overview: Application Performance Management, Q4 2011 you said the IT organization often has many, and sometimes too many, monitoring solutions in place. What is the cause?

JPG: One thing that always worries me is that we often look at the IT organization as something that is stable. IT is definitely very unstable – or very “dynamic” if you wish. What you do today is temporary. Next year or two years down the road things will change.

I see an IT organization moving along a “complexity curve” which is exponential. You start building an infrastructure, and running a few applications, and then you move to more complex applications and more infrastructure. It starts to become too complicated for human beings to understand what is going on, so you look for something to help. But in the meantime, the complexity continues to grow and what you selected is no longer the right tool.

IT people are very technical, and they react to problems. They have to build infrastructure, and the infrastructure starts to grow, and they need something to tell them how this infrastructure is performing. So they buy a network monitoring solution. And then applications start to have performance issues, and they react by buying software that monitors the application. The person in charge of the network was looking at availability, now they want to look at performance, so they buy another product. All these people are buying products reactively, without any strategy in mind.

I recommend the creation of a strategy plan. I tell people to stop buying reactively. Start thinking about what you do best, what you do badly, what type of evolution path you are on, what is coming down the pike in the next two or three years, and how you can prepare for that. Are the products you are using today the right products for tomorrow?

APM: Would you say that the people buying these point solutions, these silo tools, are not seeing the big picture?

JPG: They see the big picture. But they are in the moment, and reacting to it the best they can. The reason why they are on that path of buying tactically is because they do not have time. They are constantly pushed by the acceleration of IT. They are left only with the possibility to react, not the possibility to plan. They understand the necessity to plan, they just don't have time to do it.

APM: What is the solution?

JPG: They have to prioritize. They have to look at what is coming tomorrow, and build a scenario for tomorrow. We are trying to say to IT managers: we know you do not feel you have the time, but you have to find the time. Somehow you have to stop being in a hurry, because when you are in a hurry, you are making bad decisions.

I am a fan of Formula 1 racing. Jackie Stewart, a great Formula 1 driver, said that as long as things appear to be in slow motion, you are in control. What happens in IT is that things never appear to be in slow motion. People are always under the impression that everything must be done tomorrow, which is not true. You can take the time. And actually by taking the time, you are going to save time in the end. There is a mindset here that has to be changed.

APM: Do you see the separation of roles as another APM challenge? It seems like corporate nature and the nature of IT are opposed to each other. The corporation wants to separate roles, but IT does not work that way.

JPG: Exactly. Interestingly, we are structuring IT like a bureaucracy. And I am using that in the proper sense of the word, in terms of dividing tasks and creating a hierarchical system to manage people and their different tasks towards a common goal. That type of structure supposes stability, and IT is anything but stable.

I just talked to a customer that had to rebuild their IT organization. They outsourced everything for 12 years, and all of a sudden they decided to bring it back inside and rebuild the IT organization. It is interesting to see what issues they had. First, they had to rebuild their infrastructure. They hired engineers to build the network, data centers, etc. Now they are out of the building phase, and the roles are changing. The engineers are thrown into managing and it is a challenge for them. Companies need to stop separating infrastructure development from operations.

APM: It is one thing to talk about this conceptually, but would you say that people who get their minds around that concept, the CIOs that really understand that and break down the barriers, are the ones who will succeed?

JPG: Yes. I think a good CIO is someone who recognizes that IT has been changing very quickly these past few years. The CIO who succeeds is the one who tries to think ahead and prepare the organization to transform itself.

IT today does not look like the IT that I started with 44 years ago. And that transformation was relatively simple to manage until about 15 years ago. Even at the beginning of the 2000s it was still manageable, because we were still talking about hundreds of servers. Now we are talking about tens of thousands of servers, and in some enterprises like Google or Amazon we are talking millions of servers. That is mind boggling. How do you prepare for that? That is really the key to good management.

APM: In your recent TechRadar For I&O Professionals: IT Service Management Processes, Q1 2012 you referenced a disconnect between what the business considers critical and what IT considers critical. Are we making any progress?

JPG: We are making some progress, but it is a culture war. I have been in IT all my life, and I think that many of my IT colleagues believe we understand the technology while others do not. On the other hand, the business has a tendency to treat IT like janitors. We are not considered intelligent enough to understand their business, which is funny because we understand technology concepts that are very complex.

So there is a culture war between the two. It is very difficult. We need to reach a point where IT starts speaking the business language, and the business side starts to be more respectful of IT. The problem will resolve itself over time.

Click here to read Part Two of APMdigest's interview with Forrester's JP Garbani
