APM, Observability and AIOps - a Way Forward

Ron Williams
Gigaom

What's coming in operations management tooling? In a nutshell, a shift from observability to intelligent operations and the longer-term move towards AI-enabled operations in support of the business, but application performance management (APM) still has a place.

Let's break these pieces down. First, APM could be perceived as becoming passé, in tooling terms. All larger companies use it, and tools vendors pull it into their observability suites. Companies still need APM as a starting point if they are unready for the integration heavy lifting, coordination between multiple departments, and political capital that more advanced solutions require.

Many vendors recognize this, selling APM at a reasonable cost with bundled access to other features, but there's a catch. Historically, APM licensing has been based on users rather than data consumed. Now vendors are moving to data consumption models, in which the volume of logs, telemetry, and traces you consume drives your cost.

This means less predictability. If someone is temporarily consuming a lot of data, even legitimately (for example, for a new project), they'll have a blip in their billing. In addition, a user can say, "Oh, I can use this feature too," meaning they consume more data, which makes more money for vendors. APM is almost the gateway drug to observability, feature by feature.
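To make the predictability point concrete, here is a minimal sketch comparing a per-user bill with a consumption-based bill over four months. Every number in it is an assumption made up for illustration: the seat price, the per-GB price, the user count, and the monthly volumes do not come from any vendor's price list.

```python
# Hypothetical figures for illustration only; not any vendor's actual rates.
SEAT_PRICE = 100.0    # assumed cost per user per month
PRICE_PER_GB = 0.50   # assumed cost per GB of logs, telemetry, and traces


def seat_based_cost(users: int) -> float:
    """Per-user licensing: the bill depends only on head count."""
    return users * SEAT_PRICE


def consumption_cost(gb_ingested: float) -> float:
    """Consumption licensing: the bill follows the data volume."""
    return gb_ingested * PRICE_PER_GB


users = 10
# Month 3 includes a legitimate spike, for example a new project.
monthly_gb = [2000, 2100, 6000, 2050]

seat_bills = [seat_based_cost(users) for _ in monthly_gb]
usage_bills = [consumption_cost(gb) for gb in monthly_gb]

for month, (seat, usage) in enumerate(zip(seat_bills, usage_bills), start=1):
    print(f"month {month}: per-user {seat:.2f}, consumption {usage:.2f}")
```

Under these assumed numbers the per-user bill stays flat while the consumption bill triples in the spike month, which is the blip finance ends up asking about.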

Some companies make it easier for you to add another of their little tools because it's convenient. One company has 26 products — if you use one, you can access the others. Suddenly, finance goes, "Wait a minute, why do we suddenly have this big cost increase?" And you have to go back and look and realize, "Oh, George added this one, Sarah used that one, and Sam used the other one, and wow, our bill just quadrupled."

We're also seeing the rise of generative AI in Ops. Predictive AI and machine learning have long been in the mix, but this is the first year that genAI will appear in products. I expect every vendor will offer something related, but the offerings will almost universally be bad. It's not the vendors' fault: nobody yet knows what we can, or should, be doing with this capability. So vendors will include the feature, whether or not it's useful or really answers the questions businesses have.

For this reason, I'm updating one of my models. Historically, I have shown the evolution as running from monitoring to observability to awareness. This year, I'm changing it to run from monitoring to observability to intelligence. Under "intelligence" I have questions such as:

Is the business OK?

What was the result of last month's marketing campaign?

Sales has a new initiative; what will impact our services and support?

Unless you're in the business of IT, your real questions are not about IT but the business. If you fly people from point A to point B, you want to ask questions about that, not whether the revenue management system is working.

Observability didn't look to answer these questions, but now that we have more intelligence in tools, we must address them. You want to put that question to a chat interface connected to your AIOps tooling, rather than going over to revenue management and then to this group, that group, or the other group for the answers.

These tools still have the same problems with AI: choosing the right algorithm at the right time, explainable AI, and AI bias — these are not going away. Let's say I train my AI on all my data … stop there, I don't have all my data because, for example, the guys over in desktop support didn't want to give me their data, but the guys over in networking did. I've trained the models on network data, and the AI now knows networking. So, what is every problem going to be? You guessed it, a networking problem.
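The networking example can be sketched with a toy classifier. This is not how any AIOps product works; it is a deliberately simple word-overlap model, and the incident texts, labels, and function names are all invented for illustration. It shows that a model trained on one team's data can only ever answer with that team's label.

```python
from collections import Counter, defaultdict


def train(incidents):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    for text, label in incidents:
        word_counts[label].update(text.lower().split())
    return word_counts


def predict(word_counts, text):
    """Return the label whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {
        label: sum(counts[w] for w in words)
        for label, counts in word_counts.items()
    }
    return max(scores, key=scores.get)


# Hypothetical incidents: networking shared its data, desktop support did not.
network_only = [
    ("switch port flapping", "network"),
    ("high packet loss on vpn", "network"),
]
desktop = [
    ("laptop will not boot", "desktop"),
    ("outlook crashes on laptop", "desktop"),
]

biased_model = train(network_only)
fuller_model = train(network_only + desktop)

# The biased model has only one label to offer, whatever the symptom.
biased_answer = predict(biased_model, "laptop will not boot")
fuller_answer = predict(fuller_model, "laptop will not boot")
print(biased_answer, fuller_answer)
```

With only network data, a laptop that will not boot is still reported as a network problem. Once the desktop incidents are included, the same question gets a desktop answer.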

Being able to train the AI and getting beyond its biases are going to be challenging. Additionally, generative AIs can hallucinate, presenting nonsense data as fact. Trusting AI as we train it to learn our businesses and help us run more efficiently is part of the new paradigm in business operations.

That sets the scene for 2024: I expect vendors to have something, but it won't really help. It may be a little more focused in 2025, but from year three on, that's when I really believe the AI they're putting into some of these tools will be truly useful. That is, it will be able to answer questions about the condition of the enterprise, not the condition of IT.

That's the direction I see the industry taking, and I'm pushing to see how vendors will impact how the entire business operates. In three years, we should see the hype turn into real changes. For now, the nascent large language models show promise, and with planning and focus, generative AI won't be another broken promise.

Ron Williams is an Analyst at Gigaom

