
Closing the Gap Between Modern Tech and the Tools Meant to Monitor It

Sean Sebring
SolarWinds

As with most digital transformation shifts, organizations often prioritize productivity and leave security and observability to catch up. The usual result is the mass implementation of new technology alongside fragmented monitoring and observability (M&O) tooling. In the era of AI and varied cloud architecture, a fragmented observability function can be dangerous: IT teams lack a complete picture of their environment, which makes issues harder to diagnose and lengthens mean time to resolve (MTTR). In fact, according to recent data from the SolarWinds State of Monitoring & Observability Report, 77% of IT personnel said the lack of visibility across their on-prem and cloud architecture was an issue.

As today's organizations continue to lean on the latest technology to streamline workflows, they should simultaneously leverage the right AI tooling and internal coordination to develop a mature observability practice.

Why Has Monitoring and Observability Become More Complex?

Although the IT industry has long projected a collective move to the cloud, the reality is more complicated. According to the report, only 6% of organizations are entirely in the cloud, 21% are on-premises, and 73% have a combination of the two. The data also shows that organizations' M&O strategies are not aligned with their IT architecture. For example, while 17% of organizations operate a hybrid IT environment, only 10% use hybrid IT M&O strategies. Similarly, while 32% of organizations are primarily cloud-based, only 9% of organizations leverage cloud-native or cloud-inclusive strategies. Monitoring an environment with M&O tooling that was not designed for it creates blind spots.

These blind spots can have cascading consequences for today's businesses. The mismatch suggests that cloud migration and the overall configuration of modern IT environments are outpacing M&O strategies. The consequences can range from minor inconveniences, such as slower responses from important software, to large catastrophes, such as major outages that cost hundreds of millions of dollars.

How Those Complexities Affect Your IT Team

While a disconnected M&O strategy can impact your IT systems, it can also cause workflow and workload problems for your IT team. More than half (55%) of the IT professionals surveyed said they have too many monitoring and observability tools. Disparate M&O tooling can increase alert fatigue and firefighting, causing IT personnel to burn out. In addition, a lack of team coordination can create observability obstacles. About 3 in 4 respondents reported that a lack of coordination and cooperation between teams (such as network and infrastructure, or apps and database teams) contributed to an observability challenge.

In addition to affecting systems and teams, multiple disconnected M&O tools can increase the cost of a tech stack while reducing return on investment (ROI). By contrast, a unified observability approach, one that's enhanced by proper AI use and internal upskilling, can increase ROI, decrease MTTR, and improve IT team morale.

AI's Role in Managing M&O Complexities

AI can bridge the gap where current observability tools struggle to keep pace with modern IT software. With proper AI use, teams can enhance diagnostics, automatically categorize alerts, and automate system responses to help engineers. These benefits are further amplified if AI is embedded in a unified M&O platform that can display diagnostics across on-prem, cloud, and hybrid environments through a single pane of glass. This enhances visibility, decreases workload on IT personnel, and removes the need for multiple M&O tools.
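To make the idea of categorizing alerts from several tools in one place more concrete, here is a minimal Python sketch. It is not how any particular platform works: the keyword rules are a deliberately simple stand-in for a trained model, and the alert fields (`source`, `service`, `message`), tool names, and category names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical keyword rules standing in for a trained classifier.
RULES = {
    "capacity": ("disk", "memory", "cpu"),
    "latency": ("slow", "timeout", "latency"),
    "availability": ("down", "unreachable", "5xx"),
}


def categorize(message):
    """Return the first category whose keywords appear in the message."""
    text = message.lower()
    for category, keywords in RULES.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "uncategorized"


def group_alerts(alerts):
    """Collapse alerts from different tools into one bucket per (service, category)."""
    groups = defaultdict(list)
    for alert in alerts:
        key = (alert["service"], categorize(alert["message"]))
        groups[key].append(alert["source"])
    return dict(groups)


# Illustrative alerts from three hypothetical monitoring tools.
alerts = [
    {"source": "net-monitor", "service": "checkout", "message": "Request timeout on checkout API"},
    {"source": "apm", "service": "checkout", "message": "p95 latency above threshold"},
    {"source": "infra", "service": "db-01", "message": "Disk usage at 92%"},
    {"source": "infra", "service": "db-01", "message": "Heartbeat missed"},
]

for (service, category), sources in group_alerts(alerts).items():
    print(service, category, sources)
```

In this sketch, the two checkout alerts raised by different tools land in a single latency bucket, which is the kind of consolidation a unified view is meant to provide.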

In addition to diagnostics and alert prioritization, AI can also help with root cause analysis and predict system capacity or performance issues. This can dramatically decrease MTTx metrics such as mean time to detect, acknowledge, and resolve.
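For teams that want a baseline before introducing AI, the MTTx metrics mentioned above can be computed from incident timestamps. The Python sketch below is illustrative only: the `Incident` fields are hypothetical, and the definitions are assumptions (detection and resolution are measured from the start of the fault, acknowledgment from detection). Organizations define these metrics differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    started: datetime       # when the fault began
    detected: datetime      # when monitoring first flagged it
    acknowledged: datetime  # when a person picked it up
    resolved: datetime      # when service was restored


def mean_minutes(deltas):
    """Average a series of timedeltas and express the result in minutes."""
    return mean(delta.total_seconds() for delta in deltas) / 60


def mttx(incidents):
    """Return mean time to detect, acknowledge, and resolve, in minutes."""
    return {
        "mttd": mean_minutes(i.detected - i.started for i in incidents),
        "mtta": mean_minutes(i.acknowledged - i.detected for i in incidents),
        "mttr": mean_minutes(i.resolved - i.started for i in incidents),
    }


# Two made-up incidents for illustration.
t0 = datetime(2025, 1, 1, 9, 0)
incidents = [
    Incident(t0, t0 + timedelta(minutes=5), t0 + timedelta(minutes=15), t0 + timedelta(minutes=65)),
    Incident(t0, t0 + timedelta(minutes=15), t0 + timedelta(minutes=20), t0 + timedelta(minutes=35)),
]

print(mttx(incidents))  # {'mttd': 10.0, 'mtta': 7.5, 'mttr': 50.0}
```

Tracking these numbers before and after a tooling change gives a team a concrete way to check whether the change reduced them.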

Now, it's important to note that while AI presents definite advantages for M&O, AI adoption is not always a streamlined process. IT teams must get buy-in from top decision makers while also ensuring the safe and secure installation and use of AI technology. Respondents in the report cited security concerns, skills gaps, budget constraints, and regulatory or compliance limitations as barriers to AI adoption.

This is why IT teams should take three steps before using AI in their M&O workflows:

1. Establish the change management role with AI: Identify where manual processes and outdated systems are holding M&O back. Communicate these issues to leadership and define exactly how AI and automation can address these challenges.

2. Begin with AI access control measures: Implement strict access controls for AI technology before bringing AI online. This will be especially important for industries that have a high level of compliance and regulatory requirements.

3. Prioritize upskilling: Oftentimes, security issues or negative effects from AI use come from internal mistakes. In addition, C-suite executives may be pushing back against AI because they simply don't know enough about the technology. Bring in experts who can educate the entire company on the benefits of AI in monitoring and observability. Also, conduct regular training sessions to establish a culture of responsible AI use.
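As one illustration of step 2, access control for AI tooling can start from a deny-by-default allowlist. The Python sketch below is an assumption-laden example, not a recommended policy: the role names and action names are hypothetical, and a real deployment would rely on the organization's existing identity and access management system.

```python
# Hypothetical roles for AI components and the actions each may perform.
ALLOWED_ACTIONS = {
    "ai-triage": {"read_alerts", "annotate_alert"},
    "ai-remediation": {"read_alerts", "restart_service"},
}


def is_allowed(role, action):
    """Deny by default: unknown roles and unlisted actions are rejected."""
    return action in ALLOWED_ACTIONS.get(role, set())


print(is_allowed("ai-triage", "read_alerts"))      # True
print(is_allowed("ai-triage", "restart_service"))  # False
print(is_allowed("unknown-agent", "read_alerts"))  # False
```

Because nothing is permitted unless it is listed, adding a new AI capability requires a deliberate decision, and a reviewable one.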

A Gap Too Expensive to Widen

The gap between modern technology and current M&O strategies is a liability that is too costly not to address. Today's organizations are only set to move faster in the adoption of innovative technology, meaning the scale at which monitoring and observability must occur will only increase. If today's companies don't move fast, that gap will widen. But if today's IT teams unify their observability practice, responsibly leverage AI, and properly educate their workforce, they can not only catch up to modern IT solutions but also stay ahead of the curve.

Sean Sebring is Solutions Engineering Manager at SolarWinds
