Businesses Are Double-Invested in Monitoring – and Leaders Don't See It
September 29, 2022

Phil Tee
Moogsoft

Our digital economy is intolerant of downtime. But consumers haven't just come to expect always-on digital apps and services. They also expect continuous innovation, new functionality and lightning-fast response times.


Organizations have taken note, investing heavily in teams and tools that supposedly increase uptime and free resources for innovation. But leaders have not realized this "throw money at the problem" approach to monitoring is burning through resources without much improvement in availability outcomes.

The Moogsoft State of Availability Report — which helps engineering teams and leaders uncover insights about availability KPIs, teams and tools — found that businesses are double-investing in monitoring. Organizations spend too much money on too many tools, and teams spend the majority of their days monitoring their monitoring tools.

This over-investment in incident management goes largely unnoticed by management. So does the fact that monitoring cycles siphon resources from the future-driven work that delights customers and keeps engineers engaged.

Here are a few common causes of this spend-more-for-less approach:

1. Sprawling single-domain monitoring tools

In a noble attempt to keep digital apps and services available to end users at all times, business leaders buy tools that monitor their increasingly large and complex IT infrastructures. In theory, these tools should speed fixes to performance-affecting issues by continuously scanning systems and notifying engineers about anomalies.

The problem is that teams have far too many tools. On average, engineers manage 16 monitoring tools, and that number can creep up to 40 as SLAs increase. Tool sprawl on this scale is unwieldy, and the license, management and maintenance overheads are expensive. But the over-investment in monitoring doesn't stop there.

2. Days spent in monitoring cycles

IT monitoring tools should bear the brunt of monitoring itself. In principle, these tools relieve engineers from spending too much time on a fairly tedious task and enable them to deliver what customers want: bigger and better technology.

Unfortunately, teams spend far more time on monitoring than on any other task. Why? Engineers spin their wheels managing single-domain tools that are not integrated across the stack and that produce huge volumes of largely useless data. Teams facing a critical outage or incident waste valuable time investigating data from disparate tools and connecting the dots themselves.

3. Leadership-team misalignment

Business leaders do not see just how much time their teams spend on monitoring, and likely believe they're making sound monitoring investments. Leaders believe their teams spend about the same amount of their time on monitoring as they do on other daily (and often future-driven) responsibilities like automation, cloud transformation and development.

4. Stalling innovation and experimentation

With engineering teams stuck in monitoring cycles, something has to give. And unfortunately, that thing is innovation and experimentation — the very activities that delight customers and engage engineering teams. In other words, not only do organizations over-invest in monitoring, they do so to the detriment of customer experience improvements.

The solution: steps to tech stability

If you are part of an engineering team or lead one, chances are you're facing modern-day monitoring problems. Consider these best practices for breaking wasteful monitoring cycles and building your tech stability:

1. Baseline your tools. Audit your existing tools, understand their utilization and what they cost. Then, you can determine which of these assets advance availability goals and which just create more noise.

2. Consolidate your tools. Hold on to only those monitoring tools that provide value. Otherwise, try to shrink your monitoring tools' footprint to decrease total cost of ownership (TCO) and reduce noise.

3. Implement an artificial intelligence for IT Operations (AIOps) solution. Make your next monitoring investment one that makes engineers' jobs less toilsome, not more. AIOps connects cloud and on-prem monitoring tools, giving engineers a central system of engagement for all monitoring activities. The platform alerts engineers to data anomalies and their root causes and automates the entire incident lifecycle.

4. Pay down your technical debt. With time back on your side, tackle the most relevant tech debt and increase system stability. Free even more time by automating away toil and continue to increase availability with chaos engineering.

5. Invest in the future. With time and money saved, refocus your investments on company-differentiating initiatives.
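
As a rough illustration of the first two steps, a tool baseline can start as a simple inventory script. The sketch below is a minimal example under stated assumptions, not a method from the report: the tool names, costs, alert counts and the 90% noise threshold are all hypothetical, and "actionable" is assumed to mean an alert that led to real engineering work.

```python
from dataclasses import dataclass


@dataclass
class MonitoringTool:
    name: str                  # hypothetical tool name
    annual_cost: int           # license plus maintenance, in dollars (made-up figures)
    alerts_per_month: int      # all alerts the tool raised
    actionable_per_month: int  # alerts that led to real engineering action


def noise_ratio(tool: MonitoringTool) -> float:
    """Share of a tool's alerts that nobody acted on (0.0 means all were useful)."""
    if tool.alerts_per_month == 0:
        return 1.0  # a tool that never alerts adds no measurable signal
    return 1 - tool.actionable_per_month / tool.alerts_per_month


def baseline(tools, max_noise=0.9):
    """Split tools into (keep, review) lists using an assumed noise threshold."""
    keep, review = [], []
    for tool in tools:
        (review if noise_ratio(tool) > max_noise else keep).append(tool)
    return keep, review


# Hypothetical inventory; real numbers would come from your own audit.
inventory = [
    MonitoringTool("infra-metrics", 40_000, 2_000, 500),
    MonitoringTool("legacy-ping", 15_000, 5_000, 50),
    MonitoringTool("app-traces", 60_000, 800, 400),
]

keep, review = baseline(inventory)
potential_savings = sum(t.annual_cost for t in review)

print("keep:", [t.name for t in keep])
print("review for consolidation:", [t.name for t in review])
print("annual cost under review:", potential_savings)
```

A tool that lands in the review list is a candidate for consolidation, not an automatic cut. The threshold and the definition of an actionable alert are judgment calls each team would need to set for itself.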

Monitoring tools are essential to uptime. But monitoring cannot be the only thing teams do — especially when it hinders innovation and experimentation. Leaders must make more informed investments to monitor more effectively. Only then can organizations move from maintaining the customer experience to innovating the customer experience.

Phil Tee is CEO of Moogsoft