As businesses have become increasingly reliant on technology, monitoring applications and infrastructure is a necessity. Monitoring is a key component of IT management, helping detect anomalies, triage issues, and ensure that the entire infrastructure is healthy.
However, despite their importance, monitoring tools are often an afterthought, deployed after an IT infrastructure is in place and functioning. Without a planned and well-defined monitoring strategy in place, most IT organizations – large and small – find themselves caught in the trap of "too many monitoring tools" – custom in-house tools, open source tools, packaged tools, and more, that add up over time for a variety of reasons.
A recent survey by EMA indicated that 65% of enterprise organizations have more than 10 monitoring tools. These monitoring tools are, of course, not all unnecessary, but the real question is: Does your team need to manage so many monitoring tools? Does every nail require a different hammer? What are the potential consequences?
There are many reasons why enterprises end up having too many monitoring tools. This blog will examine why this occurs, how the situation gets out of hand, and some best practices to consolidate monitoring in a way that benefits all functions and efficiencies across an IT organization.
Monitoring Sprawl: How Did We Get Here?
Often, a single IT service relies on many technologies and tiers. For example, a web service requires a web server, multiple middleware tiers, plus message queues and databases. It is hosted on a virtualized server and relies on data access from a storage tier. Since each of these technology tiers is very different from the others, each requires specialized management skills. IT organizations tend to be structured along the lines of these tiers, so there are many administrators, each using a different set of tools for their own domain of expertise.
Even within a specific tier, multiple monitoring tools may be in use: One for monitoring performance, another for analyzing log files, yet another to report on traffic to that tier, and so on.
Further, when an organization frequently relies on short-term fixes to diagnose problems, ad hoc tool choices can lead to further sprawl. Faced with a problem, an IT administrator may implement a new tool simply to solve the specific issue at hand, never to be used again, thus adding to a growing collection of monitoring tool shelfware that consumes budget and personnel time.
Another reason for monitoring tool sprawl is simply personal experience with a particular software solution. IT administrators and managers may have used a monitoring tool in past roles that they view as essential for the job. Even though one or more monitoring tools are already in place, the new tool gets implemented, rendering the existing solutions partially or completely redundant.
Inheritance and Bundles
Mergers and acquisitions can add to the software sprawl. Every time two organizations merge, the combined organization inherits monitoring tools from both organizations.
Many hardware purchases include proprietary monitoring software. With almost every storage vendor bundling its own monitoring tool, an organization leveraging storage arrays from multiple vendors can easily end up with a diverse group of storage monitoring tools.
And, software vendors sometimes package monitoring tools with their enterprise environments as well, so organizations that enter into these agreements can find themselves with yet another tool.
SaaS-Based Monitoring Options & Freeware
With the advent of quick-to-deploy SaaS-based monitoring tools, it has become very easy for organizations to keep adding them. SaaS-based helpdesks, monitoring tools, security tools, and more can be purchased from operating budgets, and IT staff members can simply deploy their own open source and free tools as needed. All of these add to the total number of monitoring tools the organization must maintain.
The Problem of Too Many Tools
Needle in the Haystack
Although each monitoring tool offers its own unique focus and strengths, overlap in functionality is extremely common. And because these tools are not integrated, problem diagnosis – perhaps the most critical factor in fast remediation – becomes tedious and time-consuming in today's environment of many tiers and many monitoring tools. Administrators must first sift through alerts from disparate sources, eliminate duplicates, and then manually correlate reported performance issues to arrive at actionable insights. Further complicating this process, analyzing alerts across tiers often requires deep expertise, adding still more resources and time.
For fast remediation in a multi-tier service delivery, problem diagnosis must be centralized and automated, but this cannot be achieved easily with multiple tools. Finding the needle in the haystack is difficult, but with what appear to be duplicate needles across many haystacks, it is easy to be led astray and waste valuable resources and time.
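To illustrate why centralizing this step matters, here is a minimal sketch of the deduplication part of the problem: collapsing alerts that different tools raise for the same underlying issue on the same host within a short time window. This is not any particular product's implementation, and the alert fields (host, issue, timestamp, source) are hypothetical:

```python
def deduplicate_alerts(alerts, window_seconds=300):
    """Collapse alerts that report the same issue on the same host
    within a short time window, keeping the earliest occurrence."""
    seen = {}    # (host, issue) -> timestamp of first alert kept
    unique = []
    for alert in sorted(alerts, key=lambda a: a["timestamp"]):
        key = (alert["host"], alert["issue"])
        if key in seen and alert["timestamp"] - seen[key] < window_seconds:
            continue  # same issue already reported recently; skip duplicate
        seen[key] = alert["timestamp"]
        unique.append(alert)
    return unique

# Example: three tools independently report the same disk issue on db01.
alerts = [
    {"host": "db01",  "issue": "disk_latency", "timestamp": 100, "source": "tool_a"},
    {"host": "db01",  "issue": "disk_latency", "timestamp": 130, "source": "tool_b"},
    {"host": "web01", "issue": "http_5xx",     "timestamp": 140, "source": "tool_c"},
    {"host": "db01",  "issue": "disk_latency", "timestamp": 160, "source": "tool_d"},
]
print(len(deduplicate_alerts(alerts)))  # → 2
```

Even this toy version only works because every alert flows through one function with one shared notion of identity; when each tool keeps its own alert stream, this collapsing has to be done by hand, which is exactly the tedium described above.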
Of War Rooms and Blame Games
Most monitoring tools are designed for specific subject-matter experts (application, database, network, VDI, etc.). Without unified visibility into the IT environment, war room discussions can easily turn into finger-pointing: An application owner blames the network tier for slowness, a database administrator blames developers who have not written optimal queries, virtualization administrators point to the storage team, and so on.
Everyone believes it is "not my problem." But there is a problem somewhere, and without a single source of truth – a holistic view of service performance – no one has visibility into what went wrong and where the fix is needed. So additional time and effort are needed to manually correlate events and solve the problem, while the business and its users suffer.
Time and Money
Maintaining a sprawl of monitoring tools adds cost on many levels. There are hard costs in license renewals and maintenance, plus the time spent on support requests, working with the various vendors, deploying upgrades, and training personnel on multiple tools. All of these raise the total cost of ownership, with the cost of maintaining shelfware and redundant tools being the most wasteful of all.