As businesses have become increasingly reliant on technology, monitoring applications and infrastructure is a necessity. Monitoring is a key component of IT management, helping detect anomalies, triage issues, and ensure that the entire infrastructure is healthy.
However, despite their importance, monitoring tools are often an afterthought, deployed after an IT infrastructure is in place and functioning. Without a planned and well-defined monitoring strategy in place, most IT organizations – large and small – find themselves caught in the trap of "too many monitoring tools" – custom in-house tools, open source tools, packaged tools, and more, that add up over time for a variety of reasons.
A recent survey by EMA indicated that 65% of enterprise organizations have more than 10 monitoring tools. These monitoring tools are, of course, not all unnecessary, but the real question is: Does your team need to manage so many monitoring tools? Does every nail require a different hammer? What are the potential consequences?
There are many reasons why enterprises end up having too many monitoring tools. This blog will examine why this occurs, how the situation gets out of hand, and some best practices to consolidate monitoring in a way that benefits all functions and efficiencies across an IT organization.
Monitoring Sprawl: How Did We Get Here?
So, often, a single IT service relies on many technologies and tiers. For example, a web service requires a web server, multiple middleware tiers, plus message queues and databases. It is hosted on a virtualized server and relies on data access from a storage tier. And since each of these technology tiers is very different from the others, all require specialized management skills. IT organizations tend to be structured along the lines of these tiers, and so there are many administrators, each using a different set of tools for his/her domains of expertise.
Even within a specific tier, multiple monitoring tools may be in use: One for monitoring performance, another for analyzing log files, yet another to report on traffic to that tier, and so on.
Further, when an organization frequently relies on short-term solutions to diagnose problems, ad hoc tool choices can lead to further sprawl. That is, when faced with a problem, an IT administrator may implement a new tool simply to solve the specific issue at hand, never to be used again, thus contributing to a growing collection of monitoring tool shelfware that consumes costs and personnel resources.
Another reason for monitoring tool sprawl is simply personal experience with a particular software solution. IT administrators and managers may have used a monitoring tool in past roles that they view as required for the job. Despite having one or more existing monitoring tools in place, the new tool gets implemented, rendering the existing solutions partially or completely redundant.
Inheritance and Bundles
Mergers and acquisitions can add to the software sprawl. Every time two organizations merge, the combined organization inherits monitoring tools from both organizations.
Many hardware purchases include proprietary monitoring software. With almost every storage vendor bundling its own monitoring tool, an organization leveraging storage arrays from multiple vendors can easily end up with a diverse group of storage monitoring tools.
And, software vendors sometimes package monitoring tools with their enterprise environments as well, so organizations that enter into these agreements can find themselves with yet another tool.
SaaS-Based Monitoring Options & Freeware
With the advent of quick-to-deploy SaaS-based monitoring tools, it has become very easy for organizations to keep adding them. SaaS-based helpdesks, monitoring tools, security tools, and more, can be easily purchased from operating budgets, so IT staff members can simply deploy their own open source and free tools, as needed. All of these add up to the overall number of monitoring tools the organization must maintain.
The Problem of Too Many Tools
Needle in the Haystack
Although each monitoring tool offers its own unique focus and strengths, overlap in functionality is extremely common. And, because there is no integration between these tools, in today's environment of many tiers and many monitoring tools, problem diagnosis – perhaps the most critical factor in fast remediation – is tedious and time-consuming. Administrators must first sift through alerts from disparate sources, eliminate duplicates, and then manually correlate reported performance issues to get actionable insights. Further complicating this process, analyzing alerts across tiers often requires a great deal of expertise, potentially adding more resources and more time.
For fast remediation in a multi-tier service delivery, problem diagnosis must be centralized and automated, but this cannot be achieved easily with multiple tools. Finding the needle in the haystack is difficult, but with what appear to be duplicate needles across many haystacks, it is easy to be led astray and waste valuable resources and time.
Of War Rooms and Blame Games
Most monitoring tools are designed for specific subject-matter experts (application, database, network, VDI, etc.). Without unified visibility into the IT environment, war room discussions can easily turn into finger-pointing: An application owner blames the network tier for slowness, a database administrator blames developers that have not used optimal queries, virtualization administrators point to the storage team, and so on.
Everyone believes it is "not my problem." But there is a problem somewhere, and without a single source of truth – a holistic view of service performance – no one can have visibility into what went wrong and where the fix is needed. So, additional time and effort is needed to manually correlate events and solve the problem, while the business and users suffer.
Time and Money
Maintaining a sprawl of monitoring tools adds cost, on many levels. There are hard costs with license renewals and maintenance, plus the time spent in support requests, working with the various vendors, deploying upgrades, and training personnel to handle multiple tools. All impact the total cost of ownership of these tools, with the cost of maintaining shelfware and redundant tools being the most extravagant of them all.
Network performance issues come in all shapes and sizes, and can require vast amounts of time and resources to solve. Here are three examples of painful network performance issues you're likely to encounter this year, and how NPMD solutions can help you overcome them ...
"Scale up" versus "scale out" doesn't just apply to hardware investments, it also has an impact on product features. "Scale up" promotes buying the feature set you think you need now, then adding "feature modules" and licenses as you discover additional feature requirements are needed. Often as networks grow in size they also grow in complexity ...
Network Packet Brokers play a critical role in gaining visibility into new complex networks. They deliver the packet data and information IT and security teams need to identify problems, recognize security issues, and ensure overall network performance. However, not all Packet Brokers are created equal when it comes to scalability. Simply "scaling up" your network infrastructure at every growth point is a more complex and more expensive endeavor over time. Let's explore three ways the "scale up" approach to infrastructure growth impedes NetOps and security professionals (and the business as a whole) ...
Loyal users are the key to your service desk's success. Happy users want to use your services and they recommend your services in the organization. It takes time and effort to exceed user expectations, but doing so means keeping the promises we make to our users and being careful not to do too much without careful consideration for what's best for the organization and users ...
What's the difference between user satisfaction and user loyalty? How can you measure whether your users are satisfied and will keep buying from you? How much effort should you make to offer your users the ultimate experience? If you're a service provider, what matters in the end is whether users will keep coming back to you and will stay loyal ...
What if I said that a 95% reduction in the amount of IT noise, 99% reduction in ticket volume and 99% L1 resolution rate are not only possible, but that some of the largest, most complex enterprises in the world see these metrics in their environments every day, thanks to Artificial Intelligence (AI) and Machine Learning (ML)? Would you dismiss that as belonging to the realm of science fiction? ...
Of those surveyed, 96% of organizations have a digital transformation strategy, with 57% approaching it as an enterprise-wide priority, with a clear emphasis on speed of business, costs, risk, and customer satisfaction, according to IDC’s Aligning IT Strategies and Business Expectations for Digital Transformation Success, sponsored by EasyVista ...
One of my ongoing areas of focus is analytics, AIOps, and the intersection with AI and machine learning more broadly. Within this space, sad to say, semantic confusion surrounding just what these terms mean echoes the confusions surrounding ITSM ...
I must admit that the more I research IT service management (ITSM) — what it means, what its trends are, and what its future is — the more I feel like an explorer on a strikingly rich continent that so far the industry has either ignored, or misunderstood. The brand-new data that we've just received in EMA is a case in point ...