Optimizing Root Cause Analysis to Reduce MTTR
October 11, 2012

Ariel Gordon

Detecting and resolving problems efficiently is essential to keep supporting business services, to minimize the impact on them, and to limit any financial losses.

The goal is to turn the tables on IT problems: instead of spending 80 percent of the time on root cause analysis and 20 percent on the actual fix, that ratio should be reversed.

In resolving an issue, communication is critical for aligning different expert groups toward a common goal. Because each team holds a narrow view of its own domain and expertise, there is always a danger that the "big picture" will be missed. You don't want a lack of communication to result in blame games and finger pointing.

Some problem detection methods include:

- Infrastructure Monitoring: Monitoring specific resource utilization, such as disk, memory, and CPU, is effective for identifying availability failures, sometimes even heading them off before they happen.

- Domain or Application Tools: These help, but overall problem detection remains a game of hide-and-seek: a manually intensive effort carried out under pressure to deliver a fix as quickly as possible.

- Dependency Mapping Tools: These map business services and applications to infrastructure components, and can help you generate a topology map that will improve your root cause analysis process for the following reasons:

1. Connect Symptoms to Problems: A single map that relates a business service (the user's point of view) to its configuration items will help you detect problems faster.

2. Common Ground: The map ties in all elements so that different groups can focus on a cross-domain effort.

3. High-Level, Cross-Domain View: Teams can view problems not only in the context of their own domain, but in a wider view of all network components. For example, a database administrator analyzing slow database performance can examine the topology map to see the effect of networking components on the database.
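To make the topology map idea concrete, here is a minimal Python sketch of how such a map could narrow an investigation. It is not taken from any particular dependency mapping product: the service and component names, the `DEPENDS_ON` dictionary, and both helper functions are hypothetical, and a real tool would discover the map rather than have it typed in by hand. The sketch treats an alerting item as a root cause candidate only if nothing it depends on is alerting as well.

```python
from collections import deque

# Hypothetical topology map: each item depends on the items in its list.
DEPENDS_ON = {
    "online-banking": ["web-server", "app-server"],
    "web-server": ["load-balancer"],
    "app-server": ["database", "core-switch"],
    "database": ["storage-array", "core-switch"],
    "load-balancer": ["core-switch"],
    "storage-array": [],
    "core-switch": [],
}


def components_behind(item, depends_on):
    """Return every configuration item that `item` relies on,
    directly or indirectly, in breadth-first order."""
    found = []
    visited = {item}
    queue = deque([item])
    while queue:
        node = queue.popleft()
        for dep in depends_on.get(node, []):
            if dep not in visited:
                visited.add(dep)
                found.append(dep)
                queue.append(dep)
    return found


def root_cause_candidates(service, depends_on, alerting):
    """Return the alerting items under `service` that have no alerting
    dependency of their own, sorted by name."""
    in_scope = set(components_behind(service, depends_on)) & set(alerting)
    return sorted(
        item
        for item in in_scope
        if not any(dep in alerting for dep in components_behind(item, depends_on))
    )


# Assumed scenario: the database, the app server, and a switch all raise alerts.
alerts = {"database", "app-server", "core-switch"}
print(root_cause_candidates("online-banking", DEPENDS_ON, alerts))
```

In this assumed scenario the database and the application server are both alerting, but each also depends on the alerting switch, so the switch is the only candidate left. That mirrors the database administrator example above: the map points the investigation at a networking component rather than at the database itself.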

Root cause analysis is complex, and no single tool or approach will give you full coverage. The idea is to plan a portfolio of tools that together deliver the most impact for your organization.

For instance, if you do not have a central event management console, consider implementing a topology-based event management solution. If most of your applications involve online transactions, look for a transaction management product that covers the technology stack common in your environment. Put differently, select the combination of tools that is right for your environment.

Once you have assessed which tools provide the most value, implement them in descending order of value so that you get the biggest impact first.

Ariel Gordon is VP Products and Co-Founder of Neebula.
