Losing $$ Due to Ticket Times? Hack Response Time Using Data
June 03, 2016

Collin Firenze
Optanix


Without the proper expertise and tools in place to quickly isolate, diagnose, and resolve an incident, a routine error can result in hours of downtime, causing a significant interruption in business operations that affects both revenue and employee productivity. How can we stop these small incidents from turning into major fallouts? Major companies and organizations, take heed:

1. Identify the correlation between issues to expedite time to notify and time to resolve

Not understanding the correlation between issues is detrimental to timely resolution. Even with a network monitoring solution in place, a lack of automated correlation can generate excess "noise." Support teams must then act on numerous individual alerts, rather than on a single ticket that contains all relevant events and information for the support end-user.

A correlated monitoring approach gives support teams a holistic view of the network failure. By analyzing the correlated events together, they can efficiently identify the root cause and promptly execute the corrective action to resolve the issue at hand.

Correlation consolidates all relevant information into a single ticket, allowing support teams to reduce their staffing needs considerably: only one support engineer needs to act on the incident, as opposed to several engineers engaging on individual alerts.
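To make the idea concrete, here is a minimal sketch of alert correlation in Python. It is an illustration, not how any particular monitoring product works: the `Alert` and `Ticket` types, the use of `site` as the correlation key, and the five-minute window are all assumptions chosen for the example. Real tools typically correlate on richer information, such as network topology.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Alert:
    device: str
    site: str          # assumed correlation key: alerts from one site are treated as related
    message: str
    timestamp: datetime


@dataclass
class Ticket:
    site: str
    opened: datetime
    alerts: list = field(default_factory=list)


def correlate(alerts, window=timedelta(minutes=5)):
    """Fold alerts from the same site into one ticket when they arrive
    within `window` of that ticket's first alert."""
    tickets = []
    latest_by_site = {}  # site -> most recently opened ticket for that site
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        ticket = latest_by_site.get(alert.site)
        if ticket is None or alert.timestamp - ticket.opened > window:
            # No ticket yet, or the existing one is too old: open a new ticket.
            ticket = Ticket(site=alert.site, opened=alert.timestamp)
            tickets.append(ticket)
            latest_by_site[alert.site] = ticket
        ticket.alerts.append(alert)
    return tickets


# Hypothetical sample data: one upstream failure in Boston triggers three alerts.
t0 = datetime(2016, 6, 3, 9, 0)
sample_alerts = [
    Alert("router-1", "boston", "interface down", t0),
    Alert("switch-4", "boston", "unreachable", t0 + timedelta(seconds=20)),
    Alert("phone-17", "boston", "registration lost", t0 + timedelta(seconds=45)),
    Alert("router-9", "denver", "high CPU", t0 + timedelta(minutes=2)),
]

tickets = correlate(sample_alerts)
for ticket in tickets:
    print(ticket.site, len(ticket.alerts))
```

With this sample data, four alerts collapse into two tickets: one for Boston carrying three related events, and one for Denver. A single engineer can pick up the Boston ticket with all of the context in one place.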

2. Constantly analyze raw data for trends to help IT teams proactively spot and prevent recurring issues

Aside from the standard reactive response of a support team, there is substantial benefit in proactively analyzing raw data from your environment. A proactive team can identify trends and failures, then take corrective and preventative actions so that support teams are not spending time investigating repeat issues. This approach not only creates a more stable environment with fewer failures, but also allows support teams to reduce manual hours and cost by avoiding "wasted" investigation of known and recurring issues.

Within a support organization, a Problem Management Group (PMG) is often established to perform this proactive analysis of raw data. In such instances, the PMG will create various scripts and calculations that turn the raw data into a meaningful representation of the data set, to identify areas of concern such as:

■ Common types of failures

■ Failures within a specific region or location

■ Issues with a specific end-device type or model

■ Recurring issues at a specific time/day

■ Any trends in software or firmware revisions

Once the raw data is analyzed by the PMG, the results can be relayed to the support team for review so a plan can be formalized to take the appropriate preventative action. The support team will work to present the data and their proposed solution, and seek approval to execute the corrective/preventative steps.
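As an illustration of the kind of script a PMG might write, the following Python sketch tallies incident records along each of the dimensions listed above. The record layout, field names, and sample values are hypothetical; in practice the data would come from a ticketing or monitoring system export.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical incident records; every field name here is an assumption.
incidents = [
    {"type": "link down", "region": "east", "model": "X100", "firmware": "1.2",
     "opened": datetime(2016, 5, 2, 2, 15)},
    {"type": "link down", "region": "east", "model": "X100", "firmware": "1.2",
     "opened": datetime(2016, 5, 9, 2, 10)},
    {"type": "high CPU", "region": "west", "model": "Z300", "firmware": "3.0",
     "opened": datetime(2016, 5, 11, 14, 40)},
    {"type": "link down", "region": "east", "model": "X100", "firmware": "1.2",
     "opened": datetime(2016, 5, 16, 2, 5)},
]


def top_values(records, key, n=3):
    """Return the n most frequent values of key(record), with their counts."""
    return Counter(key(r) for r in records).most_common(n)


report = {
    "failure type": top_values(incidents, lambda r: r["type"]),
    "region": top_values(incidents, lambda r: r["region"]),
    "device model": top_values(incidents, lambda r: r["model"]),
    # Weekday plus hour exposes issues that recur at a specific time/day.
    "weekday and hour": top_values(
        incidents, lambda r: (DAYS[r["opened"].weekday()], r["opened"].hour)
    ),
    "model and firmware": top_values(
        incidents, lambda r: (r["model"], r["firmware"])
    ),
}

for dimension, counts in report.items():
    print(dimension, counts)
```

In this made-up data set, the same "link down" failure appears three times on the same device model and firmware revision, always on a Monday in the 2 a.m. hour. That is the sort of pattern that points to a preventative action rather than another round of investigation.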

3. Present data in interactive dashboards and business intelligence reports to ensure proper understanding

Not every support team has the benefit of a PMG. In that case, it's important that the system monitoring tools fulfill the role of the PMG analysis and present the data in an easy-to-understand format for the end-user. From a tools perspective, the data analysis can be delivered both through interactive dashboards and through business intelligence reports.

Interactive dashboards are a great way of presenting data in a format that caters to all audiences, from administrative and management levels to technical engineers. A combination of graphs (e.g. pie charts, line graphs, etc.) and summarized metrics (e.g. Today, This Week, Last 30 Days, etc.) is used to display the analyzed data, with filtering capabilities that let the end-user view only the desired information, without interference from analyzed data that may not be applicable to their investigation.
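The summarized metrics mentioned above are simple to compute once incident timestamps are available. The Python sketch below shows one possible way to do it; the `summarize` function, the Monday start of the week, and the sample timestamps are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def summarize(opened_times, now):
    """Count incidents opened today, this week (from Monday), and in the last 30 days."""
    start_of_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    start_of_week = start_of_today - timedelta(days=now.weekday())
    windows = {
        "Today": start_of_today,
        "This Week": start_of_week,
        "Last 30 Days": now - timedelta(days=30),
    }
    return {
        label: sum(1 for t in opened_times if start <= t <= now)
        for label, start in windows.items()
    }


# Hypothetical incident open times, evaluated as of Friday, June 3, 2016 at noon.
now = datetime(2016, 6, 3, 12, 0)
opened_times = [
    datetime(2016, 6, 3, 9, 0),    # today
    datetime(2016, 6, 1, 16, 30),  # earlier this week
    datetime(2016, 5, 20, 8, 0),   # within the last 30 days
    datetime(2016, 4, 1, 11, 0),   # outside every window
]

summary = summarize(opened_times, now)
print(summary)
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the calculation repeatable, which is useful when the same figures feed both a live dashboard and a saved report.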

A more "customizable" approach to raw data analysis is a Business Intelligence Reporting Solution (BIRS). Essentially, the BIRS collects the raw data for the end-user and provides drag-and-drop reporting, so that any data elements of interest can be incorporated into a customized on-demand report. What is particularly helpful for the user is the ability to save "filtering criteria" that will be used repeatedly (e.g. for Monthly Business Review Reports).

With routine errors, the main goal is to stay ahead of them by using data to identify correlations. Through effective event correlation, and by empowering teams with raw data, you can ensure that issues are quickly mitigated and don't pose the risk of impacting company ROI and system availability.

Collin Firenze is Associate Director at Optanix.
