Complacency Kills Uptime in Virtualized Environments
April 10, 2018

Chris Adams
Park Place Technologies


Risk is relative. Studies have suggested, for example, that seatbelt mandates can actually reduce overall highway safety, and that more padding on hockey and American football players can increase injuries. It's called the Peltzman Effect, and it describes how humans change their behavior when risk factors are reduced: they often act more recklessly and drive risk right back up.

The phenomenon is recognized by many economists, its effects have been studied in the field of medicine, and I'd argue it is at the root of an interesting trend in IT — namely the increasing cost of downtime despite our more reliable virtualized environments.

Downtime Costs Are Rising

A study by the Ponemon Institute, for example, found the average cost of a data center outage rose from $505,502 in 2010 to $740,357 in 2016. And the maximum cost was up 81% over the same period, reaching more than $2.4 million.
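
Put another way, the back-of-envelope math on those averages works out to roughly a 46% jump in six years. A minimal sketch, using only the dollar figures cited above:

    # Rough arithmetic on the Ponemon averages cited above
    avg_2010 = 505_502   # average cost of a data center outage in 2010 (USD)
    avg_2016 = 740_357   # average cost of a data center outage in 2016 (USD)

    increase = avg_2016 - avg_2010
    pct = increase / avg_2010 * 100

    print(f"Average outage cost rose ${increase:,} (~{pct:.0f}%) from 2010 to 2016")
    # -> Average outage cost rose $234,855 (~46%) from 2010 to 2016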

There are a lot of factors represented in these figures. For example, productivity losses are higher because labor costs are, and missed business opportunities are worth more today than they were several years ago. Yet advancements like virtual machines (VMs) with their continuous mirroring and seamless backups have not slashed downtime costs to the degree many IT pros had once predicted.

Have we as IT professionals dropped our defensive stance because we believe too strongly in the power of VMs and other technologies to save us? There are some signs that we have. For all the talk of cyberattacks—well deserved as it is—they cause only 10% of downtime. Hardware failures, on the other hand, account for 40%, according to Network Computing. And the Ponemon research referenced above found simple UPS problems to be at the root of one-quarter of outages.

Of course, VMs alone are not to blame, but it's worth looking at how downtime costs can increase when businesses rely on high-availability, virtually partitioned servers.

3 VM-Related Reasons for the Trend

The problem with VMs generally boils down to an "all eggs in one basket" problem. Separate workloads that would previously have run on multiple physical servers are consolidated to one server. Mirroring, automatic failover, and backups are intended to reduce risk associated with this single point of failure, but when these tactics fall through or complicated issues cascade, the resulting downtime can be especially costly for several reasons.
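
To make the "eggs in one basket" risk concrete, here is a minimal blast-radius sketch. The consolidation ratio and recovery time are illustrative assumptions, not figures from any study cited in this article:

    # Hypothetical blast-radius comparison; all numbers are illustrative assumptions
    workloads_per_host_before = 1    # one workload per physical server, pre-virtualization
    workloads_per_host_after = 12    # VMs consolidated onto a single physical host
    hours_to_recover = 4             # assumed recovery time when failover falls through

    before = workloads_per_host_before * hours_to_recover
    after = workloads_per_host_after * hours_to_recover

    print(f"Host failure before consolidation: {before} workload-hours of downtime")
    print(f"Host failure after consolidation:  {after} workload-hours of downtime")
    # The same hardware failure now touches every workload on the host,
    # which is why a missed failover hurts so much more than it used to.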

1. Utilization rates are higher

Studies by McKinsey & Company and Gartner have both pegged utilization rates for non-virtualized servers in the 6% to 12% range. With VMs, however, utilization typically approaches 30% and often stretches far higher. These busier servers are processing more workloads, so downtime impacts are multiplied.
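
If you assume downtime impact scales with how much work the server was doing, those utilization figures imply a rough multiplier. A quick sketch (the utilization numbers come from the ranges cited above; the scaling assumption is mine):

    # Illustrative ratio only: assume downtime impact scales with utilization
    legacy_low, legacy_high = 0.06, 0.12   # non-virtualized utilization range cited above
    vm_typical = 0.30                      # typical utilization after virtualization

    print(f"Work lost per hour of downtime: roughly {vm_typical / legacy_high:.1f}x "
          f"to {vm_typical / legacy_low:.1f}x that of a non-virtualized server")
    # -> roughly 2.5x to 5.0x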

2. More customers are affected

Because internal and external customers now share physical servers through VMs, an outage affects a greater variety of workloads, which expands the business consequences. A co-location provider could easily face irate calls and emails from dozens of clients, and a corporate data center manager could see complaints rise from the help desk to the C-suite.

3. Complexity is prolonging downtime

Virtualization projects were supposed to simplify data centers, but many have not, according to CIO Magazine. In its survey, respondents reported an average of 16 outages per year, 11 of them caused by system failures stemming from complexity. And more complex systems are harder to troubleshoot and repair, making for longer downtime and higher overall costs.
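
A simple cost model shows why the duration piece matters so much. In the sketch below, the outage count comes from the CIO survey above, while the repair times and hourly cost are illustrative assumptions:

    # Hedged sketch: 16 outages/year from the survey; MTTR and hourly cost are assumptions
    outages_per_year = 16
    cost_per_hour = 9_000        # assumed business impact per hour of downtime (USD)

    simple_mttr_hours = 2        # assumed repair time for a straightforward failure
    complex_mttr_hours = 5       # assumed repair time when complexity slows troubleshooting

    print(f"Annual cost at 2-hour MTTR: ${outages_per_year * simple_mttr_hours * cost_per_hour:,}")
    print(f"Annual cost at 5-hour MTTR: ${outages_per_year * complex_mttr_hours * cost_per_hour:,}")
    # -> $288,000 vs. $720,000 for the same number of outages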

Read Part 2: Solutions for Minimizing Server Downtime

Chris Adams is President and COO of Park Place Technologies
