Complacency Kills Uptime in Virtualized Environments
April 10, 2018

Chris Adams
Park Place Technologies


Risk is relative. Studies have shown, for example, that seatbelt mandates can fail to make highways safer, while more padding on hockey and American football players can actually increase injuries. It's called the Peltzman Effect, and it describes how humans change behavior when risk factors are reduced: they often act more recklessly and drive risk right back up.

The phenomenon is recognized by many economists, its effects have been studied in the field of medicine, and I'd argue it is at the root of an interesting trend in IT — namely the increasing cost of downtime despite our more reliable virtualized environments.

Downtime Costs Are Rising

A study by the Ponemon Institute, for example, found that the average cost of a data center outage rose from $505,502 in 2010 to $740,357 in 2016. The maximum cost climbed 81% over the same period, topping $2.4 million.

Many factors feed into these figures. Productivity losses are higher because labor costs are, for example, and missed business opportunities are worth more today than they were several years ago. Yet advancements like virtual machines (VMs), with their continuous mirroring and seamless backups, have not slashed downtime costs to the degree many IT pros once predicted.

Have we as IT professionals dropped our defensive stance because we believe too strongly in the power of VMs and other technologies to save us? There are some signs that we have. For all the talk of cyberattacks—well deserved as it is—they cause only 10% of downtime. Hardware failures, on the other hand, account for 40%, according to Network Computing. And the Ponemon research referenced above found simple UPS problems to be at the root of one-quarter of outages.

Of course, VMs alone are not to blame, but it's worth looking at how downtime costs can increase when businesses rely on high-availability, virtually partitioned servers.

3 VM-Related Reasons for the Trend

The trouble with VMs generally boils down to an "all eggs in one basket" problem. Separate workloads that would previously have run on multiple physical servers are consolidated onto a single machine. Mirroring, automatic failover, and backups are intended to reduce the risk of this single point of failure, but when those tactics fall through or complicated issues cascade, the resulting downtime can be especially costly, for several reasons.

1. Utilization rates are higher

Studies by McKinsey & Company and Gartner have both pegged utilization rates for non-virtualized servers at 6% to 12%. With VMs, however, utilization typically approaches 30% and often stretches far higher. These busy servers process more workloads, so the impact of any single outage is multiplied, as the rough sketch below illustrates.
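To see how consolidation multiplies per-outage impact, consider a back-of-the-envelope calculation. The consolidation ratio, outage length, and cost figure below are hypothetical assumptions chosen purely for illustration; they are not numbers from the McKinsey, Gartner, or Ponemon research:

# Illustrative sketch only: the consolidation ratio, outage length, and cost figure
# are hypothetical assumptions, not numbers from the studies cited in this article.

COST_PER_WORKLOAD_HOUR = 1_000  # assumed business cost of one stalled workload per hour
OUTAGE_HOURS = 2                # assumed time to restore the failed host

# Before virtualization: one lightly utilized workload per physical server,
# so a single hardware failure takes down a single workload.
workloads_lost_physical = 1

# After virtualization: ten workloads consolidated onto one host (hypothetical ratio),
# so the same hardware failure takes down all of them at once.
CONSOLIDATION_RATIO = 10
workloads_lost_virtualized = CONSOLIDATION_RATIO

cost_physical = workloads_lost_physical * OUTAGE_HOURS * COST_PER_WORKLOAD_HOUR
cost_virtualized = workloads_lost_virtualized * OUTAGE_HOURS * COST_PER_WORKLOAD_HOUR

print(f"Per-outage cost, one workload per server: ${cost_physical:,}")
print(f"Per-outage cost, {CONSOLIDATION_RATIO} workloads per host: ${cost_virtualized:,}")

The point is simply that the same two-hour hardware failure now stalls ten workloads instead of one, so the per-outage bill scales with the consolidation ratio.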

2. More customers are affected

Because internal and external customers now share physical servers through VMs, outages affect a greater variety of workloads, which expands the business consequences. A co-location provider could easily face irate calls and emails from dozens of clients, and a corporate data center manager could see complaints climb from the help desk to the C-suite.

3. Complexity is prolonging downtime

Virtualization projects were supposed to simplify data centers, but many have not, according to CIO Magazine. In its survey, respondents reported an average of 16 outages per year, 11 of them caused by system failures stemming from complexity. More complex systems are harder to troubleshoot and repair, making for longer downtime and higher overall costs.
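Complexity-driven downtime also compounds over a year. As a rough sketch, the calculation below combines the survey's 16-outages-per-year figure with a hypothetical cost per hour and hypothetical repair times (neither comes from the survey) to show how longer troubleshooting drives up the annual total:

# Illustrative sketch only: the cost-per-hour and repair-time figures are hypothetical
# assumptions; only the 16-outages-per-year figure comes from the survey cited above.

def annual_downtime_cost(outages_per_year, mttr_hours, cost_per_hour):
    """Rough annual cost: outages per year x hours to repair each x business cost per hour."""
    return outages_per_year * mttr_hours * cost_per_hour

COST_PER_HOUR = 8_000  # assumed business cost of one hour of downtime

# A simpler environment: fewer outages and faster troubleshooting (hypothetical values).
simple_env = annual_downtime_cost(outages_per_year=5, mttr_hours=1.5, cost_per_hour=COST_PER_HOUR)

# A complex virtualized environment: the survey's 16 outages per year, with a
# hypothetical longer repair time because cascading issues are harder to isolate.
complex_env = annual_downtime_cost(outages_per_year=16, mttr_hours=4.0, cost_per_hour=COST_PER_HOUR)

print(f"Simpler environment: ${simple_env:,.0f} per year")
print(f"Complex environment: ${complex_env:,.0f} per year")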

Read Part 2: Solutions for Minimizing Server Downtime

Chris Adams is President and COO of Park Place Technologies
