New Data Reveals Widespread Downtime and Security Risks in 99% of Enterprise Private Cloud Environments
February 08, 2017

Doron Pinhas
Continuity Software


Industrial and technological revolutions happen because new manufacturing systems or technologies make life easier, less expensive, more convenient, or more efficient. It's been that way in every epoch – but Continuity Software's new study indicates that in the cloud era, there's still work to be done.

With the rise of cloud technology in recent years, Continuity Software analyzed live enterprise private cloud environments, and the results are far from reassuring. Based on configuration data gathered from more than 100 enterprise environments over the past year, the study found widespread performance issues in 97% of them, putting IT systems at significant risk of downtime. Downtime, ranked by the participating enterprises as their greatest concern, remained a risk in every one of the tested environments.

A deep dive into the study findings revealed numerous reasons for the increased operational risk in private cloud environments, ranging from lack of awareness of critical vendor recommendations, to inconsistent configuration across virtual infrastructure components, to misalignment between different technology layers (such as virtual networks and physical resources, or the storage and compute layers).

The downtime risks were not specific to any particular configuration of hardware, software, or operating system. Indeed, the studied enterprises used a diverse technology stack: 48% of the organizations are pure Windows shops, compared to 7% that run primarily Linux, while 46% use a mix of operating systems. Close to three quarters (73%) use EMC data storage systems, 27% use replication for automated offsite data protection, and 12% use active-active failover for continuous availability.

The IT departments of the companies in question certainly include top engineers and administrators, yet nearly all of the companies in the study had experienced some issues, and in a few cases many.

While the results are unsettling, they are certainly not surprising. The modern IT environment is extremely complex and volatile: changes are made daily by multiple teams in a rapidly evolving technology landscape. With daily patching, upgrades, capacity expansion, and the like, the slightest miscommunication between teams or a small knowledge gap can introduce hidden risks to the stability of the IT environment.

Unlike legacy systems, for which standard testing and auditing practices are applied regularly (typically once or twice a year), private cloud infrastructure is not regularly tested. Interestingly, this fact is not always fully appreciated, even by seasoned IT experts. Virtual infrastructure is often designed to be "self-healing," using features such as virtual machine High Availability and workload mobility, and some evidence regularly appears to show that these features are working; after all, IT executives may argue, not a week goes by without some virtual machines failing over successfully.

This perception of safety can be misleading, since a chain is only as strong as its weakest link. Simply put, it's a numbers game: over the course of any given week, only a minute fraction of the virtual machines will actually fail over, usually less than 1%. What about the other 99%? Is it realistic to expect that they, too, are fully protected?
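
To make that arithmetic concrete, here is an illustrative back-of-the-envelope calculation. The fleet size is an assumed figure chosen only for the sketch; the "under 1% per week" failover rate is the observation cited above.

```python
# Illustrative "numbers game" arithmetic. The fleet size is an assumption
# for this sketch, not a figure from the study.
fleet_size = 5000            # assumed number of VMs in a private cloud
weekly_failover_rate = 0.01  # under 1% of VMs actually fail over in a week

tested_per_week = fleet_size * weekly_failover_rate
untested_share = 1 - weekly_failover_rate

print(f"VMs exercised by a real failover in a typical week: ~{tested_per_week:.0f}")
print(f"Share of the fleet whose protection went unproven that week: {untested_share:.0%}")

# Even if failovers hit VMs at random (an optimistic assumption), the expected
# wait before any particular VM demonstrates a successful failover is long.
expected_weeks_until_tested = 1 / weekly_failover_rate
print(f"Expected wait for a given VM to be exercised: ~{expected_weeks_until_tested:.0f} weeks")
```

Under these assumptions, a given VM would wait roughly two years, on average, before a real failover ever exercised its protection.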

The only way to determine whether a private cloud is truly resilient would be to prove that every possible permutation of failure could be successfully handled. This cannot be accomplished with manual processes, which would be far too time consuming and potentially disruptive. The only sustainable and scalable approach is to automate private cloud configuration validation and testing.
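
As a rough illustration of what automated configuration validation can look like, the sketch below flags settings that drift between hosts in the same HA cluster. The host names, settings, and values are hypothetical; a real tool would pull this inventory from the virtualization and storage management APIs rather than hard-coding it.

```python
# Minimal sketch: detect configuration drift between hosts in the same cluster.
# All hosts, settings, and values below are hypothetical examples.
hosts = [
    {"name": "esx-01", "cluster": "prod-a", "ha_enabled": True,  "storage_paths": 4},
    {"name": "esx-02", "cluster": "prod-a", "ha_enabled": True,  "storage_paths": 2},
    {"name": "esx-03", "cluster": "prod-a", "ha_enabled": False, "storage_paths": 4},
]

CHECKS = ["ha_enabled", "storage_paths"]  # settings expected to match within a cluster

def find_drift(hosts, checks):
    """Report settings that take more than one value within a cluster."""
    drift = []
    clusters = {h["cluster"] for h in hosts}
    for cluster in sorted(clusters):
        members = [h for h in hosts if h["cluster"] == cluster]
        for setting in checks:
            values = {}
            for h in members:
                values.setdefault(h[setting], []).append(h["name"])
            if len(values) > 1:  # more than one distinct value means drift
                drift.append((cluster, setting, values))
    return drift

for cluster, setting, values in find_drift(hosts, CHECKS):
    print(f"[{cluster}] inconsistent '{setting}': {values}")
```

Run on a regular schedule against live inventory data, a check of this kind surfaces the quiet misconfigurations that would otherwise only be discovered during an actual failover.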

Individual vendors offer basic health measurements for their own solution stack (VMware, Microsoft, EMC, and others, for example). While useful, this falls far short of a complete solution since, as the study shows, the majority of issues stem from incorrect alignment between the different layers. In recent years, more holistic solutions that offer vendor-agnostic, cross-domain validation have entered the market.
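
To illustrate the kind of cross-layer misalignment such validation is meant to catch, the hypothetical sketch below checks that every datastore backing a protected VM is visible from every host in the VM's cluster; if it is not, the VM cannot be restarted elsewhere after a host failure. All names and data are invented for the example.

```python
# Hypothetical cross-layer check: storage visibility versus compute placement.
cluster_hosts = {"prod-a": ["esx-01", "esx-02", "esx-03"]}

# Which hosts can see which datastores (e.g., derived from zoning/LUN masking).
host_datastores = {
    "esx-01": {"ds-gold-01", "ds-gold-02"},
    "esx-02": {"ds-gold-01"},
    "esx-03": {"ds-gold-01", "ds-gold-02"},
}

vms = [
    {"name": "billing-db", "cluster": "prod-a", "datastores": {"ds-gold-02"}},
    {"name": "web-front",  "cluster": "prod-a", "datastores": {"ds-gold-01"}},
]

for vm in vms:
    for host in cluster_hosts[vm["cluster"]]:
        missing = vm["datastores"] - host_datastores[host]
        if missing:
            print(f"RISK: {vm['name']} cannot restart on {host}; "
                  f"missing datastores: {sorted(missing)}")
```

Each layer looks healthy in isolation here; only by correlating the storage and compute views does the risk become visible, which is exactly the gap cross-domain validation is meant to close.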

Such approaches come with a cost, but it is far lower than the cost of experiencing a critical outage. According to multiple industry studies, a single hour of downtime can easily cost hundreds of thousands of dollars, and in some verticals even millions.

Doron Pinhas is CTO of Continuity Software.
