Widespread Downtime Found in 99 Percent of Cloud Environments

Downtime and security risks were present in each cloud environment tested, according to 2016 Private Cloud Resiliency Benchmarks, a report from Continuity Software.

The study also found security and performance risks in 99 percent and 97 percent of the environments, respectively, and 82 percent of the companies faced data loss risks.

Some of the top risks identified across the private cloud environments include:

■ Configuration drift between cluster nodes that prevents failover. Examples of such discrepancies range from the trivial, such as a file that is not accessible by all hosts in the cluster, to the more complex, such as incorrectly set affinity rules.

■ Virtual networking configuration errors leading to virtual machine isolation and downtime. Examples include incorrect Virtual Machine Port Group configurations and resource misalignment between ESXi cluster hosts, either of which can create a single point of failure.

■ Incorrect storage settings leading to corrupt backups and datastore loss. Such risks range from invalid CBT (Changed Block Tracking) configuration to inconsistent LUN numbering and incorrect UUID settings.
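
The first and third risks above come down to the same check: a setting that should be identical on every host in a cluster is not. As a rough illustration only, and not the method used in the report, the following Python sketch compares per-host settings and reports any that differ or are missing. The host names, setting keys, and the `find_drift` helper are all hypothetical; a real check would pull these values from the virtualization platform's own inventory.

```python
from typing import Dict

# Hypothetical sentinel for a setting that a host does not define at all.
MISSING = "<missing>"


def find_drift(node_configs: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Return every setting whose value is not identical on all nodes.

    The result maps setting name -> {node name: value}, using MISSING
    where a node lacks the setting entirely.
    """
    # Collect the union of setting names seen on any node.
    all_keys = set()
    for settings in node_configs.values():
        all_keys.update(settings)

    drift: Dict[str, Dict[str, str]] = {}
    for key in sorted(all_keys):
        per_node = {
            node: settings.get(key, MISSING)
            for node, settings in node_configs.items()
        }
        # More than one distinct value means the nodes disagree.
        if len(set(per_node.values())) > 1:
            drift[key] = per_node
    return drift


if __name__ == "__main__":
    # Hypothetical inventory: two hosts that should be configured identically.
    cluster = {
        "host-01": {"lun:datastore1": "12", "portgroup:vm-net": "vlan100", "file:/shared/app.cfg": "present"},
        "host-02": {"lun:datastore1": "14", "portgroup:vm-net": "vlan100"},
    }
    for setting, values in find_drift(cluster).items():
        print(setting, values)
```

In this made-up inventory the sketch would flag the inconsistent LUN number and the file that only one host can see, while the matching port group setting passes silently.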

What do these private cloud environments look like?

■ 48 percent of the organizations included in the study run their virtual machines on Windows compared to 7 percent of the organizations that run on Linux. 46 percent of the organizations use a mix of operating systems.

■ Close to three quarters (73 percent) of the organizations use EMC data storage systems. Other storage systems used include NetApp (38 percent), IBM (26 percent), HP (24 percent) and Hitachi (18 percent).

■ 27 percent of the organizations use replication for automated offsite data protection.

■ 12 percent of the organizations utilize active-active failover for continuous availability.

■ Almost all of the organizations (96 percent) use more than one physical path to transfer data between the host and the external storage device.

With growing complexity, increasing interdependence among infrastructure components, and an escalating pace of change, keeping cloud infrastructure free of risky misconfigurations is becoming a challenge that most organizations fail to meet.

"Sooner or later, every system fails," said Gil Hecht, CEO of Continuity Software. "And when a popular service goes down, it doesn't take long for customers to notice."

Each year, enterprises continue to encounter downtime, which currently costs an estimated $740,000 per outage, according to Ponemon's most recent report.

"The good news is that most risks lurking in the cloud infrastructure can be identified and corrected before they turn into a service disruption," explained Hecht. "This requires a specialized set of processes and tools, but above all a mindset and strategy focused on early detection and the remediation of risks."
