98% of Cloud Deployments Experience Performance Issues

Jerry Melnick

A full 98% of cloud deployments experience some type of performance issue every year, according to a new survey by SIOS Technology, in partnership with ActualTech Media Research.

The survey was designed to understand current challenges and trends related to the state of performance and high availability for mission-critical applications in small, medium and large companies. A total of 390 IT professionals and decision-makers responded, collectively representing a cross-section of those responsible for managing databases, infrastructure, architecture, cloud services and software development. Tier-1 applications explicitly identified include Oracle, Microsoft SQL Server and SAP/HANA.

There are some clear trends, along with a few surprises that we didn't see coming:

■ Small companies are leading the way to the public cloud with 54% planning to move more than half their mission-critical applications there by the end of 2018, which compares to 42% of large companies

■ For companies of all sizes, having complete control over the application environment was cited by 60% of the respondents as a key reason why their mission-critical workloads remain on premises

■ Most (86%) organizations are using some form of failover clustering or other high availability mechanism for their mission-critical applications

■ Nearly all (95%) report having experienced a failure in their failover provisions

It's evident that organizations are finally moving their critical applications to the cloud, and at a greater pace than we could have imagined a few years ago. But they're still in the early days of adoption, placing mature operations a few years away. Here are some more details.

Misery Loves Company

A mere 2% of respondents claimed they never experience any application performance issues that ever affect any end users. The rest of us mere mortals claim to experience such issues, on average, daily (18%), 2-3 times per week (17%), once per week (10%), 2-3 times per month (15%), once per month (11%), 3-5 times per year (18%) or only once per year (8%).
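As a quick check on the frequency figures above, the buckets can be tallied into broader groups. This is only a sketch: the dictionary and the helper name `combined_share` are illustrative, and the percentages are simply copied from the paragraph above. The published shares add up to 99% rather than 100%, presumably because of rounding.

```python
# Percent of respondents by how often performance issues occur,
# copied from the survey figures quoted in the article.
frequency_share = {
    "daily": 18,
    "2-3 times per week": 17,
    "once per week": 10,
    "2-3 times per month": 15,
    "once per month": 11,
    "3-5 times per year": 18,
    "once per year": 8,
    "never": 2,
}


def combined_share(shares, buckets):
    """Add up the percentages for the named buckets (hypothetical helper)."""
    return sum(shares[bucket] for bucket in buckets)


weekly_buckets = ["daily", "2-3 times per week", "once per week"]
monthly_buckets = weekly_buckets + ["2-3 times per month", "once per month"]

at_least_weekly = combined_share(frequency_share, weekly_buckets)
at_least_monthly = combined_share(frequency_share, monthly_buckets)
total = sum(frequency_share.values())

print(f"At least weekly:  {at_least_weekly}%")
print(f"At least monthly: {at_least_monthly}%")
print(f"Sum of all published shares: {total}%")
```

By this tally, 45% of respondents see issues at least weekly and 71% at least monthly.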

The responses were reasonably consistent among Decision Makers, IT Staff, and Data & Development Staff with one notable exception: Decision Makers perceive a lower occurrence of performance issues than staff does. Nearly half (46%) of Decision Makers responded that performance issues occur 3-5 times per year or less (compared to 23-25% for staff), and only 11% responded that issues occur daily (compared to 20-21% for staff).

Rapid Response to the Rescue

One possible explanation for this apparent discrepancy is that IT Staff learn of performance problems through automated alerts and then respond rapidly to find and fix the cause.

When asked about high availability provisions failing (something that is certain to affect performance!), 77% of respondents said they learn of the problem via an alert from monitoring tools, while 39% learn from a user complaint. (Note that multiple responses were permitted.)

As for remediation, it takes more than 5 hours to fix a problem only 3% of the time. Nearly a quarter (23%) are fixed in less than an hour, over half (56%) are fixed in 1-3 hours, and 18% are fixed in 3-5 hours. Small companies are able to resolve problems more quickly (31% in less than an hour) than large ones (only 11% in less than an hour), likely because the former use the public cloud more extensively and have less complex configurations.
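The remediation figures are easier to compare as running totals. The sketch below assumes nothing beyond the four percentages quoted above; the variable names are illustrative.

```python
from itertools import accumulate

# Percent of problems fixed within each time window, as quoted in the article.
resolution_share = [
    ("under 1 hour", 23),
    ("1-3 hours", 56),
    ("3-5 hours", 18),
    ("over 5 hours", 3),
]

labels = [label for label, _ in resolution_share]

# Running total: the share of problems fixed by the end of each window.
cumulative = list(accumulate(share for _, share in resolution_share))

for label, running_total in zip(labels, cumulative):
    print(f"Fixed by end of '{label}' window: {running_total}%")
```

Read this way, 79% of problems are fixed within 3 hours and 97% within 5 hours.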

Culprits in the Cloud

When asked about the cause of performance issues that arise in the cloud, respondents named the application or the database being used as the main culprits, which together accounted for 64% of the issues. It is important to note that this question did not distinguish who is responsible for managing the application and/or database, which would likely be the cloud service provider for a managed service. Additional causes include issues with the service provider (17%) or the infrastructure (15%). In 4% of the cases, the issue remained a mystery.

