
98% of Cloud Deployments Experience Performance Issues

Jerry Melnick

A full 98% of cloud deployments experience some type of performance issue every year, according to a new survey by SIOS Technology, in partnership with ActualTech Media Research.

The survey was designed to understand current challenges and trends related to the state of performance and high availability for mission-critical applications in small, medium and large companies. A total of 390 IT professionals and decision-makers responded, collectively representing a cross-section of those responsible for managing databases, infrastructure, architecture, cloud services and software development. Tier-1 applications explicitly identified include Oracle, Microsoft SQL Server and SAP/HANA.

There are some clear trends and a few results that we didn't see coming, and that might surprise you as well:

■ Small companies are leading the way to the public cloud, with 54% planning to move more than half their mission-critical applications there by the end of 2018, compared with 42% of large companies

■ For companies of all sizes, having complete control over the application environment was cited by 60% of the respondents as a key reason why their mission-critical workloads remain on premises

■ Most (86%) organizations are using some form of failover clustering or other high availability mechanism for their mission-critical applications

■ Nearly all (95%) report having experienced a failure in their failover provisions

It's evident that organizations are finally moving their critical applications to the cloud, and at a greater pace than we could have imagined a few years ago. But they're still in the early days of adoption, placing mature operations a few years away. Here are some more details.

Misery Loves Company

A mere 2% of respondents claimed they never experience any application performance issues that ever affect any end users. The rest of us mere mortals claim to experience such issues, on average, daily (18%), 2-3 times per week (17%), once per week (10%), 2-3 times per month (15%), once per month (11%), 3-5 times per year (18%) or only once per year (8%).
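The frequency breakdown above is easier to interpret when tallied cumulatively. The following is a minimal sketch in Python, not part of the survey itself: the dictionary and function names are illustrative assumptions, and the percentages are those quoted above. Note that the quoted buckets sum to 97% rather than 98%, presumably because of rounding in the published figures.

```python
# Reported share of respondents (percent) by how often performance issues occur.
# Figures are copied from the survey summary above; all names are illustrative.
ISSUE_FREQUENCY = {
    "daily": 18,
    "2-3 times per week": 17,
    "once per week": 10,
    "2-3 times per month": 15,
    "once per month": 11,
    "3-5 times per year": 18,
    "once per year": 8,
}


def cumulative_share(freq, upto):
    """Sum the shares from the most frequent bucket through `upto` (inclusive)."""
    total = 0
    for bucket, share in freq.items():
        total += share
        if bucket == upto:
            return total
    raise KeyError(upto)


at_least_weekly = cumulative_share(ISSUE_FREQUENCY, "once per week")
at_least_monthly = cumulative_share(ISSUE_FREQUENCY, "once per month")
all_buckets = sum(ISSUE_FREQUENCY.values())

print(f"At least weekly: {at_least_weekly}%")
print(f"At least monthly: {at_least_monthly}%")
print(f"Sum of all buckets: {all_buckets}%")
```

On these figures, roughly 45% of respondents see performance issues at least weekly, and about 71% see them at least monthly.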

The responses were reasonably consistent among Decision Makers, IT Staff, and Data & Development Staff, with one notable exception: Decision Makers perceive a lower occurrence of performance issues than staff do. Nearly half (46%) of Decision Makers responded that performance issues occur 3-5 times per year or less (compared to 23-25% for staff), and only 11% responded that issues occur daily (compared to 20-21% for staff).

Rapid Response to the Rescue

One possible explanation for this apparent discrepancy is that IT Staff are made aware of performance problems by automated alerts and respond rapidly to find and fix the cause, so some issues may be resolved before Decision Makers become aware of them.

When asked about high availability provisions failing (something that is certain to affect performance!), 77% of respondents said they learn of the problem via an alert from monitoring tools, while 39% learn from a user complaint. (Note that multiple responses were permitted.)

As for remediation, it takes more than 5 hours to fix a problem only 3% of the time. Nearly a quarter (23%) are fixed in less than an hour, over half (56%) are fixed in 1-3 hours, and 18% are fixed in 3-5 hours. Small companies are able to resolve problems more quickly (31% in less than an hour) than large ones (only 11% in less than an hour), likely because small companies use the public cloud more extensively and have less complex configurations.
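The remediation figures can be read as a running total of problems fixed within each time window. This is a small illustrative sketch using the percentages quoted above; the bucket labels and variable names are assumptions made for the example.

```python
from itertools import accumulate

# Share of problems (percent) by time to fix, as quoted in the survey summary.
# Bucket labels and variable names are illustrative.
FIX_TIME_BUCKETS = [
    ("under 1 hour", 23),
    ("1-3 hours", 56),
    ("3-5 hours", 18),
    ("over 5 hours", 3),
]

labels = [label for label, _ in FIX_TIME_BUCKETS]
shares = [share for _, share in FIX_TIME_BUCKETS]
running = list(accumulate(shares))  # running total of problems fixed so far

for label, total in zip(labels, running):
    print(f"Fixed by the end of the '{label}' bucket: {total}%")
```

Read this way, about 79% of problems are fixed within 3 hours and 97% within 5 hours.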

Culprits in the Cloud

When asked about the cause of performance issues that arise in the cloud, respondents named the application or the database being used as the main culprits, which together accounted for 64% of the issues. It is important to note that this question did not distinguish who is responsible for managing the application and/or database, which would likely be the cloud service provider for a managed service. Additional causes include issues with the service provider (17%) or the infrastructure (15%). In 4% of the cases, the issue remained a mystery.

