
High-Profile IT Outages Set Alarm Bells Ringing in Boardrooms Around the World

Gregg Ostrowski
AppDynamics

In a world where digital services have become a critical part of how we go about our daily lives, the risk of suffering an outage has become even more significant. Outages range in severity and affect companies of every size. While outages at larger companies, such as social media platforms or cloud providers, tend to receive a lot of coverage, application downtime at even the most narrowly focused companies can disrupt users' personal and business operations.

In addition to putting more pressure on IT teams to resolve the issue, an outage puts the company at risk of losing revenue and customer loyalty. For many technologists, these outages have served as a reminder of how quickly these types of firestorms can ignite and how difficult they are to bring back under control.

Consumer expectations around reliability and performance for digital services have soared over the last 18 months, and most of us now have zero tolerance for anything less than the very best digital experiences. The moment we encounter a performance issue, we immediately switch to an alternative provider, and in some cases, we refuse to return. While Meta will undoubtedly recover from its recent troubles, the reputational and financial cost of any kind of outage could be crippling for some businesses.

In the wake of these recent events, Cisco AppDynamics conducted a global pulse survey of 1,000 IT decision makers (across 11 countries) to gauge whether these types of high-profile outages have increased concerns about digital disruption within their own organizations and about the adequacy of the measures they have in place to mitigate this risk.

The findings give a fascinating insight into the challenges facing enterprise technologists in today's environment. Not only did 87% admit that they are concerned about the potential for a major outage and the resulting disruption to their applications and digital services, but as many as 84% reported that they are coming under increasing pressure from their organization's leadership to proactively prevent a major performance issue or outage.

With stakes rising ever higher, the IT department has become a pressure cooker within many organizations. I know from my own time as VP of enterprise services that the burden of keeping applications and digital services up and running at all times can be all-consuming for a technologist.

What's now making this situation even more challenging is that technologists are having to look after an ever more complex IT estate, all while quickly rolling out new features and ensuring an intuitive interface and an always-available service, because users simply want it to work when they want it. The pandemic required businesses to innovate at breakneck speed in order to meet dramatically changing customer and employee needs, and this has necessitated rapid digital transformation and a seismic shift towards cloud computing over the last 18 months. The unwanted side effect of this is massive technology sprawl, with IT departments now managing a vast patchwork of legacy and cloud technologies.

For technologists tasked with optimizing IT performance, things have become much more difficult. 87% of those we polled said the increasing complexity of their IT stack is causing long delays in identifying the root cause of performance issues. They simply can't cut through the complexity and overwhelming volumes of data to quickly and accurately identify issues before they impact the end user.

High-profile outages like those we've seen over the last couple of weeks are a stark reminder for many technologists of the urgent need to address this problem before their worst fears are realized.

Encouragingly, our survey suggests that most technologists are taking steps to ensure they have the tools and insights they need to manage IT performance. 97% of IT teams currently have some form of monitoring tools in place, many of which provide sophisticated solutions for identifying and fixing anomalies.

The problem is that many technologists doubt the effectiveness of their current monitoring tools in this new world of spiraling IT complexity: only about a quarter (27%) claim to be totally confident that these tools meet their growing needs. These concerns are fully justified. Many traditional monitoring tools still don't provide a unified view of IT performance up and down the IT stack, and very few are able to effectively monitor legacy, hybrid and cloud environments.

Technologists are acutely aware they urgently need a newer approach to managing IT performance. In fact, almost three quarters (72%) believe their organization needs to deploy a full-stack observability solution within the next 12 months to enable them to solve complexity across their IT stack and to easily identify and fix the root causes of performance issues.

With full-stack observability in place, technologists can get unified, real-time visibility into IT performance up and down the IT stack, from customer-facing applications right through to core infrastructure, such as compute, storage, network, the public internet and inter-service dependencies. It also means that technologists can quickly identify the causes and locations of incidents and underperformance, rather than being on the back foot, spending valuable time trying to understand an issue.

But even with full-stack observability in place, technologists can still struggle to pinpoint those issues that really could cause serious damage. They're bombarded with a deluge of IT performance data from across their IT infrastructure and it's very difficult to cut through it to know what really matters most.

This is why having a business lens on IT performance is so important. It allows technologists to immediately identify the issues that could have the biggest impact on customers and the business and be confident knowing that they are focusing their energy in exactly the right places.
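As a rough illustration of what such a business lens might look like in practice, the sketch below ranks open incidents by an estimated revenue at risk per hour. Everything in it is an assumption made for illustration: the Incident fields, the sample figures, and the simple "affected users times revenue per user-hour" formula are hypothetical and do not describe any particular product or the survey data.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """A performance issue joined with hypothetical business metrics."""
    name: str
    affected_users: int            # users currently hitting the degraded service
    revenue_per_user_hour: float   # assumed business metric for that service


def business_impact(incident: Incident) -> float:
    """Estimated revenue at risk per hour (illustrative formula only)."""
    return incident.affected_users * incident.revenue_per_user_hour


def prioritize(incidents: list[Incident]) -> list[Incident]:
    """Return incidents ordered from highest to lowest estimated impact."""
    return sorted(incidents, key=business_impact, reverse=True)


# Made-up sample data: three simultaneous issues of very different importance.
incidents = [
    Incident("internal wiki slow", affected_users=300, revenue_per_user_hour=0.0),
    Incident("search errors", affected_users=5000, revenue_per_user_hour=0.5),
    Incident("checkout latency", affected_users=1200, revenue_per_user_hour=4.0),
]

ranked = prioritize(incidents)
for incident in ranked:
    print(f"{incident.name}: {business_impact(incident):.2f} at risk per hour")
```

In this toy data the checkout issue ranks first even though the search issue affects more users, which is the kind of distinction that raw performance data alone would not surface.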

By connecting full-stack observability with real-time business metrics, technologists can optimize IT performance at all times and ensure they're able to meet the heightened expectations of today's consumers. And hopefully it means they can sleep more soundly at night!

Gregg Ostrowski is CTO Advisor at Cisco AppDynamics

