
The Changing Face of Network Downtime

Vess Bakalov

Our connected world continues to transform into a mobile one. The network is a constant and fascinating companion that gives us 24/7 access: communication is instant and takes place across an array of devices, unconstrained by physical barriers. As a result, IT infrastructure is more critical than ever to business operations. Companies and organizations are calling upon a variety of technologies that are changing the face of today’s network, from mobile devices to cloud services to web-based applications.

And the strain on the network is not expected to decrease. In fact, Cisco reports that in two years, the number of devices connected to IP networks will be nearly three times the global population. At the same time, network management and performance challenges are also on the rise. The explosion of mobile, cloud and web-based apps makes it difficult to determine where, in today’s evolving world, the network begins and where it ends. As a result, service issues and outages are becoming more commonplace, prompting losses in revenue, customer satisfaction and employee productivity. A recent survey from Avaya speaks to the cost of network downtime: it varies widely with the characteristics of a business and its environment (such as vertical and risk tolerance), ranging from $140K to $540K per hour.

Over the past couple of months, we’ve seen high-profile network outages capture headlines across the US. A large number of service providers were affected by the 512K Day issue, when the Internet routing table grew beyond what many legacy routers were designed to handle. Then, in August, more than 11 million Time Warner Cable (TWC) subscribers across 29 states lost service for about three hours, and just a week later, Facebook suffered its fourth outage in five months. Unavailability in two of these three cases was blamed on configuration glitches and, as a result, was quickly resolved.

The Most Important Word for Every Network: Availability

But why do network outages seem to be popping up more frequently, affecting more people? It’s really a question of perception – more people are consuming more services and everyone expects to be connected around the clock, around the world, using any device.

In a blog post earlier this summer, Andrew Lerner, a Research Director for Gartner, zeroed in on the most important word associated with every network: availability. As he notes, “Performance, scalability, management, agility, etc. all require the network to actually be online.”

Unfortunately, availability is assumed to be table stakes by most companies. I am not sure I agree with Lerner entirely. Availability is table stakes. However, modern infrastructure, especially in service providers, is massively redundant. Pure availability is rarely the problem. More often, service outages are due to poor capacity planning, spurious events or changes that bring unanticipated consequences (like Pakistan inadvertently re-routing all YouTube traffic).

For smaller businesses in particular, unavailability of core services not only represents a loss of control and a loss of earnings, but also potentially a lesson in reputational damage. Without network performance management solutions, businesses are unnecessarily exposing themselves to risk. Technology should be detecting and even preventing outages automatically, without the need for manual intervention. Technical staff cannot be expected to continually gather and analyze data that might indicate an impending outage, nor can they be expected to act quickly enough to stave off an incident. While the likes of TWC and Facebook can rapidly recover from disruptive infrastructure issues, smaller organizations can’t, and that is why they must take steps to protect themselves.

Reacting to performance thresholds is not enough. To ensure a company’s network is available 24/7, it’s critical to predict problems before they become service-impacting. The deployment of solutions that log data and provide real-time analytics on large volumes of unstructured data is crucial to every IT department. These solutions give IT organizations better insight into the behavior of users, customers, applications and networks, allowing businesses to spot issues before they happen and significantly reduce, or in some cases eliminate, downtime.
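
To make the difference between reacting and predicting concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the function name `hours_until_threshold`, the evenly spaced utilization samples and the straight-line trend are all assumptions made for the example. Real performance management tools would typically use richer baselines than a single linear fit.

```python
from typing import Optional, Sequence


def hours_until_threshold(
    samples: Sequence[float], threshold: float, interval_hours: float = 1.0
) -> Optional[float]:
    """Estimate hours until a linear trend through `samples` reaches `threshold`.

    Hypothetical helper for illustration. `samples` are evenly spaced
    utilization readings (for example, link utilization in percent).
    Returns 0.0 if the latest sample is already at or above the threshold,
    and None if there are too few samples or the trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        return None
    if samples[-1] >= threshold:
        return 0.0

    # Ordinary least-squares fit of y = slope * x + intercept, with x = 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x

    # Sample index at which the fitted line crosses the threshold,
    # converted to hours measured from the most recent sample.
    crossing_x = (threshold - intercept) / slope
    return max(0.0, (crossing_x - (n - 1)) * interval_hours)
```

A purely reactive check would stay silent on a link reading 50, 55, 60, 65 and 70 percent against a 90 percent threshold, because no sample has crossed it yet. The trend-based estimate flags that, if the growth continues, the threshold is only a few hours away, which leaves time to act before users notice.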

Vess Bakalov is SVP, CTO and Co-Founder of SevOne.

