The Changing Face of Network Downtime

Vess Bakalov

Our connected world continues to transform into a mobile one. The network is a constant and fascinating companion, which grants us 24/7 access where communication is instant and takes place across an array of devices, unconstrained by physical barriers. As a result, the IT infrastructure is more critical than ever for business operations. Companies and organizations are calling upon a variety of technologies that are changing the face of today’s network — from mobile devices, to cloud services, to web-based applications.

And the strain on the network is not expected to decrease. In fact, Cisco reports that in two years, the number of devices connected to IP networks will be nearly three times that of the global population. At the same time, network management and performance challenges are also on the rise. The explosion of mobile, cloud and web-based apps makes it difficult to determine where, in today’s evolving world, the network begins and where it ends. As a result, service issues and outages are becoming more commonplace, prompting losses in revenue, customer satisfaction and employee productivity. A recent survey from Avaya speaks to the cost of network downtime. It notes that the cost varies widely with the characteristics of a business and its environment (e.g., your vertical and risk tolerance), and puts the range at $140K to $540K per hour.

Over the past couple of months, we’ve seen high-profile network outages capturing headlines across the US. A large number of service providers were affected by the 512K Day issue – when the Internet routing table grew beyond what many legacy routers were designed to handle. Then, in August, more than 11 million Time Warner Cable (TWC) subscribers across 29 states lost service for about three hours, and just a week later, Facebook suffered its fourth outage in five months. In two of these three cases, the outage was blamed on configuration glitches and, as a result, was quickly resolved.

The Most Important Word for Every Network: Availability

But why do network outages seem to be popping up more frequently, affecting more people? It’s really a question of perception – more people are consuming more services and everyone expects to be connected around the clock, around the world, using any device.

In a blog post earlier this summer, Andrew Lerner, a Research Director for Gartner, zeroed in on the most important word associated with every network: availability. As he notes, “Performance, scalability, management, agility, etc. all require the network to actually be online.”

Unfortunately, availability is assumed to be table stakes by most companies. I am not sure I agree with him entirely. Availability is table stakes. However, modern infrastructure, especially in service providers, is massively redundant. Pure availability is rarely the problem. More often, service outages are due to poor capacity planning, spurious events or changes that bring unanticipated consequences (like Pakistan inadvertently re-routing all YouTube traffic).

For smaller businesses in particular, unavailability of core services not only represents a loss of control and a loss of earnings, but also potentially a lesson in reputational damage. Without network performance management solutions, businesses are unnecessarily exposing themselves to risk. Technology should be detecting and even preventing outages automatically, without the need for manual intervention. Technical staff cannot be expected to continually gather and analyze data that might indicate an impending outage, nor can they be expected to act quickly enough to stave off an incident. While the likes of TWC and Facebook can rapidly recover from disruptive infrastructure issues, smaller organizations can’t, and that is why they must take steps to protect themselves.

Reacting to performance thresholds is not enough. To ensure a company’s network is available 24/7, it’s critical to predict problems before they become service-impacting. Deploying solutions that log data and provide real-time analytics on large volumes of unstructured data is crucial to every IT department. These solutions give IT organizations the opportunity to gain better insight into the behavior of users, customers, applications and networks, allowing businesses to spot issues before they happen – significantly reducing, or in some cases eliminating, downtime altogether.
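
To make the difference between reacting to a threshold and predicting a problem concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: it assumes evenly spaced utilization samples (for example, link utilization in percent, one reading per hour), fits a simple least-squares trend line, and estimates how long until that trend would cross a capacity threshold. The function name, the sample values and the 90 percent threshold are all hypothetical.

```python
from typing import Optional, Sequence


def hours_until_breach(
    samples: Sequence[float],
    threshold: float,
    interval_hours: float = 1.0,
) -> Optional[float]:
    """Estimate hours from the latest sample until the trend crosses threshold.

    Assumes samples are evenly spaced, interval_hours apart (an assumption
    made for this sketch). Returns 0.0 if the latest sample is already at or
    above the threshold, and None if the fitted trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    # A purely reactive check would stop here: alert only once breached.
    if samples[-1] >= threshold:
        return 0.0

    # Ordinary least-squares fit of y = intercept + slope * index.
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None  # no upward trend, so no projected breach

    intercept = mean_y - slope * mean_x
    crossing_index = (threshold - intercept) / slope
    remaining_steps = max(crossing_index - (n - 1), 0.0)
    return remaining_steps * interval_hours


if __name__ == "__main__":
    # Hypothetical hourly utilization readings, in percent.
    utilization = [50, 55, 60, 65, 70]
    print(hours_until_breach(utilization, threshold=90))
```

A straight-line fit is the simplest possible forecast. Real traffic is seasonal and bursty, so a production system would likely need a more robust model, but even this sketch shows the point: a warning can be raised hours before a static threshold alarm would ever fire.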

Vess Bakalov is SVP, CTO and Co-Founder of SevOne.
