The Changing Face of Network Downtime
October 02, 2014

Vess Bakalov
SevOne

Our connected world continues to transform into a mobile one. The network is a constant and fascinating companion that gives us 24/7 access: communication is instant and takes place across an array of devices, unconstrained by physical barriers. As a result, IT infrastructure is more critical than ever to business operations. Companies and organizations are adopting a variety of technologies that are changing the face of today’s network, from mobile devices to cloud services to web-based applications.

And the strain on the network is not expected to decrease. In fact, Cisco reports that in two years, the number of devices connected to IP networks will be nearly three times the global population. At the same time, network management and performance challenges are also on the rise. The explosion of mobile, cloud and web-based apps makes it difficult to determine where, in today’s evolving world, the network begins and where it ends. As a result, service issues and outages are becoming more commonplace, prompting losses in revenue, customer satisfaction and employee productivity. A recent survey from Avaya speaks to the cost of network downtime: the figure varies widely with the characteristics of a business and its environment (e.g., your vertical and risk tolerance), ranging from $140K to $540K per hour.
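To put the per-hour range in perspective, here is a minimal Python sketch that scales it to an outage of a given length. The function name and the assumption that cost grows linearly with duration are mine, for illustration only; the survey reports an hourly range, not a cost model.

```python
from typing import Tuple

# Hourly downtime cost range cited from the Avaya survey (US dollars).
LOW_PER_HOUR = 140_000
HIGH_PER_HOUR = 540_000


def downtime_cost_range(hours: float) -> Tuple[float, float]:
    """Return a (low, high) cost estimate for an outage of `hours` hours.

    Assumption: cost scales linearly with duration. Real losses depend on
    the business, time of day and which services are affected.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * LOW_PER_HOUR, hours * HIGH_PER_HOUR


if __name__ == "__main__":
    low, high = downtime_cost_range(3)  # a hypothetical three-hour outage
    print(f"${low:,.0f} to ${high:,.0f}")
```

Under that linear assumption, a hypothetical three-hour outage would land somewhere between $420K and $1.62M.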

Over the past couple of months, we’ve seen high-profile network outages capture headlines across the US. A large number of service providers were affected by the 512K Day issue, when the Internet routing table grew beyond what many legacy routers were designed to handle. Then, in August, more than 11 million Time Warner Cable (TWC) subscribers across 29 states lost service for about three hours, and just a week later, Facebook suffered its fourth outage in five months. In two of these three cases, the unavailability was blamed on configuration glitches and, as a result, was quickly resolved.

The Most Important Word for Every Network: Availability

But why do network outages seem to be popping up more frequently, affecting more people? It’s really a question of perception – more people are consuming more services and everyone expects to be connected around the clock, around the world, using any device.

In a blog post earlier this summer, Andrew Lerner, a Research Director for Gartner, zeroed in on the most important word associated with every network: availability. As he notes, “Performance, scalability, management, agility, etc. all require the network to actually be online.”

His point is that, unfortunately, most companies assume availability as table stakes. I am not sure I agree with him entirely. Availability is table stakes. However, modern infrastructure, especially at service providers, is massively redundant. Pure availability is rarely the problem. More often, service outages are due to poor capacity planning, spurious events or changes that bring unanticipated consequences (like Pakistan inadvertently re-routing all YouTube traffic).

For smaller businesses in particular, unavailability of core services not only represents a loss of control and a loss of earnings, but also potentially a lesson in reputational damage. Without network performance management solutions, businesses are unnecessarily exposing themselves to risk. Technology should be detecting and even preventing outages automatically, without the need for manual intervention. Technical staff cannot be expected to continually gather and analyze data that might indicate an impending outage, nor can they be expected to act quickly enough to stave off an incident. While the likes of TWC and Facebook can rapidly recover from disruptive infrastructure issues, smaller organizations can’t, and that is why they must take steps to protect themselves.

Reacting to performance thresholds is not enough. To ensure a company’s network is available 24/7, it’s critical to predict problems before they become service-impacting. Deploying solutions that log data and provide real-time analytics on large volumes of unstructured data is crucial for every IT department. These solutions give IT organizations better insight into the behavior of users, customers, applications and networks, allowing businesses to spot issues before they happen and significantly reduce or, in some cases, eliminate downtime altogether.
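As a rough illustration of the difference between reacting to a threshold and predicting a breach, the Python sketch below fits a straight line to recent, evenly spaced utilization samples and estimates how long until the trend reaches a limit. The function name, the sample data and the choice of a simple least-squares trend are assumptions made for this example; production analytics would typically account for seasonality, bursts and noise.

```python
from typing import Optional, Sequence


def hours_until_threshold(samples: Sequence[float], threshold: float,
                          interval_hours: float = 1.0) -> Optional[float]:
    """Estimate hours until a metric reaches `threshold`.

    Fits a least-squares line to evenly spaced samples. Returns 0.0 if the
    fitted current value is already at or above the threshold, and None if
    there are too few samples or the trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx                       # change per sampling interval
    current = mean_y + slope * ((n - 1) - mean_x)  # fitted latest value
    if current >= threshold:
        return 0.0
    if slope <= 0:
        return None
    return (threshold - current) / slope * interval_hours


if __name__ == "__main__":
    # Hypothetical hourly link utilization, in percent.
    utilization = [50, 52, 54, 56, 58, 60]
    eta = hours_until_threshold(utilization, threshold=90)
    print(f"Projected to reach 90% in about {eta:.0f} hours")
```

A static alert at 90% would stay silent on this data, while the trend already suggests when capacity will run out, which leaves time to act before users notice.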

Vess Bakalov is SVP, CTO and Co-Founder of SevOne.
