China's Web Outage: The Latest Earthquake to Rock the Internet

February 18, 2014

On Tuesday, January 21, one of the biggest Internet outages in history, if not the largest, struck China. The web was essentially unavailable for one full business day in one of the world's strongest and fastest-growing economies.

The initial reaction of the international press was somewhat muted; after all, the event seemed only marginally important to web users outside China. But the fact is, while approximately 500 million Chinese web users were undoubtedly affected, every company that does online business in China was hurt as well.

Consider a company like Porsche, which has been experiencing double-digit revenue growth in China over the past few months. The hit on revenues and even brand image for such companies – even though the outage was completely beyond their control – was likely significant. Not to mention, major global businesses advertising on Chinese sites forfeited hefty investments that day.

A Look Inside the Outage

So what exactly happened? At around 3 p.m. local time on January 21, two-thirds of all domain requests in China were routed to a single IP address in Wyoming, which promptly collapsed under load. This was believed to be a domain name system (DNS) attack, the biggest of its type in history. Not all domains were affected; mainly it was those ending in .com and .net, while those ending in .com.cn were partially affected.
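
One symptom described above, many unrelated domains suddenly resolving to the same address, can be checked mechanically. The following is a minimal sketch, not a description of how the outage was actually detected: the function name `dominant_answer`, the 50 percent threshold and the sample data are all assumptions made for illustration, and the addresses come from ranges reserved for documentation. Shared hosting and CDNs can legitimately put many domains behind one address, so a hit is only a prompt for investigation.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def dominant_answer(
    resolved: Dict[str, str], threshold: float = 0.5
) -> Optional[Tuple[str, float]]:
    """Return (ip, share) when a single IP answers for more than
    `threshold` of the domains checked; otherwise return None."""
    if not resolved:
        return None
    counts = Counter(resolved.values())
    ip, hits = counts.most_common(1)[0]
    share = hits / len(resolved)
    return (ip, share) if share > threshold else None


# Hypothetical resolver answers (domain -> IP), for illustration only.
sample = {
    "shop.example.com": "203.0.113.7",
    "news.example.net": "203.0.113.7",
    "mail.example.com": "203.0.113.7",
    "portal.example.com.cn": "198.51.100.20",
}

suspicious = dominant_answer(sample)  # three of four domains share one IP
```

In practice the `resolved` mapping would be filled from lookups made at several vantage points, since a poisoned answer may only be visible from inside the affected region.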

Unfortunately, most of the Chinese websites that were not directly impacted ended up going down as well. Here's why: many of the affected domains hosted third-party services relied upon by thousands of Chinese websites.

One example is analytics engines. Never mind that the analytics engines weren't working, meaning that companies lost out on a whole day's worth of data that could have been used to increase conversions. That was just the collateral damage. Like dominoes, these "poisoned" third-party services brought down the websites integrating them, even those websites that were not directly affected by the attack.

Another third-party service that went dark was PayPal. This meant that any website integrating PayPal on its back end could not process transactions for a full eight hours, which was a moot point anyway because these websites were likely inaccessible.

In this sense, the Chinese outage was a perfect case in point of what Compuware APM has been evangelizing for a long time: the increased complexity and interdependency of the modern web can turn even the most well-run and well-developed website into a house of cards, on the verge of collapse at any moment.

But these days, reliance on third-party services is a way of life. These services enable website and web application developers to bring cutting-edge services to market quickly and cost-effectively, without the burden of having to develop them from scratch. However, the China example highlights how that reliance comes with the downside of increased vulnerability and fragility.

Lessons Learned

In this era of increased interdependency, what can an organization do to better protect and insulate its web performance?

Organizations need to be better about getting ahead of website performance issues: Given all the performance-impacting elements standing between the data center and the end user (the cloud, CDNs, ISPs, devices and browsers), the end-user perspective is the only reliable vantage point from which to gauge performance. Next-generation application performance management (APM) tools can deliver this view, and it's important to work with technology providers that offer performance views across key geographies and user segments.

Organizations must closely evaluate and monitor third-party services: Before a third-party service is enlisted, organizations should carefully test its performance. One way is to measure website performance before and after the third-party service is added, then gauge the overall performance impact. If performance degradation is identified, organizations must work with the third-party provider to fix the problem before the service is implemented.
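
The before-and-after comparison can be as simple as comparing median page load times from two test runs. The sketch below assumes load times in milliseconds have already been collected by some measurement tool; the function name `relative_change`, the sample figures and the 10 percent budget are hypothetical and chosen only to show the arithmetic.

```python
from statistics import median


def relative_change(before_ms, after_ms):
    """Relative change in median load time once the service is added.

    0.25 means the median page load became 25 percent slower.
    """
    base = median(before_ms)
    return (median(after_ms) - base) / base


# Hypothetical page load times in milliseconds, for illustration only.
before = [1200, 1180, 1250, 1210, 1190]  # without the third-party service
after = [1500, 1480, 1530, 1510, 1490]   # with the third-party service

change = relative_change(before, after)

# Assumed budget: flag anything that slows the page by more than 10 percent.
exceeds_budget = change > 0.10
```

The median is used here rather than the mean so that a single slow sample does not dominate the comparison; other percentiles may suit sites that care most about their slowest users.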

Monitoring third-party services in production is also important, not only to validate SLAs but also to identify third-party performance issues as they occur and take appropriate action.
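
As a rough illustration of SLA validation, the sketch below scores a batch of probe results against an agreed latency limit and compliance target. The `Probe` record, the `sla_report` function and every threshold shown are assumptions made for the example; real SLA terms vary by provider and contract.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    ok: bool           # did the request to the third party succeed?
    latency_ms: float  # observed response time in milliseconds


def sla_report(probes, max_latency_ms, min_compliance):
    """Return (compliance, met): the share of probes that succeeded within
    the latency limit, and whether that share meets the target."""
    if not probes:
        raise ValueError("no probe results to evaluate")
    good = sum(1 for p in probes if p.ok and p.latency_ms <= max_latency_ms)
    compliance = good / len(probes)
    return compliance, compliance >= min_compliance


# Hypothetical results: eight fast successes, one slow response, one failure.
results = [Probe(True, 120.0)] * 8 + [Probe(True, 900.0), Probe(False, 0.0)]

compliance, met = sla_report(results, max_latency_ms=500.0, min_compliance=0.99)
```

Running the same check per region or per user segment would show whether a problem is local, which matters for an event like the one in China.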

As the China example illustrates, the "ripple effect" of third-party performance issues is often unavoidable. But that doesn't mean the impact can't be thwarted or minimized. That is, when a serious performance problem is detected, organizations should have contingency plans in place so that offending third-party services can quickly be removed. While they can be extremely valuable when performing well, many third-party services (such as analytics) are not worth having if it means frustrating customers.
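
One common way to make such a contingency plan automatic is a circuit breaker: after repeated failures, the site stops calling the third-party service for a while instead of waiting on it. The following is a simplified sketch under stated assumptions, not a production implementation; the class name `ThirdPartyBreaker`, the failure limit and the cooldown period are hypothetical, and a fuller breaker would usually add a trial ("half-open") state before resuming normal traffic.

```python
import time


class ThirdPartyBreaker:
    """Skip a third-party integration after repeated failures."""

    def __init__(self, max_failures=3, cooldown_s=60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock      # injectable so the breaker can be tested
        self.failures = 0
        self.opened_at = None   # time at which the service was switched off

    def allow(self):
        """Return True if the third-party call should be attempted."""
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown_s:
            # Cooldown elapsed: reset and let calls through again.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record(self, success):
        """Record the outcome of a call to the third-party service."""
        if success:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()
```

A page renderer would call `allow()` before including, say, an analytics tag, and `record()` after each attempt, so that a failing service is dropped from pages without manual intervention.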

The end-user experience needs to be top-of-mind in all third-party service decisions: In general, websites should keep third-party services to a minimum. Before adding a third-party service, organizations always need to ask themselves whether the added feature or functionality is worth the potential increase in overall vulnerability and lost conversions.

In this vein, there needs to be constant communication between performance monitoring teams and the teams that request and depend on these third-party services. This is the key to making the smartest decisions that will protect and promote revenues above all else.

Additionally, when a third-party service is implemented, there are design steps organizations can take to proactively reduce risk exposure. For example, by understanding the load order of elements on a site and making sure third-party services and applications are at the bottom of that order, organizations can protect and enhance the load time customers perceive, even when a third-party service does suddenly go awry.
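
The load-order idea can be sketched as a build-time step that moves third-party scripts to the end of the list while leaving everything else in its original order. This is an illustration under assumptions: the function name `third_party_last`, the host `www.example.com` and the script URLs are all hypothetical, and ordering is only one of several techniques for keeping a slow third party from blocking the page.

```python
from urllib.parse import urlparse


def third_party_last(urls, first_party_host):
    """Return the URLs with third-party resources moved to the end.

    The relative order within each group is preserved.
    """
    def is_third_party(url):
        host = urlparse(url).hostname
        # Relative URLs carry no host and are treated as first-party.
        return host is not None and host != first_party_host

    # sorted() is stable and False sorts before True, so first-party
    # resources keep their order and come first.
    return sorted(urls, key=is_third_party)


# Hypothetical script list for a page served from www.example.com.
scripts = [
    "https://analytics.example.net/tracker.js",
    "/js/app.js",
    "https://www.example.com/js/cart.js",
    "https://widgets.example.org/chat.js",
]

ordered = third_party_last(scripts, "www.example.com")
```

With this ordering, the site's own scripts are requested before the analytics and chat scripts, so a stalled third party delays less of what the customer sees first.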

As a final note here, to ensure better performance for feature-rich websites and applications, many organizations rely on content delivery networks (CDNs) strategically located in key geographies. Ironically, CDNs represent another third-party service and another potential point of failure. Here, again, measuring performance from the true end-user perspective, on the other side of a CDN, is critical to protecting and maximizing these investments.

Leverage industry resources: Look for free services that identify third-party service outages and the corresponding regional impacts. Services like these may not prevent major outages from happening, but they can help organizations at least see when a widespread performance issue is not their own, and give them a head start in putting contingency plans into place and communicating proactively with customers.

Conclusion

In summary, to a certain extent, major web events like the one that just happened in China are unavoidable. But in many cases, the corresponding impact on modern websites can be anticipated, contained and minimized with the right approaches.

As a first step, organizations must understand the true end-user experience and the resulting business impact, so performance problems can be prioritized for remediation. From there, organizations must be able to correlate performance issues to the broadest possible range of variables both within and outside the firewall, including third-party services, and take appropriate action. It is critical to understand what can and cannot be controlled, and focus on addressing and fixing what is possible. In many cases, this can help organizations avoid going down with the proverbial ship.

Heiko Specht is a Technology Expert at the Compuware APM Center of Excellence.
