Start with Part One of this article: Down Goes the Internet (Again) – Are You Ready?
In this era of unprecedented complexity, it's virtually impossible for a modern website to eliminate all the risk associated with using third parties. However, there are proactive strategies an organization can implement to better manage and minimize its risk. These include:
1. Proactively monitor speed and availability
Proactively monitor the speed and availability of websites, web applications and mobile sites from the true end-user perspective.
Today, there are so many elements out there on the web that stand between your data center and your users, including not just third-party services, but content delivery networks (CDNs), local and regional ISPs, mobile carrier networks and browsers, for example. Measuring performance from your data center alone is insufficient – unless, of course, your users live in your data center, which is highly unlikely.
The true browser-based perspective is the only place where you can accurately gauge your users' experience at the end of an extremely long and complicated technology path known as the application delivery chain. Today's new-generation application performance management (APM) solutions are based on this true user perspective.
2. Monitor all transactions
Monitor all transactions, 24x7, along the complete application delivery chain. Sampling is not a sufficient means of gauging performance, because a major performance issue may very well occur outside your testing interval – think of the Amazon EC2 outage that impacted Netflix on Christmas day last year!
Due to the unpredictability of major service outages, you need to be monitoring all transactions around the clock, to identify all performance aberrations and their root causes – both within and beyond the firewall – quickly and accurately, and get ahead of them.
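To make the sampling problem concrete, here is a minimal Python sketch. The availability log, the 45-minute outage window and the hourly sampling interval are all hypothetical values chosen for illustration. They are not measurements from any real incident.

```python
def detect_failures(observations, step=1):
    """Return the minutes at which a failure is seen when checking every `step` minutes."""
    return [
        minute
        for minute in range(0, len(observations), step)
        if not observations[minute]
    ]


# Hypothetical log: one entry per minute over four hours, True = transaction succeeded.
availability = [True] * 240
for minute in range(65, 110):  # assumed 45-minute outage
    availability[minute] = False

sampled = detect_failures(availability, step=60)    # hourly spot checks
continuous = detect_failures(availability, step=1)  # every minute observed

print("failures seen by hourly sampling:", len(sampled))
print("failures seen by continuous monitoring:", len(continuous))
```

In this made-up scenario the hourly checks land at minutes 0, 60, 120 and 180, so the outage falls entirely between two samples and goes unnoticed, while continuous monitoring sees every failed minute.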
3. Baseline and uphold performance-focused SLAs
Service-level agreements (SLAs) promising a certain level of availability on the part of a third-party service provider mean very little when it comes to performance.
For example, just because your cloud service provider's servers are up and running does not mean your users are experiencing an acceptable level of speed and reliability. Remember, third-party services of all types are serving thousands of customers like you around the globe, and a spike in another customer's traffic may impact you.
With little insight into third-party service providers' capacity planning decisions, you need to monitor performance levels yourself to ensure they don't drop off, and validate these against performance-focused SLAs. To get a sense of how a third-party service provider may be impacting your overall performance, it can be helpful to compare your site's speed and availability before the third-party service is added with the same measurements afterwards.
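A before-and-after comparison of this kind can be sketched in a few lines of Python. The load-time samples, the 2.0-second threshold and the function names below are assumptions made for illustration. A real performance-focused SLA would define its own metric and target.

```python
import math
from statistics import median


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def compare_baseline(before, after, sla_p95_seconds):
    """Summarize page load times before and after adding a third-party service."""
    return {
        "before_median": median(before),
        "after_median": median(after),
        "before_p95": percentile(before, 95),
        "after_p95": percentile(after, 95),
        "sla_met": percentile(after, 95) <= sla_p95_seconds,
    }


# Hypothetical page load times in seconds, as measured from the end-user perspective.
before = [1.1, 1.2, 1.0, 1.3, 1.2, 1.1, 1.4, 1.2, 1.3, 1.1]
after = [1.4, 1.6, 1.5, 2.9, 1.5, 1.7, 1.6, 3.4, 1.5, 1.6]

report = compare_baseline(before, after, sla_p95_seconds=2.0)
for name, value in report.items():
    print(name, value)
```

The sketch reports a high percentile alongside the median because a third-party slowdown often shows up in the slowest page loads first, which an average or an availability-only SLA can hide.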
4. Utilize industry resources
Utilize industry resources to better assess if the source of a performance problem lies with you or one of your third-party service providers, as well as the likely performance impact on your customers.
These resources may not prevent third-party service outages from happening, but they can help companies better understand the source of performance problems so they can get in front of them more confidently and efficiently.
Conclusion
The reality is that the delivery chain underlying the services we often take for granted is so tenuous that it's a marvel they don't break down more often. While outages may be inevitable, this does not make them any less costly or damaging to a company's reputation and revenues.
For example, on August 19, Amazon's North American retail site went down for about 49 minutes, with visitors greeted with the word “oops.” No explanation was given, but one estimate by Forbes put the cost to Amazon at nearly $2 million in sales.
But it's not just the “big guys” like Amazon that you need to focus on. The fact is that little storms are happening on the internet all the time, and you need to be prepared for them. When it comes to surviving and thriving in the age of increasing web complexity, an ounce of prevention can be worth a pound of cure. By taking advantage of several relatively simple and inexpensive approaches, organizations can better exploit all that third-party services have to offer, while reducing the inherent risks.
Klaus Enzenhofer is Technology Strategist for Compuware APM’s Center of Excellence.