Are SDKs Crashing Your Apps? Adopt Defensive Programming to Protect Against Outages

James Smith
SmartBear

In summer 2020, changes to a Facebook API triggered a series of major mobile app crashes worldwide. Popular iOS apps including Spotify, Pinterest, TikTok, Venmo, Tinder and DoorDash, among others, failed immediately upon being opened, leaving millions of users without access to their favorite services. However, the API itself wasn't at fault; Facebook's iOS software development kit (SDK) was responsible for the crashes. The updated API simply exposed an existing (and until then, hidden) bug in Facebook's SDK that prevented apps from authenticating and opening.

Mobile apps rely heavily on SDKs from major tech platforms such as Google, Microsoft, Apple and Facebook. For instance, the majority of leading consumer apps have some kind of Facebook integration, such as "Log in with Facebook" or "Share on Facebook" features. These integrations typically go even further than just login or sharing features — developers also connect apps to Facebook to manage how those apps are advertised on the platform and view detailed audience data to optimize those ads. With all these links, consumer apps tend to be highly integrated with the Facebook SDK. As a result, any bug in that SDK can cause a total outage for these apps.

Several weeks before the Facebook SDK mishap, a similar situation unfolded involving the Google Maps SDK. Ridesharing and delivery apps are highly integrated with the Google Maps SDK to leverage its mapping capabilities. Due to a bug in the SDK, prominent apps like Lyft and GrubHub experienced significant outages across the globe.

Incidents like these two outages create a nightmare scenario for the companies whose apps were impacted, especially since consumers today have high expectations for mobile app performance and little tolerance for unstable apps. When an app repeatedly fails to launch, users become much more likely to delete it from their device and may never download it again. For major consumer apps with massive user bases like Spotify or GrubHub, these app crashes can lead to millions of dollars in lost revenue.

In cases like these, an app team's first instinct is to look internally. Software engineers are used to their own coding errors causing crashes, so when something goes wrong, they'll first comb through their own code to identify the bug. This is a long and challenging process, especially for apps that have many different engineering teams working in silos. When an external SDK is the cause of the problem, these teams can spend hours searching their own code for a bug that isn't there.

Engineers must realize that software bugs in external SDKs cause app crashes more often than many expect. When an app outage impacts a broad segment of users, teams should inspect their own code but also consider early on that an SDK could be responsible. Understanding this can save valuable time and resources and help get the app functioning again faster.

More importantly, engineers must also take proactive measures to protect their users' experience. Adopting defensive programming strategies can prevent SDK bugs from crashing their apps. Defensive programming is an approach to software development that anticipates and mitigates the impact of failing SDKs on apps. With this method, engineers incorporate capabilities that allow their apps to automatically change how they handle malformed data from outside servers.
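One way to picture "handling malformed data from outside servers" is a parser that validates every field and falls back to safe defaults instead of raising. The sketch below is illustrative only: the `SdkConfig` type, its field names and the JSON payload shape are assumptions made for this example, not part of any real SDK.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SdkConfig:
    # Hypothetical settings an app might receive from a third-party server.
    login_enabled: bool = False
    sharing_enabled: bool = False


# Safe defaults used whenever the server response cannot be trusted.
DEFAULT_CONFIG = SdkConfig()


def parse_sdk_config(raw):
    """Return a validated config, or DEFAULT_CONFIG if the payload is malformed."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        # Not JSON at all (or not even a string): fall back instead of crashing.
        return DEFAULT_CONFIG
    if not isinstance(data, dict):
        # Valid JSON, but not the object shape the app expects.
        return DEFAULT_CONFIG
    login = data.get("login_enabled")
    sharing = data.get("sharing_enabled")
    if not isinstance(login, bool) or not isinstance(sharing, bool):
        # Missing or wrongly typed fields are treated as junk data.
        return DEFAULT_CONFIG
    return SdkConfig(login_enabled=login, sharing_enabled=sharing)
```

The design choice here is to reject the whole payload when any field is suspect, so the app never runs on a half-valid configuration.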

Feature flagging is key to defensive programming. One common technique uses feature flags to remotely turn SDKs on or off (also known as a "kill switch" capability). In the case of the faulty Facebook SDK, this would have allowed engineers to quickly turn off the malfunctioning SDK. With the SDK off, apps would have simply skipped the Facebook initialization during launch. Similarly, engineers could have used feature flags to make apps revert to a default setting when Facebook's server responded with junk data. Either way, the apps would have opened and run properly.
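A kill switch of this kind can be sketched in a few lines. The example below assumes a hypothetical flag named `third_party_sdk_enabled` that has already been fetched from a remote flag service, and a hypothetical `init_sdk` callable standing in for the vendor's initialization call; neither comes from a real SDK.

```python
from typing import Callable, List, Mapping


def launch_app(flags: Mapping[str, bool], init_sdk: Callable[[], None]) -> List[str]:
    """Run the launch sequence, initializing the SDK only if its flag is on."""
    steps = ["core_started"]
    # Hypothetical flag name; a missing flag is treated as "off".
    if flags.get("third_party_sdk_enabled", False):
        try:
            init_sdk()
            steps.append("sdk_initialized")
        except Exception:
            # An SDK error is recorded but does not block launch.
            steps.append("sdk_failed_skipped")
    else:
        # Kill switch engaged: skip SDK initialization entirely.
        steps.append("sdk_skipped")
    steps.append("app_ready")
    return steps


def broken_sdk() -> None:
    # Stand-in for an SDK whose initialization fails.
    raise RuntimeError("SDK init failed")


# With the flag off, the faulty SDK is never touched and launch completes.
print(launch_app({"third_party_sdk_enabled": False}, broken_sdk))
```

Note that in a real mobile app an SDK bug may crash the process natively rather than raise a catchable error, which is why the flag check that skips initialization matters more than the `try`/`except` around it.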

A/B testing is also an important component of defensive programming. Engineers can vet SDKs using A/B test flags to understand how an SDK impacts an app's stability. If the SDK appears to cause an app to crash often, then it probably shouldn't be used. With this sort of insight, engineers can determine whether they should integrate a certain SDK with an app.
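As a rough illustration of vetting an SDK with an A/B test flag, the sketch below assigns users deterministically to an "sdk_on" or "sdk_off" cohort and compares crash rates between the two. The function names, the experiment name and the 0.5 percentage point threshold are assumptions chosen for this example; a real evaluation would also need enough sessions for the difference to be statistically meaningful.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, rollout_percent: int) -> str:
    """Deterministically place a user in the 'sdk_on' or 'sdk_off' cohort."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Map the hash to a stable bucket in the range 0..99.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "sdk_on" if bucket < rollout_percent else "sdk_off"


def crash_rate(crashed_sessions: int, total_sessions: int) -> float:
    """Fraction of sessions that crashed; 0.0 when there are no sessions."""
    return crashed_sessions / total_sessions if total_sessions else 0.0


def sdk_looks_unstable(on_rate: float, off_rate: float,
                       max_increase: float = 0.005) -> bool:
    """Flag the SDK if the 'sdk_on' cohort crashes noticeably more often."""
    return on_rate - off_rate > max_increase


# Example with made-up session counts for the two cohorts.
on_rate = crash_rate(crashed_sessions=20, total_sessions=1000)
off_rate = crash_rate(crashed_sessions=5, total_sessions=1000)
print(sdk_looks_unstable(on_rate, off_rate))
```

Hashing the user ID together with the experiment name keeps each user in the same cohort across launches, so crash data for the two groups stays comparable.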

Good SDKs should never crash apps, but the reality is that they occasionally do and the user experience can suffer tremendously when that happens. To make matters worse, customers are going to blame the apps rather than the tech giants responsible for the SDKs. Engineers must adopt defensive programming to guard apps against SDK bugs, keep users happy and support continued revenue growth.

James Smith is SVP of the Bugsnag Product Group at SmartBear
