Are SDKs Crashing Your Apps? Adopt Defensive Programming to Protect Against Outages

James Smith
SmartBear

In summer 2020, changes to a Facebook API triggered a series of major mobile app crashes worldwide. Popular iOS apps including Spotify, Pinterest, TikTok, Venmo, Tinder and DoorDash, among others, failed immediately upon being opened, leaving millions of users without access to their favorite services. However, the API wasn't at fault; Facebook's iOS software development kit (SDK) was responsible for the crash. The updated API simply exposed an existing (and until then, hidden) bug in Facebook's SDK that prevented apps from being able to authenticate and open.

Mobile apps rely heavily on SDKs from major tech platforms such as Google, Microsoft, Apple and Facebook. For instance, the majority of leading consumer apps have some kind of Facebook integration, such as "Log in with Facebook" or "Share on Facebook" features. These integrations typically go even further than just login or sharing features — developers also connect apps to Facebook to manage how those apps are advertised on the platform and view detailed audience data to optimize those ads. With all these links, consumer apps tend to be highly integrated with the Facebook SDK. As a result, any bug in that SDK can cause a total outage for these apps.

Several weeks before the Facebook SDK mishap, a similar situation unfolded involving the Google Maps SDK. Ridesharing and delivery apps are highly integrated with the Google Maps SDK to leverage its mapping capabilities. Due to a bug in the SDK, prominent apps like Lyft and GrubHub experienced significant outages across the globe.

Incidents like these two outages create a nightmare scenario for the companies whose apps were impacted, especially since consumers today have high expectations for mobile app performance and little tolerance for unstable apps. When an app repeatedly fails to launch, users become much more likely to delete that app from their device and may never download it again. For major consumer apps with massive user bases like Spotify or GrubHub, these app crashes can lead to millions of dollars in lost revenue.

In cases like these, an app team's first instinct is to look internally. Software engineers are used to their own coding errors causing crashes, so when something goes wrong, they'll first comb through their own code to identify the bug. This is a long and challenging process, especially for apps that have many different engineering teams working in silos. When an external SDK is the cause of the problem, these teams can spend hours searching their own code for a bug that isn't there.

Engineers must realize that software bugs in external SDKs cause app crashes more often than many expect. When an app outage impacts a broad segment of users, these teams must inspect their own code but also consider early on that an SDK could be responsible. Understanding this can save valuable time and resources and help get the app functioning again faster.

More importantly, engineers must also take proactive measures to protect their users' experience. Adopting defensive programming strategies can prevent SDK bugs from crashing their apps. Defensive programming is an approach to software development that anticipates and mitigates the impact of failing SDKs on apps. With this method, engineers build in capabilities that let their apps automatically adjust how they handle failures such as malformed data from outside servers.
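As a rough illustration of handling malformed data defensively, the sketch below validates a server payload before the app relies on it and falls back to safe defaults otherwise. The `LoginConfig` type, its field names and the JSON shape are all hypothetical, chosen only for this example; they do not describe any real SDK's wire format.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginConfig:
    """Settings the app expects from a (hypothetical) SDK backend."""
    login_enabled: bool = False
    button_label: str = "Log in"


# Safe values the app can always launch with.
DEFAULT_CONFIG = LoginConfig()


def parse_login_config(raw) -> LoginConfig:
    """Return a validated config, or the defaults if the payload is malformed."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return DEFAULT_CONFIG  # not valid JSON, or not text at all

    if not isinstance(data, dict):
        return DEFAULT_CONFIG  # valid JSON, but not the object we expected

    enabled = data.get("login_enabled")
    label = data.get("button_label")
    # Check types explicitly instead of trusting the server's response.
    if not isinstance(enabled, bool) or not isinstance(label, str):
        return DEFAULT_CONFIG
    return LoginConfig(login_enabled=enabled, button_label=label)
```

The design choice here is that every unexpected shape leads to the same known-good defaults, so a bad response costs the user one feature rather than the whole app.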

Feature flagging is key to defensive programming. One common technique uses feature flags to turn SDKs on or off remotely (also known as a "kill switch" capability). In the case of the faulty Facebook SDK, this would have allowed engineers to quickly turn off the malfunctioning SDK, so that apps simply skipped the Facebook initialization during launch. Similarly, engineers could have used feature flags to make apps revert to a default setting when Facebook's server responded with junk data. Either way, the apps would have opened and run properly.
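A kill switch of this kind might look something like the following minimal sketch. It assumes the flag values have already been fetched from a remote flag service and cached locally as a simple mapping; the `sdk.<name>.enabled` key format and the `init_optional_sdk` helper are hypothetical names made up for this example. On a real mobile platform some SDK failures are native crashes that cannot be caught as exceptions, which is why the remote flag matters in addition to the error handling.

```python
import logging
from typing import Callable, Mapping

logger = logging.getLogger("startup")


def init_optional_sdk(
    name: str,
    flags: Mapping[str, bool],
    initializer: Callable[[], None],
) -> bool:
    """Initialize a third-party SDK only if its remote flag allows it.

    Returns True when the SDK started, False when it was skipped or failed.
    """
    # Assumption: a missing flag means "enabled", so the switch only
    # takes effect when someone explicitly turns the SDK off.
    if not flags.get(f"sdk.{name}.enabled", True):
        logger.info("%s SDK disabled by kill switch; skipping", name)
        return False
    try:
        initializer()
        return True
    except Exception:
        # An SDK failure should cost one feature, not block app launch.
        logger.exception("%s SDK failed to initialize; continuing", name)
        return False
```

At launch, the app would call something like `init_optional_sdk("social", cached_flags, start_social_sdk)` and hide the related login or sharing buttons whenever it returns `False`.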

A/B testing is also an important component of defensive programming. Engineers can vet SDKs using A/B test flags to understand how an SDK impacts an app's stability. If the SDK appears to cause an app to crash often, then it probably shouldn't be used. With this sort of insight, engineers can determine whether they should integrate a certain SDK with an app.
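One way such a stability comparison could be set up is sketched below: users are assigned deterministically to a cohort that loads the SDK or one that does not, and the crash-free session rates of the two cohorts are compared. The cohort names, the `CohortStats` type and the session counts are hypothetical; in practice the numbers would come from a crash-reporting tool.

```python
import hashlib
from dataclasses import dataclass


def assign_cohort(user_id: str, experiment: str, treatment_percent: int = 50) -> str:
    """Deterministically place a user in 'with_sdk' or 'without_sdk'."""
    # Hashing keeps each user in the same cohort across app launches.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # 0..99
    return "with_sdk" if bucket < treatment_percent else "without_sdk"


@dataclass
class CohortStats:
    sessions: int
    crashed_sessions: int

    @property
    def crash_free_rate(self) -> float:
        """Share of sessions that ended without a crash."""
        if self.sessions == 0:
            return 1.0
        return 1 - self.crashed_sessions / self.sessions


def stability_regression(with_sdk: CohortStats, without_sdk: CohortStats) -> float:
    """Drop in crash-free session rate in the cohort that loads the SDK."""
    return without_sdk.crash_free_rate - with_sdk.crash_free_rate
```

For instance, 30 crashed sessions out of 1,000 with the SDK against 10 out of 1,000 without it gives a drop of about two percentage points. Whether a gap like that is acceptable, and whether it is statistically meaningful at a given sample size, is a judgment the team still has to make.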

Good SDKs should never crash apps, but the reality is that they occasionally do, and the user experience can suffer tremendously when that happens. To make matters worse, customers are going to blame the apps rather than the tech giants responsible for the SDKs. Engineers must adopt defensive programming to guard apps against SDK bugs, keep users happy and support continued revenue growth.

James Smith is SVP of the Bugsnag Product Group at SmartBear

