Skip to main content

Are SDKs Crashing Your Apps? Adopt Defensive Programming to Protect Against Outages

James Smith
SmartBear

In summer 2020, changes to a Facebook API triggered a series of major mobile app crashes worldwide. Popular iOS apps including Spotify, Pinterest, TikTok, Venmo, Tinder and DoorDash, among others, failed immediately upon being opened, leaving millions of users without access to their favorite services. However, the API wasn't at fault, it was actually Facebook's iOS software development kit (SDK) that was responsible for the crash. The updated API simply exposed users to an existing (and until then, hidden) bug in Facebook's SDK that prevented apps from being able to authenticate and open.

Mobile apps rely heavily on SDKs from major tech platforms such as Google, Microsoft, Apple and Facebook. For instance, the majority of leading consumer apps have some kind of Facebook integration, such as "Log in with Facebook" or "Share on Facebook" features. These integrations typically go even further than just login or sharing features — developers also connect apps to Facebook to manage how those apps are advertised on the platform and view detailed audience data to optimize those ads. With all these links, consumer apps tend to be highly integrated with the Facebook SDK. As a result, any bug in that SDK can cause a total outage for these apps.

Several weeks before the Facebook SDK mishap, a similar situation unfolded involving the Google Maps SDK. Ridesharing and delivery apps are highly integrated with the Google Maps SDK to leverage its mapping capabilities. Due to a bug in the SDK, prominent apps like Lyft and GrubHub experienced significant outages across the globe.

Incidents like these two outages create a nightmare scenario for the companies whose apps were impacted. Especially since consumers today have high expectations for mobile app performance and little tolerance for unstable apps. When an app repeatedly fails to launch, users become much more likely to delete that app from their device and will possibly never download it again. For major consumer apps with massive user bases like Spotify or GrubHub, these app crashes can lead to millions of dollars in lost revenue.

In cases like these, an app team's first instinct is to look internally. Software engineers are used to their own coding errors causing crashes, so when something goes wrong, they'll first comb through their own code to identify the bug. This is a long and challenging process, especially for apps that have many different engineering teams working in silos. When an external SDK is the cause of the problem, these teams will fruitlessly spend hours trying and failing to locate the bug.

Engineers must realize that software bugs in external SDKs cause app crashes more often than MANY expect. When an app outage impacts a broad segment of users, in addition to inspecting their own code, these teams must also consider early on that an SDK could be responsible. Understanding this can save valuable time and resources and help get the app functioning again faster.

More importantly, engineers must also take proactive measures to protect their users' experience. Adopting defensive programming strategies can prevent SDK bugs from crashing their apps. Defensive programming is an approach to software development that anticipates and mitigates the impact of failing SDKs on apps. With this method, engineers incorporate capabilities that allow their apps to automatically change how they handle malformed data from outside servers.

Feature flagging is a key to defensive programming. One common technique uses feature flags to remotely turn on or off SDKs (also known as a "kill switch" capability). In the case of the faulty Facebook SDK, this would have allowed engineers to quickly turn off the malfunctioning SDK. With the SDK off, apps would have simply skipped the Facebook initialization during launch, ensuring they would have opened and ran properly. Similarly, engineers could have also used feature flags to customize apps to revert to a default setting when Facebook's server responded with junk data. Either way, the apps would have opened and ran properly.

A/B testing is also an important component of defensive programming. Engineers can vet SDKs using A/B test flags to understand how an SDK impacts an app's stability. If the SDK appears to cause an app to crash often, then it probably shouldn't be used. With this sort of insight, engineers can determine whether they should integrate a certain SDK with an app.

Good SDKs should never crash apps, but the reality is that they occasionally do and the user experience can suffer tremendously when that happens. To make matters worse, customers are going to blame the apps rather than the tech giants responsible for the SDKs. Engineers must adopt defensive programming to guard apps against SDK bugs, keep users happy and support continued revenue growth.

James Smith is SVP of the Bugsnag Product Group at SmartBear

Hot Topics

The Latest

Significant improvements in operational resilience, more effective use of automation and faster time to market are driving optimism about IT spending in 2025, with a majority of leaders expecting their budgets to increase year-over-year, according to the 2025 State of Digital Operations Report from PagerDuty ...

Image
PagerDuty

Are they simply number crunchers confined to back-office support, or are they the strategic influencers shaping the future of your enterprise? The reality is that data analysts are far more the latter. In fact, 94% of analysts agree their role is pivotal to making high-level business decisions, proving that they are becoming indispensable partners in shaping strategy ...

Today's enterprises exist in rapidly growing, complex IT landscapes that can inadvertently create silos and lead to the accumulation of disparate tools. To successfully manage such growth, these organizations must realize the requisite shift in corporate culture and workflow management needed to build trust in new technologies. This is particularly true in cases where enterprises are turning to automation and autonomic IT to offload the burden from IT professionals. This interplay between technology and culture is crucial in guiding teams using AIOps and observability solutions to proactively manage operations and transition toward a machine-driven IT ecosystem ...

Gartner identified the top data and analytics (D&A) trends for 2025 that are driving the emergence of a wide range of challenges, including organizational and human issues ...

Traditional network monitoring, while valuable, often falls short in providing the context needed to truly understand network behavior. This is where observability shines. In this blog, we'll compare and contrast traditional network monitoring and observability — highlighting the benefits of this evolving approach ...

A recent Rocket Software and Foundry study found that just 28% of organizations fully leverage their mainframe data, a concerning statistic given its critical role in powering AI models, predictive analytics, and informed decision-making ...

What kind of ROI is your organization seeing on its technology investments? If your answer is "it's complicated," you're not alone. According to a recent study conducted by Apptio ... there is a disconnect between enterprise technology spending and organizations' ability to measure the results ...

In today’s data and AI driven world, enterprises across industries are utilizing AI to invent new business models, reimagine business and achieve efficiency in operations. However, enterprises may face challenges like flawed or biased AI decisions, sensitive data breaches and rising regulatory risks ...

In MEAN TIME TO INSIGHT Episode 12, Shamus McGillicuddy, VP of Research, Network Infrastructure and Operations, at EMA discusses purchasing new network observability solutions.... 

There's an image problem with mobile app security. While it's critical for highly regulated industries like financial services, it is often overlooked in others. This usually comes down to development priorities, which typically fall into three categories: user experience, app performance, and app security. When dealing with finite resources such as time, shifting priorities, and team skill sets, engineering teams often have to prioritize one over the others. Usually, security is the odd man out ...

Image
Guardsquare

Are SDKs Crashing Your Apps? Adopt Defensive Programming to Protect Against Outages

James Smith
SmartBear

In summer 2020, changes to a Facebook API triggered a series of major mobile app crashes worldwide. Popular iOS apps including Spotify, Pinterest, TikTok, Venmo, Tinder and DoorDash, among others, failed immediately upon being opened, leaving millions of users without access to their favorite services. However, the API wasn't at fault, it was actually Facebook's iOS software development kit (SDK) that was responsible for the crash. The updated API simply exposed users to an existing (and until then, hidden) bug in Facebook's SDK that prevented apps from being able to authenticate and open.

Mobile apps rely heavily on SDKs from major tech platforms such as Google, Microsoft, Apple and Facebook. For instance, the majority of leading consumer apps have some kind of Facebook integration, such as "Log in with Facebook" or "Share on Facebook" features. These integrations typically go even further than just login or sharing features — developers also connect apps to Facebook to manage how those apps are advertised on the platform and view detailed audience data to optimize those ads. With all these links, consumer apps tend to be highly integrated with the Facebook SDK. As a result, any bug in that SDK can cause a total outage for these apps.

Several weeks before the Facebook SDK mishap, a similar situation unfolded involving the Google Maps SDK. Ridesharing and delivery apps are highly integrated with the Google Maps SDK to leverage its mapping capabilities. Due to a bug in the SDK, prominent apps like Lyft and GrubHub experienced significant outages across the globe.

Incidents like these two outages create a nightmare scenario for the companies whose apps were impacted. Especially since consumers today have high expectations for mobile app performance and little tolerance for unstable apps. When an app repeatedly fails to launch, users become much more likely to delete that app from their device and will possibly never download it again. For major consumer apps with massive user bases like Spotify or GrubHub, these app crashes can lead to millions of dollars in lost revenue.

In cases like these, an app team's first instinct is to look internally. Software engineers are used to their own coding errors causing crashes, so when something goes wrong, they'll first comb through their own code to identify the bug. This is a long and challenging process, especially for apps that have many different engineering teams working in silos. When an external SDK is the cause of the problem, these teams will fruitlessly spend hours trying and failing to locate the bug.

Engineers must realize that software bugs in external SDKs cause app crashes more often than MANY expect. When an app outage impacts a broad segment of users, in addition to inspecting their own code, these teams must also consider early on that an SDK could be responsible. Understanding this can save valuable time and resources and help get the app functioning again faster.

More importantly, engineers must also take proactive measures to protect their users' experience. Adopting defensive programming strategies can prevent SDK bugs from crashing their apps. Defensive programming is an approach to software development that anticipates and mitigates the impact of failing SDKs on apps. With this method, engineers incorporate capabilities that allow their apps to automatically change how they handle malformed data from outside servers.

Feature flagging is a key to defensive programming. One common technique uses feature flags to remotely turn on or off SDKs (also known as a "kill switch" capability). In the case of the faulty Facebook SDK, this would have allowed engineers to quickly turn off the malfunctioning SDK. With the SDK off, apps would have simply skipped the Facebook initialization during launch, ensuring they would have opened and ran properly. Similarly, engineers could have also used feature flags to customize apps to revert to a default setting when Facebook's server responded with junk data. Either way, the apps would have opened and ran properly.

A/B testing is also an important component of defensive programming. Engineers can vet SDKs using A/B test flags to understand how an SDK impacts an app's stability. If the SDK appears to cause an app to crash often, then it probably shouldn't be used. With this sort of insight, engineers can determine whether they should integrate a certain SDK with an app.

Good SDKs should never crash apps, but the reality is that they occasionally do and the user experience can suffer tremendously when that happens. To make matters worse, customers are going to blame the apps rather than the tech giants responsible for the SDKs. Engineers must adopt defensive programming to guard apps against SDK bugs, keep users happy and support continued revenue growth.

James Smith is SVP of the Bugsnag Product Group at SmartBear

Hot Topics

The Latest

Significant improvements in operational resilience, more effective use of automation and faster time to market are driving optimism about IT spending in 2025, with a majority of leaders expecting their budgets to increase year-over-year, according to the 2025 State of Digital Operations Report from PagerDuty ...

Image
PagerDuty

Are they simply number crunchers confined to back-office support, or are they the strategic influencers shaping the future of your enterprise? The reality is that data analysts are far more the latter. In fact, 94% of analysts agree their role is pivotal to making high-level business decisions, proving that they are becoming indispensable partners in shaping strategy ...

Today's enterprises exist in rapidly growing, complex IT landscapes that can inadvertently create silos and lead to the accumulation of disparate tools. To successfully manage such growth, these organizations must realize the requisite shift in corporate culture and workflow management needed to build trust in new technologies. This is particularly true in cases where enterprises are turning to automation and autonomic IT to offload the burden from IT professionals. This interplay between technology and culture is crucial in guiding teams using AIOps and observability solutions to proactively manage operations and transition toward a machine-driven IT ecosystem ...

Gartner identified the top data and analytics (D&A) trends for 2025 that are driving the emergence of a wide range of challenges, including organizational and human issues ...

Traditional network monitoring, while valuable, often falls short in providing the context needed to truly understand network behavior. This is where observability shines. In this blog, we'll compare and contrast traditional network monitoring and observability — highlighting the benefits of this evolving approach ...

A recent Rocket Software and Foundry study found that just 28% of organizations fully leverage their mainframe data, a concerning statistic given its critical role in powering AI models, predictive analytics, and informed decision-making ...

What kind of ROI is your organization seeing on its technology investments? If your answer is "it's complicated," you're not alone. According to a recent study conducted by Apptio ... there is a disconnect between enterprise technology spending and organizations' ability to measure the results ...

In today’s data and AI driven world, enterprises across industries are utilizing AI to invent new business models, reimagine business and achieve efficiency in operations. However, enterprises may face challenges like flawed or biased AI decisions, sensitive data breaches and rising regulatory risks ...

In MEAN TIME TO INSIGHT Episode 12, Shamus McGillicuddy, VP of Research, Network Infrastructure and Operations, at EMA discusses purchasing new network observability solutions.... 

There's an image problem with mobile app security. While it's critical for highly regulated industries like financial services, it is often overlooked in others. This usually comes down to development priorities, which typically fall into three categories: user experience, app performance, and app security. When dealing with finite resources such as time, shifting priorities, and team skill sets, engineering teams often have to prioritize one over the others. Usually, security is the odd man out ...

Image
Guardsquare