It's been all over the news the last few months. After two fatal crashes, Boeing was forced to ground its 737 MAX. The troubled model is now undergoing extensive testing to get it back into service and production. You can almost cut the anticipation with a knife. Wall Street, the airline industry, future passengers and the manufacturer itself all want to be able to rest easy knowing that every Boeing plane is back in service.
In the interim, the manufacturer has taken a very serious hit. Its stock price plummeted. Consumer confidence in its safety hit an all-time low. It all boils down to a series of software problems, and it will take new and improved updates to get the models back into the sky.
The airline/aerospace industry isn't the first or the last to come face-to-face with software flaws. It's pervasive. The big question is who's next? Automotive? Retail banking? All are plausible. This is a line that no one wants to be first in.
Why does it continue to happen? And more importantly, how can it be avoided?
Large organizations often tell stakeholders that even though all software goes through extensive testing, this type of thing “just happens.” The old saying “to err is human” is the scapegoat. But that is exactly the problem. While the human component of application development and testing won't go away, it can be eased and supplemented by far more efficient and automated methods to proactively determine software health and identify flaws.
Gaining insight into software health also shows how secure applications are. A recent Software Intelligence Report from CAST found 28% of businesses rely on “instinct” or their architects to assess potential IT risks. However, being in the dark about software robustness can leave organizations vulnerable, so they need to understand where the weaknesses are before it's too late, using Software Intelligence to find the biggest threats.
Just like a doctor doesn't diagnose a broken arm without an x-ray, a business shouldn't rely on human assessments alone to diagnose software issues.
Routine Checks, Spot Fixes and Physicals
The good news is that, with a few tweaks, software health assessments can become much more effective and preventative. This can be achieved by breaking up your software health checks into three categories: routine checks, spot fixes and physicals. With this strategy, weaknesses can be detected quickly, especially if the software is scanned on a regular basis. This will help identify and catch the biggest issues.
For routine checks, which should occur monthly, the focus should be on removing more defects than were added, identifying the most common defects and asking, “Do we know how to avoid the obvious flaws?” Identifying what a bad practice is teaches developers not just about weaknesses but how to avoid them. In addition, change velocity should be relatively constant; software releases with massive changes in functionality tend to cause concern. Defect density should also never creep up.
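The monthly check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `MonthlySnapshot` fields, the `routine_check` function and the 50% velocity tolerance are hypothetical names and values chosen for the example, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class MonthlySnapshot:
    """Hypothetical monthly measurements for one application."""
    month: str
    defects_added: int
    defects_removed: int
    open_defects: int
    kloc: float          # size in thousands of lines of code
    lines_changed: int   # rough proxy for change velocity


def defect_density(snapshot: MonthlySnapshot) -> float:
    """Open defects per thousand lines of code."""
    return snapshot.open_defects / snapshot.kloc


def routine_check(prev: MonthlySnapshot, curr: MonthlySnapshot,
                  velocity_tolerance: float = 0.5) -> list[str]:
    """Return the findings for this month; an empty list means healthy."""
    findings = []
    # The goal is to remove more defects than were added.
    if curr.defects_removed <= curr.defects_added:
        findings.append("removed no more defects than were added")
    # Defect density should never creep up month over month.
    if defect_density(curr) > defect_density(prev):
        findings.append("defect density increased")
    # Change velocity should stay relatively constant (assumed tolerance).
    if prev.lines_changed > 0:
        shift = abs(curr.lines_changed - prev.lines_changed) / prev.lines_changed
        if shift > velocity_tolerance:
            findings.append("change velocity shifted sharply")
    return findings
```

Feeding last month's and this month's snapshots into `routine_check` yields a short list of findings to raise with the team; the thresholds would need tuning to each application's history.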
Spot fixes are frequent but can tell you a lot about a specific problem. Trouble tickets provided by customers or users can let you know specifics: did it crash, was it slow, did it lock up? Knowing a specific pain and developing a plan to treat it will create real data that can improve metrics and identify issues such as performance against the defects in a module or method, machine reboots caused by memory leaks, or security breaches. In addition, this data can be combined with cost and hour data to develop a better prediction on staffing and usage.
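As a rough sketch of how trouble tickets can become data, the Python below tallies tickets by module and symptom so recurring pain points stand out. The ticket fields (`module`, `symptom`) and the function names are assumptions made for this example; a real ticketing system will have its own schema.

```python
from collections import Counter


def tally_tickets(tickets: list[dict]) -> Counter:
    """Count tickets per (module, symptom) pair."""
    counts = Counter()
    for ticket in tickets:
        counts[(ticket["module"], ticket["symptom"])] += 1
    return counts


def worst_offenders(tickets: list[dict], top: int = 3) -> list[tuple]:
    """Return the most frequently reported (module, symptom) pairs."""
    return tally_tickets(tickets).most_common(top)


# Hypothetical tickets: did it crash, was it slow, did it lock up?
tickets = [
    {"module": "billing", "symptom": "crash"},
    {"module": "billing", "symptom": "crash"},
    {"module": "billing", "symptom": "crash"},
    {"module": "search", "symptom": "slow"},
    {"module": "search", "symptom": "slow"},
    {"module": "login", "symptom": "lockup"},
]

for (module, symptom), count in worst_offenders(tickets):
    print(f"{module}: {symptom} reported {count} time(s)")
```

The same tallies could later be joined with cost and hour data per module, which is where the staffing predictions mentioned above would come from.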
Finally, the annual physical. Look for trends in key data from the same point each year. For example, was there an increase in complexity? Is the application getting harder to maintain? Has the defect density increased or decreased? Are the lines of code or number of transactions increasing? Increases like these can signal less experienced coders and raise the risk of defects.
Application maintenance is the responsibility of every IT department, but understanding software health (whether it's secure, efficient and resilient) is the most vital part of ensuring that even a minor update doesn't cause a ripple effect across the whole organization and generate unintended consequences, like what happened to Boeing.
Better software intelligence processes to determine health can forewarn a business about risk, and these three checkups should be a part of maintaining every application over time. All of the data should also be captured in a software health dashboard that tracks progress and can provide a quick glance at health in terms of robustness, efficiency, security, changeability, transferability and quality. A dashboard not only gives fast facts about the evolution of the software, but it can also give insight into where you are at highest risk and provide trend analysis to benchmark over time.
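A year-over-year comparison of this kind can be sketched as follows. The metric names and the 10% threshold are illustrative assumptions, not recommended values; the point is only that the same measurements are taken at the same point each year and compared.

```python
from typing import Optional


def yearly_trends(last_year: dict[str, float],
                  this_year: dict[str, float]) -> dict[str, Optional[float]]:
    """Percent change per metric between two annual measurements."""
    trends = {}
    for metric, old in last_year.items():
        new = this_year[metric]
        # A zero baseline has no meaningful percent change.
        trends[metric] = (new - old) / old * 100 if old else None
    return trends


def flag_increases(trends: dict[str, Optional[float]],
                   threshold_pct: float = 10.0) -> list[str]:
    """Metrics that grew by more than the (assumed) threshold."""
    return sorted(
        metric for metric, pct in trends.items()
        if pct is not None and pct > threshold_pct
    )


# Hypothetical measurements taken at the same point in two consecutive years.
last_year = {"complexity": 20.0, "defect_density": 0.5, "lines_of_code": 100000.0}
this_year = {"complexity": 25.0, "defect_density": 0.5, "lines_of_code": 105000.0}

print(flag_increases(yearly_trends(last_year, this_year)))
```

A flagged metric is a prompt for investigation, not a verdict: growth in lines of code, for example, may reflect new features as easily as declining code quality.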
All developers should remember that it's impossible to retrofit stability and trust into an application. It has to be designed and engineered in, or the erosion sets in and your business can jump the queue and become the next Boeing.