As Digital Transformation Prevails, Automation Remains a Top Priority for DevOps, ITOps and SRE Teams
June 27, 2022

Jessica Abelson

Hybrid work adoption and the accelerated pace of digital transformation are driving an increasing need for automation and site reliability engineering (SRE) practices, according to new research.

In a new survey of 1,046 engineering, IT operations, DevOps and site reliability engineering professionals in the United States, almost half of respondents (48.2%) said automation is a way to decrease mean time to resolution/repair (MTTR) and improve service management. Respondents held VP, Director, Manager or individual contributor roles at organizations with more than 300 employees.

The second annual State of DevOps Automation Report, commissioned by Transposit, also revealed that close to 60% of organizations are losing up to half a million dollars per hour to downtime, a critical issue that can be mitigated with better automation and collaboration.

Organizations Still Lack Full Integration of Incident Response Tools

With 90.2% of organizations reporting an increased focus on digital transformation over the past year, paired with the persistence of hybrid and remote work, almost three-quarters (73.4%) of operations teams have expanded their tech stack. However, when asked how well integrated their incident response tools are, only about one quarter (24.7%) said all of their tools are integrated through one tool or platform. This means the vast majority (75.3%) don't have full integration, leaving teams at risk of slower issue detection and analysis, lower service reliability and a worse customer experience.

Broader deployment of automation has led developers to recognize that it's key to reducing downtime and speeding up resolution. Three in four organizations that implemented a continuous incident response workflow for service management after adopting a hybrid workforce model saw this benefit.

Manual Processes Are Outdated and Lead to Higher Cost of Downtime and Service Incident Volume

The survey also found that nearly 40% of organizations (39.7%) saw their cost of downtime increase during the last year (March 2021 to now). In fact, 58.2% reported that downtime (i.e., application outages, service degradation) cost their organization up to $499,999 per hour on average. Of those who reported an increase in the time it takes to resolve incidents, 45.2% attributed it to a lack of unified communication with teammates (people are collaborating using disparate tools).
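To make the relationship between resolution time and downtime cost concrete, here is a minimal sketch. The incident timestamps are made up for illustration, the function names `mttr_hours` and `downtime_cost` are hypothetical, and the hourly figure is simply the upper bound of the survey's "up to $499,999 per hour" bracket. None of it comes from the report's data.

```python
from datetime import datetime

def incident_hours(opened, resolved):
    """Duration of one incident in hours."""
    return (resolved - opened).total_seconds() / 3600

def mttr_hours(incidents):
    """Mean time to resolution: average duration across all incidents."""
    durations = [incident_hours(o, r) for o, r in incidents]
    return sum(durations) / len(durations)

def downtime_cost(incidents, hourly_cost):
    """Total downtime cost, assuming a flat cost per hour of downtime."""
    return sum(incident_hours(o, r) for o, r in incidents) * hourly_cost

# Hypothetical incidents as (opened, resolved) pairs.
incidents = [
    (datetime(2022, 3, 1, 9, 0), datetime(2022, 3, 1, 10, 30)),     # 1.5 hours
    (datetime(2022, 4, 12, 14, 0), datetime(2022, 4, 12, 14, 30)),  # 0.5 hours
]

print(mttr_hours(incidents))              # average hours per incident
print(downtime_cost(incidents, 499_999))  # total cost at the bracket's upper bound
```

Under the flat-rate assumption, total cost scales directly with total downtime, so anything that shortens resolution time reduces cost in proportion.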

"Organizations need to deliver innovation faster and more efficiently than ever before. However, too many SRE, ITOps and DevOps teams are wasting time on disconnected, manual processes and playing a reactive game of whack-a-mole as they try to keep applications running," said Divanny Lamas, CEO of Transposit.

Operations teams face several challenges when trying to resolve incidents, including difficulty reaching people with specialized knowledge, inadequate support from collaboration methods and tools, and a lack of automation. When asked whether they have observed any change in the frequency of service incidents affecting their customers over the last year (March 2021 to now), 62.9% of respondents reported an increase. Those respondents said the top reasons for the increase are digital transformation (60.7%), rollouts of new products or product updates (55.1%), collaboration methods and tools that did not adequately support their remote team (49.3%) and organizational change, including team member churn, an influx of new team members and M&A activity (45.4%).

The Key to Faster Resolution of Incidents and Less Downtime: SRE Practices Combined with Automation

The rising demand for site reliability engineering is clear, as 75.6% of respondents said there has been an increased focus on SRE practices in their organization in the past 12 months, and of those, 35.1% plan to expand SRE efforts in 2022. Additionally, 65.1% of respondents plan to hire site reliability engineers in the next 12 months.

As organizations increase their focus on site reliability practices, SREs themselves see a need for better automation tools: 42.3% of SREs said the current level of automation is not meeting their organization's needs and they are actively pursuing a new solution to close the gap.

SREs are still dealing with cumbersome and tedious processes, despite the increased demand for SRE practices. Over half of SREs (56.5%) reported they still manually enter data into an ITSM system or other system of record to keep track of actions taken by humans during the resolution of an incident.

To scale, organizations need to implement automation technology to rid teams of these time-consuming manual processes. This is underlined by the fact that a full 100% of the respondents with a VP/Director/Manager SRE title who cited a decrease or no change in service incidents said it was because their organization implemented automation technology to help reduce the number of service incidents. Respondents also said better documentation, process and availability of data during incidents would have the most impact on MTTR, downtime and quality of service reliability.

As seen in the survey, organizations' approaches to automation differ. A majority (63%) responded that their approach to automation was incremental automation, in which they begin by codifying processes and work up to more advanced, fully automated scenarios. When asked whether automation should let humans use their judgment at critical decision points to be more reliable and effective, 80.4% of respondents said yes. Automation that keeps humans in the loop at key decision points increases flexibility and stability while automating repetitive tasks.
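One way to picture human-in-the-loop, incremental automation is a runbook in which routine steps run automatically while designated steps pause for a person's decision. The sketch below illustrates that idea only. It is not Transposit's product or any particular tool's API, and the `Step` class, the `run_runbook` function and the step names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Step:
    name: str
    action: Callable[[], str]      # the automated work; returns a short result
    needs_approval: bool = False   # True marks a critical decision point

def run_runbook(steps: List[Step], approve: Callable[[Step], bool]) -> List[str]:
    """Run steps in order, asking `approve` before any step that needs it.

    The runbook halts at the first step a human declines.
    """
    log = []
    for step in steps:
        if step.needs_approval and not approve(step):
            log.append(f"{step.name}: halted (not approved)")
            break
        log.append(f"{step.name}: {step.action()}")
    return log

# Hypothetical runbook: only the restart requires a human decision.
runbook = [
    Step("collect diagnostics", lambda: "done"),
    Step("restart service", lambda: "done", needs_approval=True),
    Step("post status update", lambda: "done"),
]

# In practice `approve` might prompt an on-call engineer in chat;
# here a stand-in callback approves everything.
for line in run_runbook(runbook, approve=lambda step: True):
    print(line)
```

A team taking the incremental approach could start with every step marked `needs_approval=True` and relax that flag one step at a time as confidence in each step grows.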

The top three tasks respondents would like automated are service requests (52.6%), change requests (42.9%) and user provisioning (39.8%). Organizations are seeing the need to double down on automation: the top three ways they plan to improve their incident management process are to implement new automation tools or applications (48.2%), new communications/collaboration tools or applications (41.5%) and new integration tools or applications (40.6%).

The survey makes it clear that ITOps, DevOps and SRE professionals should consider enhancing service reliability through human-in-the-loop automation, SRE practices and better collaboration methods. Teams equipped with these tools and process improvements can spend more of their time on delivering innovation and competitive advantages, and ultimately on creating more business value.

Jessica Abelson is Director of Product Marketing at Transposit
