Remote Work and Digital Transformation Exacerbate Challenges of Managing the Modern Stack

Ed Sawma
Transposit

An independent survey of more than 500 IT Operations, DevOps, and Site Reliability Engineering (SRE) professionals, commissioned by Transposit for its inaugural State of DevOps Automation Report, uncovered a growing need for process automation. The need stems from the confluence of digital transformation initiatives and the remote/hybrid work policies brought on by the pandemic.

More than half of respondents named a lack of automation as the most common challenge when taking action to resolve an incident. The combined pressure of digital transformation and remote work means that ITOps and software engineering teams, including DevOps and SREs, face increasing complexity in their work, which will lead to significantly more strain and application downtime unless preventive measures are taken.

Service Incidents and Remediation in a Pandemic-Influenced World

The vast majority of organizations surveyed have adopted remote/hybrid work policies and expanded digital transformation initiatives since the start of the pandemic. At the same time, many have been hampered by longer incident resolution, inefficient processes, and a lack of automation.

The acceleration in digital transformation has resulted in an uptick in service incidents, putting a heavier burden on DevOps, SRE, and IT teams. The survey found that 9 out of 10 organizations experienced an increase in service incidents that have affected their customers since the start of the pandemic, with nearly 60% of respondents observing an increase of at least 20%. Most (93%) said incidents were taking longer to resolve while working remotely, and nearly 70% saw an increase in the cost of downtime since the pandemic began.

The survey results indicate that these findings stem from several factors. First, most organizations still rely on manual, repetitive DevOps processes that cause unnecessary toil.

They're also investing precious resources in building custom in-house tools, which burdens all parts of the software stack, when those resources could instead go toward product innovation or customer service initiatives.

Still, organizations are motivated to get the right tools, processes, and reliable automation in place to keep pace with innovation and decrease mean time to resolution (MTTR). The majority of respondents believed that systematically mining insights from human data (such as archived Slack communications, postmortem interviews, and group feedback) could both improve future incident response and fuel operational excellence.

The Growing Popularity of Site Reliability Engineering

SREs are essential to any organization for solving infrastructure and operational problems — and they're going mainstream. In fact, an overwhelming 94% of respondents increased focus on SRE practices in their organization in the past 12 months and 86% of organizations are planning to hire SREs in the next 12 months. While these numbers are high, they're not surprising when considering how engineering and operations teams are being stretched to the limit. Investments in automation are a natural reaction to these circumstances.

Even if organizations do not have formal SRE roles, ITOps teams are adopting SRE practices. Almost all respondents (98%) with the "VP/Director/Manager IT Operations" role increased their focus on SRE practices in their organization in the past 12 months, while 62.4% of IT Operations respondents plan to expand SRE efforts in 2021.

SREs are critical contributors to incident resolution and help teams work with complex distributed systems at scale. However, nearly 80% of respondents said individuals responsible for reliability engineering are experiencing challenges while trying to solve incidents as they are occurring.

Automation Drivers and Barriers

A key takeaway from the study is that automation is a highly valuable tool for engineering operations. Although the benefits of automation are known, nearly half of respondents reported that their engineering operations are only 26-50% automated. Just over half (51.9%) cited inadequate documentation of institutional knowledge and existing processes as a barrier, followed by a lack of clarity about what to automate (47.3%) and gaps in knowledge sharing (43.8%).

Organizations are still draining resources, time, and money on manual tasks during incident response, but they're aware something needs to change. This is evidenced by the 40% of organizations that have one or more full-time engineers working on custom in-house tools or bots for automating incident response.

Most commercially available automation solutions use the "automate everything" approach and do not incorporate human-in-the-loop automation, which helps explain this finding. And humans aren't going anywhere: the research revealed that 9 out of 10 respondents believe automation should let humans use their judgment at critical decision points to be more reliable and effective.

One simple yet effective beachhead for moving automation forward is documentation. Combining automated process documentation that keeps humans in the loop with actionable data on how to operate systems during and between incidents can improve MTTR, enhance service reliability, streamline operations, and lower the cost of downtime.

Ed Sawma is VP of Marketing at Transposit
