
Site Reliability Engineering (SRE) is the Force Multiplier of Digital Experiences

Colin Fallwell
Sumo Logic

The pandemic spurred a wave of digital services, which allowed companies to stay competitive amid digital transformation. This trend, in turn, led companies to adopt site reliability engineering (SRE) to keep up with customer demand for digital experiences.

DevOps Institute recently published the Global SRE Pulse 2022, which highlights the growing adoption of SRE as a central operating model for delivering digital services and applications.

Although more than 62% of respondents say their organizations use SRE today, the survey shows that organizations are at different stages of SRE adoption. Only 1% of respondents report that they tried SRE and that it did not work for their company.

SRE is now an essential engineering practice for enterprises seeking to accelerate their digital transformation into digital-first brands. So how can companies empower SREs and adopt the model across their entire IT organizations to improve digital experiences and, ultimately, the business? It starts with addressing the workforce gap and then breaking down team silos.

Closing the Skills Gap

The biggest challenge when adopting SRE is finding people with the right skills to make SRE work properly: 85% of respondents cited the lack of staff with the necessary skills as their biggest challenge.

Leaders can address skill gaps by training talent and promoting within the organization. It's important to not only look at the technical skills but also at a candidate's ability to see and advocate for the relationship between engineering and business.

It's also essential to implement automation solutions to reduce the manual work of resolving priority alerts. It's not just a matter of implementing technology, though. Teams must also update processes to ensure the technology is used by everyone, including those who resist AIOps and automation.

The survey found that some teams are implementing intelligent automation everywhere to ensure the reliability and continuous operation of systems. Specifically, 29% of respondents said they are currently leveraging observability tools and techniques.

One method of advancing automation is chaos engineering, which includes intentionally destroying and rebuilding environments to improve both hygiene and confidence. However, 43% of survey respondents said they're not applying chaos engineering at all, so there is significant opportunity for those willing to learn the skills.

SRE Best Practices Can Unify Teams

Siloed teams are another common challenge for organizations. Communication barriers and dependencies between teams delay responses and innovation. SREs can bridge the gap between IT and developers if leaders first implement these SRE best practices across teams.

Track and manage toil. Toil is work that is manual, repetitive, automatable, tactical, or devoid of enduring value, and it scales linearly as a service grows. In the survey, 66% of respondents said they measure toil in some or several teams, and 11% indicated they track toil everywhere. By measuring toil, SREs can proactively reduce its effects across teams to improve reliability.
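As a rough illustration of what measuring toil can look like, here is a minimal Python sketch that computes the share of logged hours each team spends on toil. The team names, the "toil" and "engineering" category labels, the hours, and the `toil_share` helper are all hypothetical; none of them come from the survey.

```python
from collections import defaultdict

# Hypothetical work-log entries: (team, category, hours).
# The teams, category labels, and hours are illustrative assumptions.
WORK_LOG = [
    ("payments", "toil", 12.0),
    ("payments", "engineering", 28.0),
    ("search", "toil", 5.0),
    ("search", "engineering", 35.0),
]


def toil_share(entries):
    """Return the fraction of logged hours spent on toil for each team."""
    toil = defaultdict(float)
    total = defaultdict(float)
    for team, category, hours in entries:
        total[team] += hours
        if category == "toil":
            toil[team] += hours
    # Skip teams with no logged hours to avoid dividing by zero.
    return {team: toil[team] / total[team] for team in total if total[team] > 0}


if __name__ == "__main__":
    for team, share in sorted(toil_share(WORK_LOG).items()):
        print(f"{team}: {share:.0%} of logged hours were toil")
```

Tracking a simple ratio like this per team over time is one way to see whether automation work is actually reducing toil.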

Provide ongoing support. Organizations also report implementing the following SRE best practices across all teams:

- Adopting observability and monitoring tools (29%)
- Supporting essential job certifications (27%)
- Practicing a no-blame philosophy (36%)

The two practices most widely adopted, at least to some extent, were practicing no blame (92%) and holding retrospectives or post-mortems (95%). The philosophy of learning from failure is what drives SRE success in many organizations.

Looking into the Future of SRE

Overall, the level of maturity revealed by the Global SRE Pulse survey indicates that many organizations are invested in improving SRE and making it part of their processes and cultures.

With 37% of organizations reporting that they have centralized SRE teams, it appears the practices and topologies are still evolving. But the foundation for SRE is solid, and business leaders can expect SRE to remain a fixture in the industry. Beyond that, SRE also has the opportunity to be a unifying force between IT and business departments. By partnering with business and development teams, SRE will have the ability to influence and improve business outcomes.

Colin Fallwell is Field CTO of Sumo Logic

