SRE is quickly becoming the standard for IT transformation journeys. According to DevOps Institute's Global SRE Pulse 2022, 62% of organizations across the globe are adopting SRE in various ways (55% in specific teams, services or products; 19% across the entire organization; and 23% as a pilot).
To succeed with SRE, organizations must also transform their culture and the human side of how they work. This cultural shift and new way of thinking must happen across IT and the business. The Global SRE Pulse 2022 report offers a deep look into the current state of SRE and the trends shaping it now and in the future. With more than 460 survey responses from SRE professionals at organizations of all sizes, we've identified four top takeaways from the Global SRE Pulse report:
SRE is an Essential Engineering Function for Digital Transformation
SRE enhances collaboration between development and operations. The outcome is more reliable systems, services and applications, which increases the business value of those services and applications and improves the relationship between IT and the business. As software stacks grow more complex, organizations look to SRE to establish better collaboration between development and operations teams while continuously improving the reliability and health of applications and services, ultimately optimizing the customer experience.
The survey asked where SRE is currently leveraged: the software the company builds or the set of services SRE teams interact with. Fifty-six percent (56%) of respondents said they leverage SRE for operating their Systems of Engagement (SOE) and 42% for their Systems of Record (SOR). Of the survey respondents, 52% who have adopted SRE described their company as a leader across customer experience, product quality, offerings, processes, services, and innovation.
What this means:
SRE is an essential operating model to improve both back-end (SOR) and front-end (SOE) services and applications, aiding organizations in accelerating their digital transformation. Other models such as DevOps are supported through the SRE engineering function as it eliminates toil and improves automation across a variety of essential processes such as incident, change, configuration and capacity management.
SRE Adoption Includes Challenges and Complexities
According to Global SRE Pulse 2022, a lack of staff with the skills needed to work as an SRE is the biggest challenge organizations face, regardless of size. Eighty-five percent (85%) of survey respondents said they lacked staff with the necessary skills when implementing SRE.
Survey respondents also noted other challenges such as "value of SRE is not understood" (71%); "don't have time to implement SRE" (53%); "lack of tools in place" (55%); and "lack of management support" (44%).
Process issues and new releases create the biggest sources of toil, according to the study. For SRE team members, eliminating toil across different processes is a key point of focus. According to the survey, 27% of respondents cite process issues as the top source of toil, and another 19% cite the release of new applications. For a digital business, revenue is directly tied to the value the software provides.
What this means:
The SRE operating model and its critical success factors must be presented in a way that shows benefits and routes to value, so that the necessary upskilling or reskilling within the organization can happen. Once the practice is established, measuring its success through key performance indicators (KPIs) such as improved adherence to Service Level Agreements (SLAs) and improvements against Service Level Objectives (SLOs) can accelerate its adoption.
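As one illustration of how an SLO-based KPI might be tracked, here is a minimal sketch. It assumes an availability SLO expressed as a target fraction of successful requests over a reporting window. The names `SloReport` and `evaluate_slo`, and the request counts in the usage example, are hypothetical and do not come from the survey.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloReport:
    availability: float      # observed fraction of successful requests
    allowed_failures: float  # failures the SLO tolerates in this window
    budget_consumed: float   # share of the error budget used (can exceed 1.0)
    slo_met: bool            # True when availability is at or above the target


def evaluate_slo(total_requests: int, failed_requests: int, slo_target: float) -> SloReport:
    """Compare observed availability with an SLO target for one window."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    availability = (total_requests - failed_requests) / total_requests
    # The error budget is the share of requests the SLO allows to fail.
    allowed_failures = (1.0 - slo_target) * total_requests
    budget_consumed = failed_requests / allowed_failures
    return SloReport(
        availability=availability,
        allowed_failures=allowed_failures,
        budget_consumed=budget_consumed,
        slo_met=availability >= slo_target,
    )


if __name__ == "__main__":
    # Hypothetical window: one million requests, 400 failures, 99.9% target.
    report = evaluate_slo(total_requests=1_000_000, failed_requests=400, slo_target=0.999)
    print(f"availability={report.availability:.4%}, "
          f"budget consumed={report.budget_consumed:.0%}, met={report.slo_met}")
```

Tracking the share of error budget consumed from one window to the next is one way to show whether reliability is improving, though a real deployment would define its own indicators and windows.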
Observability and Monitoring Platforms Are in Demand
Observability and monitoring platforms are the second most adopted automation tools. Seventy-two percent (72%) of survey respondents indicated they are currently and continuously implementing observability and monitoring tools. This is a good indication that observability and monitoring strategies are starting to bear fruit.
However, effective observability requires adopting it everywhere, and today only 29% of survey respondents report using observability tools and techniques everywhere.
What this means:
Many organizations still take fragmented monitoring approaches, which yield limited insight into the performance of modern hybrid cloud applications and other business-critical resources. This fragmentation impedes progress in digital transformation. Observability should be adopted holistically so that the internal states of a system can be inferred from its outputs.
The SRE Job Market is HOT
Site Reliability Engineering continues to be a top IT job. More than 50% of respondents said they had expanded their skills and capabilities in the SRE role.
Further, 44% strongly agreed that they are more engaged and excited about their SRE role, 36% strongly agreed that they are more valued as a team member, and 34% feel more valued and appreciated.
Additionally, respondents indicated that the SRE role tends to bring higher pay. Fifty-two percent (52%) agreed, strongly or somewhat, that their compensation has improved.
What this means:
Today, SRE is an essential engineering function that offers great fulfillment, pay and opportunities to learn. As organizations adopt it more widely, there is a need, and an opportunity, for more skilled SRE professionals to help improve processes and establish a more collaborative culture across the organization.