Sumo Logic Reliability Management Released
September 13, 2022

Sumo Logic announced the general availability of Sumo Logic Reliability Management.

Reliability Management enables developers, SREs, and DevOps teams to manage the reliability of their mission-critical apps by adopting a Service Level Objective (SLO) methodology.

Reliability Management is a new capability of Sumo Logic Observability that helps organizations adopt a fundamentally better approach to measuring and improving the reliability of distributed applications. This approach focuses on the reliability of the application from the end user's perspective, instead of monitoring and alerting across all of the infrastructure, services, and application components involved in the response.

“Sumo Logic Reliability Management shifts the focus on reliability from underlying technology components towards the user experience. It enables connecting business concerns with technology behavior. By facilitating alignment between app owners and developers, Reliability Management essentially serves as the master plan required to manage at the business level versus the signal level,” said Erez Barak, VP of Product Development for Observability, Sumo Logic. “Organizations can now find the optimal cadence to balance innovation velocity with service reliability and turn SLO metrics into actionable insights to achieve their performance promises to customers.”

“Some customers are using SLOs to support the journey to the cloud to understand the risk of their transformation better. Now, with this level of detail from Sumo Logic, organizations can not only see where they are stressing infrastructure - they can see the forest for the trees,” said Torsten Volk, Managing Research Director, EMA Research. “You can stop caring about a Kubernetes node misbehaving if it has no impact on your users. You can look at your system outside in. Teams can now focus on what matters and avoid burnout. This is a much-needed piece to gain true observability.”

Sumo Logic Reliability Management has also adopted slogen, which is built on the OpenSLO standard and uses automation to minimize the effort needed to measure and set SLOs. Sumo Logic customers can choose to define SLOs either in the open source specification or in Sumo Logic.

“We are not reinventing the standard. We are getting behind the OpenSLO standard and have continued our investment in developing this standard for the community,” continued Barak. “With an open solution for service level management, we’re enabling organizations to future-proof their SLOs.”

Sumo Logic Reliability Management helps organizations to be more proactive in delivering digital services by:

- Empowering leaders to balance innovation with service reliability: Sumo Logic Reliability Management provides real-time reliability and performance metrics to power data-driven decision-making. Leaders also gain proactive alerts on SLOs and error budget consumption.

- Delivering a simple, open, and secure approach to service level management: Built on the OpenSLO standard, Reliability Management enables teams to monitor SLIs based on existing Sumo Logic queries. When combined with Terraform support, SREs can effectively manage service levels as code in a versioned and repeatable manner across any number of product and service teams.

- Making good SRE practices a reality: Enables SRE teams to uniformly adopt concepts such as SLIs, SLOs, SLAs and error budgets, and apply them to the business problem of reliability management. By automating data collection and analysis, teams also get a consistent view of SLOs and reliability across various products or services.
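
To make the SLI, SLO, and error budget terms above concrete, here is a minimal Python sketch of the usual arithmetic behind them. It is an illustration only and not Sumo Logic's implementation: the `SloWindow` class, its field names, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloWindow:
    """Hypothetical container for one SLO evaluation window (illustrative only)."""
    target: float       # SLO target as a fraction, e.g. 0.99 for 99%
    total_events: int   # all requests observed in the window
    good_events: int    # requests that met the SLI criterion

    def sli(self) -> float:
        """Service level indicator: the fraction of good events."""
        if self.total_events == 0:
            return 1.0  # assumption: no traffic counts as fully compliant
        return self.good_events / self.total_events

    def budget_remaining(self) -> float:
        """Fraction of the error budget still unspent (negative if overspent)."""
        bad = self.total_events - self.good_events
        allowed_bad = (1.0 - self.target) * self.total_events
        if allowed_bad <= 0:
            # A 100% target (or an empty window) leaves no budget to spend.
            return 0.0 if bad > 0 else 1.0
        return 1.0 - bad / allowed_bad


if __name__ == "__main__":
    window = SloWindow(target=0.99, total_events=10_000, good_events=9_950)
    print(f"SLI: {window.sli():.4f}")
    print(f"Error budget remaining: {window.budget_remaining():.0%}")
```

With a 99% target over 10,000 requests, up to about 100 requests may fail in the window. Fifty failures therefore leave roughly half of the error budget, which is the kind of figure an alert on error budget consumption would track.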

Sumo Logic Observability helps reduce downtime and solve customer-impacting issues faster with full-stack observability for all application data, including logs, metrics, events, and traces, across the entire development lifecycle.
