Advanced Observability Teams See Big Efficiency Gains - Part 1
May 18, 2020

George Miranda
Honeycomb.io

As our production application systems continuously increase in complexity, the challenges of understanding, debugging, and improving them keep growing by orders of magnitude. The practice of Observability addresses both the social and the technological challenges of wrangling complexity and working toward achieving production excellence. New research shows how observable systems and practices are changing the application performance management (APM) landscape.

Observability Requires Both Technical and Social Approaches

Tooling alone can't solve anything; it's just a necessary part of any solution. Tackling the challenges of managing complex production systems isn't just a technical problem, and it isn't just a social problem. We manage sociotechnical systems, and any reasonable solution must take that into account to be effective.

Observability isn't logs, metrics, and tracing. Yes, those aspects are important. Those tools can help shed light on what's happening in the systems that are critical to your business. However, there's a big difference between having tools that provide instrumentation and using them to achieve better outcomes. Many of today's tools require you to predict the future by knowing in advance what conditions to monitor, which trends to look for, or the correlations you need to make to find application performance hotspots.

The coveted observability sweet spot is finding the unknown unknowns. Observability is a sociotechnical practice that allows you to answer any arbitrary question about your environment, without needing to know ahead of time what you want to ask. However, doing that work proves more challenging for many teams, especially those weaning themselves off legacy tools.

Practicing observability is a journey. It takes time for entire teams to adopt new practices and shift mindsets to a model of shared ownership. Our new study shows how different teams are practicing observability, or intend to practice it within the next two years. The report also examines the challenges teams face and the practices they are implementing as they progress on their observability journey.

Observability Maturity Research Findings

Teams must decide how to start their observability journey. Those early decisions have a high degree of impact because they influence both tool choices and habits during the software development and delivery lifecycle. Teams that adopt recommended observability practices to an advanced degree see greater benefits than less advanced teams. Advanced teams stabilize their systems, spend less time reactively fixing issues in production, refactoring code, or resolving technical debt, and spend more time proactively innovating.

The report affirms that adopting observability tools, site reliability engineering (SRE) practices, and a culture of shared ownership translates to efficiencies across the software engineering cycle and better end-user experiences, and ultimately helps teams achieve production excellence.

Outcomes are much more pronounced when teams apply observability mindsets and processes in conjunction with tooling. That combination can lead to a virtuous cycle of reinforcement, presuming those teams are using tools purposely designed to address observability use cases. Research findings show that most teams adopt a handful of tools across disparate teams to accomplish daily tasks. Yet it's that same juggling of different tools that creates confusion, frustration, and the oft-heard complaint of tool bloat, and that ultimately leads to slower performance.

Go to Advanced Observability Teams See Big Efficiency Gains - Part 2

George Miranda is Product Marketing Director at Honeycomb.io