How Observability Helps Ingest and Normalize Data for DevOps Engineers
September 08, 2021

Richard Whitehead

Humans naturally love structure. Just take books, for example. We've been ingesting and normalizing data through bookmaking since ancient times. In bookmaking, we transport, or ingest, data (in the form of text and images) from the spoken word or author's imagination to a physical structure. Covers denote the information's beginning and end, and a table of contents and chapters categorize, or normalize, the data.

The same logic applies to modern computer data. Humans prefer information that is easy to understand, and we make sense of unstructured data — whether it's text or time series data — by ingesting and normalizing it.

DevOps, SRE and other operations teams use observability solutions with AIOps to ingest and normalize data. Doing so gives them visibility into their tech stacks from a centralized system, reduces noise and puts the data in context, all of which shortens mean time to recovery (MTTR). With AI using these processes to produce actionable insights, teams are free to spend more time innovating and providing superior service assurance.

Let's explore AI's role in ingestion and normalization, and then dive into correlation and deduplication too:

How Is Data Ingested into an Observability Platform?

Solutions that provide observability with AIOps are flexible, incorporating data from a broad range of sources. These monitoring systems ingest event management data, like alerts, log events and time series data. Modern observability solutions also notify teams about system changes, which is critical considering an environmental change instigates most system failures. In the end, any data source is fair game, as long as the data tells you something about your real-time operational environment.

The data source dictates how your monitoring tool ingests the information. The preferred method is a continuous data stream, in which the source sends data as it is produced. The alternative is a pull mechanism, like the Prometheus pattern, which scrapes data at regular intervals. With older applications, you may need a creative plug-in or adapter that converts information into an accessible format and lets teams query an application or system for data.
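The two ingestion styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` and `Ingestor` names, the source names and the metric values are all hypothetical, and a real platform would run the pull step from a scheduler rather than by hand.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    source: str
    payload: Dict[str, float]


class Ingestor:
    """Hypothetical collector that accepts both pushed and pulled data."""

    def __init__(self) -> None:
        self.events: List[Event] = []

    def receive(self, source: str, payload: Dict[str, float]) -> None:
        # Push: the source calls us whenever it has new data (a stream).
        self.events.append(Event(source, payload))

    def scrape(self, source: str, read_metrics: Callable[[], Dict[str, float]]) -> None:
        # Pull: we call the source; a scheduler would repeat this at intervals.
        self.events.append(Event(source, read_metrics()))


ingestor = Ingestor()
ingestor.receive("app-stream", {"latency_ms": 120.0})        # pushed event
ingestor.scrape("legacy-db", lambda: {"connections": 42.0})  # pulled via an adapter
```

Either way, both kinds of data end up in the same central list, which is the point of ingestion.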

So why move all of this data into an observability platform? Transporting information from multiple sources and putting it into a centralized system can reveal the big picture behind the data.

How Is Data Normalized?

Once data is coming into your observability platform, it's helpful to normalize the information according to its common features. AI can extract information from unstructured data and elevate it to a feature, like a source or timestamp. These features allow you to sort or query the data or, in more sophisticated environments, apply AI-based techniques such as natural language processing (NLP).

As you normalize data, it helps to understand the incoming format and structure. If you're going to map fields and break down the message into component parts, understand what part of the message is variable and what part is static.
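As a rough sketch of field mapping, the Python below splits a raw log line into features and separates the static part of the message from the variable part. The line layout, the field names and the `<num>` placeholder are assumptions made for this example, not a format any particular platform requires.

```python
import re
from datetime import datetime
from typing import Dict, Optional

# Assumed layout: "<ISO timestamp> <source> <SEVERITY> <free-text message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<source>\S+)\s+(?P<severity>[A-Z]+)\s+(?P<message>.*)$"
)


def normalize(line: str) -> Optional[Dict[str, object]]:
    """Map one raw line to named features, or return None if it does not fit."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    message = match.group("message")
    return {
        # fromisoformat raises ValueError if the timestamp is not ISO 8601.
        "timestamp": datetime.fromisoformat(match.group("timestamp")),
        "source": match.group("source"),
        "severity": match.group("severity"),
        "message": message,
        # Static part of the message: numbers (the variable part) are masked.
        "template": re.sub(r"\d+(?:\.\d+)?", "<num>", message),
    }


record = normalize(
    "2021-09-08T10:15:00+00:00 web-01 ERROR CPU usage at 97 percent on core 3"
)
```

Here `record["template"]` keeps only the static wording, so two alerts that differ only in their numbers share one template.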

You can use enrichment techniques if data lacks a required field, an appropriate feature or other required information. Enrichment works around the missing information by finding a key in the data and using it to cross-reference an external data source.
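A minimal sketch of enrichment, assuming the event carries a `source` key and that an external inventory can be looked up by that key. The `HOST_INVENTORY` dictionary stands in for whatever external data source you actually have, and its contents are invented for the example.

```python
from typing import Dict

# Hypothetical stand-in for an external data source keyed by host name.
HOST_INVENTORY: Dict[str, Dict[str, str]] = {
    "web-01": {"service": "checkout", "owner": "payments-team"},
}


def enrich(event: Dict[str, str], inventory: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Fill in missing fields by cross-referencing the event's source."""
    enriched = dict(event)  # leave the original event untouched
    extra = inventory.get(event.get("source", ""), {})
    for key, value in extra.items():
        # Only add fields the event lacks; never overwrite what it already says.
        enriched.setdefault(key, value)
    return enriched


bare = {"source": "web-01", "message": "CPU overloaded"}
full = enrich(bare, HOST_INVENTORY)
```

After the call, `full` carries the `service` and `owner` fields that the raw event did not have.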

How Does Observability with AIOps Reduce Toil?

When you have normalized data, you can use AI to detect problems quickly through correlation and deduplication. Imagine if your system fails and you have to dig through hundreds of logs to see how the environment changed. That's time-consuming, not to mention boring.

Correlate, or group, data based on common characteristics like service, class or description field. Time is also handy operational information and serves as a practical classifier. Let's go back to our system failure. If you just made an environmental change, understanding the time the alerts came in helps pinpoint the problem.
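Grouping by a shared characteristic plus time can be sketched as follows. The alert fields (`service`, `time`) and the five-minute window are assumptions for illustration. Fixed buckets are also a simplification: two alerts just either side of a bucket boundary land in different groups, which a sliding window would avoid.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Alert = Dict[str, object]


def correlate(alerts: List[Alert], window_seconds: int = 300) -> Dict[Tuple[str, int], List[Alert]]:
    """Group alerts that share a service and fall in the same time bucket."""
    groups: Dict[Tuple[str, int], List[Alert]] = defaultdict(list)
    for alert in alerts:
        bucket = int(alert["time"] // window_seconds)  # time in seconds
        groups[(alert["service"], bucket)].append(alert)
    return dict(groups)


alerts = [
    {"service": "checkout", "time": 10, "description": "latency high"},
    {"service": "checkout", "time": 120, "description": "errors rising"},
    {"service": "checkout", "time": 900, "description": "latency high"},
    {"service": "search", "time": 30, "description": "index stale"},
]
groups = correlate(alerts)
```

The two checkout alerts that arrived within the first five minutes end up together, while the later checkout alert and the search alert each form their own group.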

Correlation can also mimic human behavior, which is a challenge for most computer systems. For example, online checkout processes are complex, with many integrated, interdependent parts. An intelligent observability tool with AIOps can use NLP to correlate alerts related to a checkout process. If the checkout process has a problem, your observability platform will group all of the alerts associated with the stem word "check," which accommodates derivations and variations like "checking," "Check," and "check out."
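To make the stem idea concrete, here is a deliberately naive sketch. The suffix-stripping `stem` function is a stand-in invented for this example, not a real NLP stemmer and not how any specific product works; a production system would use a proper stemming or lemmatization library. The sample alert texts are also made up.

```python
import re
from typing import List, Set

# Toy suffix list; a real stemmer handles far more cases.
SUFFIXES = ("ing", "ed", "s")


def stem(word: str) -> str:
    """Lowercase a word and strip one common suffix if enough letters remain."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stems(text: str) -> Set[str]:
    """Return the set of stems for every alphabetic token in the text."""
    return {stem(token) for token in re.findall(r"[A-Za-z]+", text)}


def group_by_stem(alerts: List[str], target: str) -> List[str]:
    """Keep the alerts whose text contains a word with the target stem."""
    return [alert for alert in alerts if target in stems(alert)]


alerts = [
    "Checking cart totals failed",
    "Check payment gateway timeout",
    "check out page returned 500",
    "Disk almost full",
]
related = group_by_stem(alerts, "check")
```

The first three alerts are grouped because "Checking," "Check" and "check out" all reduce to the stem "check," while the disk alert is left out.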

Let's move on to the benefits of deduplicating normalized data. You're working and, suddenly, a "CPU overloaded" alert pops up. You start fixing the issue, but another "CPU overloaded" alert hits your inbox. And it's followed by 30 more similar alerts. That's distracting and not particularly useful.

Deduplication reduces noise and minimizes incident volumes by eliminating excessive copies of the data. Instead of the monitoring system telling you that the CPU is overloaded 32 separate times, AI compresses repeated messages into one stateful message. Deduplication can seem trivial, especially compared to techniques like NLP, but the devil is in the details. The system has to distinguish a message that indicates a new issue from one that merely repeats an existing issue.
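A minimal sketch of deduplication into a stateful record. The rule used here to decide that a message is a new issue (a quiet period of 600 seconds with no repeats) is an assumption chosen for the example; real systems may rely on clear events, severity changes or learned behavior instead. Field names are likewise hypothetical.

```python
from typing import Dict, List, Tuple

Alert = Dict[str, object]


def deduplicate(alerts: List[Alert], quiet_seconds: int = 600) -> List[Alert]:
    """Fold repeats of the same (source, message) into one record with a count."""
    latest: Dict[Tuple[str, str], Alert] = {}
    incidents: List[Alert] = []
    for alert in sorted(alerts, key=lambda a: a["time"]):
        key = (alert["source"], alert["message"])
        current = latest.get(key)
        if current is not None and alert["time"] - current["last_seen"] <= quiet_seconds:
            # Same issue still going on: update state instead of adding noise.
            current["count"] += 1
            current["last_seen"] = alert["time"]
        else:
            # First sighting, or a repeat after a long silence: treat as new.
            current = {
                "source": alert["source"],
                "message": alert["message"],
                "first_seen": alert["time"],
                "last_seen": alert["time"],
                "count": 1,
            }
            latest[key] = current
            incidents.append(current)
    return incidents


# 32 "CPU overloaded" alerts ten seconds apart, then one much later.
raw = [{"source": "web-01", "message": "CPU overloaded", "time": i * 10} for i in range(32)]
raw.append({"source": "web-01", "message": "CPU overloaded", "time": 5000})
incidents = deduplicate(raw)
```

The 32 closely spaced alerts collapse into a single record with a count of 32, and the alert that arrives after the quiet period opens a second record.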

Intelligent observability with AIOps centralizes data and makes it easier for teams to understand. And when these systems detect incidents, AI-enabled correlation and deduplication minimize the impact of this unplanned work. The downstream effects on DevOps practitioners and SRE teams are significant. These teams can spend less time putting out fires and more time keeping up with the constant demand to innovate and delight customers.

Richard Whitehead is Chief Evangelist at Moogsoft