How Observability Helps Ingest and Normalize Data for DevOps Engineers
September 08, 2021

Richard Whitehead
Moogsoft

Humans naturally love structure. Just take books, for example. We've been ingesting and normalizing data through bookmaking since ancient times. In bookmaking, we transport, or ingest, data (in the form of text and images) from the spoken word or author's imagination to a physical structure. Covers denote the information's beginning and end, and a table of contents and chapters categorize, or normalize, the data.

The same logic applies to modern computer data. Humans prefer information that is easy to understand, and we make sense of unstructured data — whether it's text or time series data — by ingesting and normalizing it.

DevOps, SRE and other operations teams use observability solutions with AIOps to ingest and normalize data. Doing so gives them visibility into tech stacks from a centralized system, reduces noise and supplies the context needed for a shorter mean time to recovery (MTTR). With AI using these processes to produce actionable insights, teams are free to spend more time innovating and providing superior service assurance.

Let's explore AI's role in ingestion and normalization, and then dive into correlation and deduplication.

How Is Data Ingested into an Observability Platform?

Solutions that provide observability with AIOps are flexible, incorporating data from a broad range of sources. These monitoring systems ingest event management data, like alerts, log events and time series data. Modern observability solutions also notify teams about system changes, which is critical because environmental changes trigger most system failures. In the end, any data source is fair game, as long as the data tells you something about your real-time operational environment.

The data source dictates how your monitoring tool ingests the information. The preferred method is a continuous data stream. The alternative is a pull mechanism, such as the Prometheus pattern, which scrapes data at regular intervals. For older applications, you may need a plug-in or adapter that converts information into an accessible format and lets teams query the application or system for data.
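The two ingestion modes can be sketched in a few lines of Python. This is a minimal illustration and not any vendor's API: `Collector`, `push`, `pull_once` and `scrape_legacy_app` are hypothetical names. A real pull collector would run on a timer and fetch data over the network.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]


class Collector:
    """Hypothetical central store that accepts events by push or by pull."""

    def __init__(self) -> None:
        self.events: List[Event] = []

    def push(self, event: Event) -> None:
        # Streaming mode: the source sends each event as it happens.
        self.events.append(event)

    def pull_once(self, scrape: Callable[[], List[Event]]) -> int:
        # Pull mode: the collector asks the source for its current data.
        # A scheduler would call this at a regular interval.
        batch = scrape()
        self.events.extend(batch)
        return len(batch)


def scrape_legacy_app() -> List[Event]:
    # Stand-in for an adapter that queries an older application.
    return [{"source": "legacy-app", "metric": "queue_depth", "value": 17}]


collector = Collector()
collector.push({"source": "web-1", "type": "alert", "message": "CPU overloaded"})
pulled = collector.pull_once(scrape_legacy_app)
```

Either way, the events end up in the same central list, which is what makes the later normalization and correlation steps possible.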

So why move all of this data into an observability platform? Transporting information from multiple sources and putting it into a centralized system can reveal the big picture behind the data.

How Is Data Normalized?

Once data is coming into your observability platform, it's helpful to normalize the information according to its common features. AI can extract information from unstructured data and elevate it to a feature, like a source or timestamp. These features allow you to sort or query the data or, in more sophisticated environments, apply AI-based techniques such as natural language processing (NLP).

As you normalize data, it helps to understand the incoming format and structure. If you're going to map fields and break down the message into component parts, understand what part of the message is variable and what part is static.
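As a rough illustration, the sketch below promotes parts of a raw log line to named features and separates the static part of the message from the variable part. It uses a plain regular expression rather than AI, and the line format, the field names and the `<num>` placeholder are all assumptions made for this example.

```python
import re
from datetime import datetime
from typing import Dict, Optional

# Assumed raw format: "<ISO timestamp> <host> <SEVERITY>: <free text>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<source>\S+)\s+(?P<severity>[A-Z]+):\s+(?P<description>.*)$"
)


def normalize(line: str) -> Optional[Dict[str, object]]:
    """Turn one raw line into a dict of features, or None if it does not fit."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    try:
        timestamp = datetime.fromisoformat(match["timestamp"])
    except ValueError:
        return None
    description = match["description"]
    return {
        "timestamp": timestamp,
        "source": match["source"],
        "severity": match["severity"],
        "description": description,
        # Replace the variable parts (numbers) to expose the static template.
        "template": re.sub(r"\d+", "<num>", description),
    }


event = normalize("2021-09-08T10:15:00+00:00 db-2 WARN: Disk usage at 91% on volume 3")
```

Two messages that differ only in their numbers now share one `template` value, which gives later steps a stable field to sort, query or group on.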

You can use enrichment techniques if data doesn't have a required field, appropriate feature or required information. Enrichment skirts the lack of information by finding a key to cross-reference with an external data source.
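A minimal sketch of enrichment, assuming the event's `source` field is the key and the external data source is a simple in-memory lookup table. `INVENTORY`, its fields and the `enrich` function are hypothetical. In practice the lookup might be a configuration database or an asset inventory.

```python
from typing import Dict

# Hypothetical external data source, keyed by host name.
INVENTORY: Dict[str, Dict[str, str]] = {
    "db-2": {"service": "checkout", "owner": "payments-team"},
}


def enrich(event: Dict[str, object], inventory: Dict[str, Dict[str, str]]) -> Dict[str, object]:
    """Fill in missing fields by cross-referencing the event's source."""
    extra = inventory.get(event.get("source"), {})
    # Fields already on the event win; enrichment only fills the gaps.
    return {**extra, **event}


enriched = enrich({"source": "db-2", "description": "Disk usage at 91%"}, INVENTORY)
unknown = enrich({"source": "cache-9", "description": "Eviction spike"}, INVENTORY)
```

An event from a host that is missing from the table passes through unchanged, so enrichment never blocks ingestion.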

How Does Observability with AIOps Reduce Toil?

When you have normalized data, you can use AI to detect problems quickly through correlation and deduplication. Imagine if your system fails and you have to dig through hundreds of logs to see how the environment changed. That's time-consuming, not to mention boring.

Correlate, or group, data based on common characteristics like service, class or description field. Time is also handy operational information and serves as a practical classifier. Let's go back to our system failure. If you just made an environmental change, understanding the time the alerts came in helps pinpoint the problem.
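One simple way to combine a shared field with time is to group alerts that name the same service and arrive close together. The sketch below is an assumption-laden illustration and not how any particular product correlates: the five-minute window, the `service` and `timestamp` field names and the `correlate` function are all chosen for the example.

```python
from datetime import datetime, timedelta
from typing import Dict, List

Alert = Dict[str, object]


def correlate(alerts: List[Alert], window: timedelta = timedelta(minutes=5)) -> List[List[Alert]]:
    """Group alerts that share a service and arrive within `window` of each other."""
    groups: List[List[Alert]] = []
    latest: Dict[object, List[Alert]] = {}  # service -> most recent group
    for alert in sorted(alerts, key=lambda a: a["timestamp"]):
        group = latest.get(alert["service"])
        if group and alert["timestamp"] - group[-1]["timestamp"] <= window:
            group.append(alert)
        else:
            # Too late to belong to the previous group: start a new one.
            group = [alert]
            groups.append(group)
            latest[alert["service"]] = group
    return groups


alerts = [
    {"service": "checkout", "timestamp": datetime(2021, 9, 8, 10, 0)},
    {"service": "search", "timestamp": datetime(2021, 9, 8, 10, 1)},
    {"service": "checkout", "timestamp": datetime(2021, 9, 8, 10, 3)},
    {"service": "checkout", "timestamp": datetime(2021, 9, 8, 10, 30)},
]
groups = correlate(alerts)
```

Here the two checkout alerts that arrive three minutes apart land in one group, while the checkout alert half an hour later starts a group of its own.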

Correlation can also mimic human behavior, which is a challenge for most computer systems. For example, online checkout processes are complex, with many integrated, interdependent parts. An intelligent observability tool with AIOps can use NLP to correlate alerts related to a checkout process. If checkout runs into trouble, your observability platform will group all of the alerts associated with the stem word "check," which accommodates derivations and variations like "checking," "Check," and "check out."
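To make the stem-word idea concrete, here is a deliberately crude sketch. It swaps a real NLP stemmer for lowercasing plus a three-suffix rule, so it is only a stand-in for what a production tool would do. The `stem` and `mentions_stem` functions and the sample alert texts are hypothetical.

```python
import re
from typing import List

SUFFIXES = ("ing", "ed", "s")  # toy rule set, far smaller than a real stemmer


def stem(word: str) -> str:
    """Lowercase a word and strip one common suffix, keeping at least 3 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def mentions_stem(text: str, target: str) -> bool:
    """True if any word in the text reduces to the target stem."""
    return any(stem(token) == target for token in re.findall(r"[A-Za-z]+", text))


alert_texts: List[str] = [
    "Checking payment gateway timed out",
    "Check failed for cart service",
    "check out latency high",
    "Disk full on db-2",
]
related = [text for text in alert_texts if mentions_stem(text, "check")]
```

This toy rule treats the single word "checkout" as its own stem and would miss it, which is one reason real tools rely on fuller language models.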

Let's move on to the benefits of deduplicating normalized data. You're working and, suddenly, a "CPU overloaded" alert pops up. You start fixing the issue, but another "CPU overloaded" alert hits your inbox. And it's followed by 30 more similar alerts. That's distracting and not particularly useful.

Deduplication reduces noise and minimizes incident volumes by eliminating excessive copies of the data. Instead of the monitoring system telling you that the CPU is overloaded 32 separate times, AI compresses repeated messages into one stateful message. Deduplication can seem trivial, especially compared to techniques like NLP, but the devil is in the details. The system must be able to tell a message that signals a new issue from one that merely repeats an earlier message.
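The compression into one stateful message can be sketched as follows. This is a simplified illustration under stated assumptions: alerts are treated as duplicates when `source` and `description` match exactly, timestamps are plain integers, and the `count`, `first_seen` and `last_seen` fields are names invented for the example.

```python
from typing import Dict, List, Tuple

Alert = Dict[str, object]


def deduplicate(alerts: List[Alert]) -> List[Alert]:
    """Collapse repeated alerts into one record with a count and time range."""
    state: Dict[Tuple[object, object], Alert] = {}
    for alert in alerts:
        key = (alert["source"], alert["description"])
        if key in state:
            # Repeat of a known alert: update the existing record.
            state[key]["count"] += 1
            state[key]["last_seen"] = alert["timestamp"]
        else:
            state[key] = {
                "source": alert["source"],
                "description": alert["description"],
                "count": 1,
                "first_seen": alert["timestamp"],
                "last_seen": alert["timestamp"],
            }
    return list(state.values())


# 32 identical alerts from one host, plus one from a different host.
raw = [{"source": "web-1", "description": "CPU overloaded", "timestamp": t} for t in range(32)]
raw.append({"source": "web-2", "description": "CPU overloaded", "timestamp": 40})
deduped = deduplicate(raw)
```

The exact-match key is the part that hides the hard problem. This sketch never closes a record, so an overload that recurs hours after the first one was fixed would still be counted as a repeat rather than a new issue.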

Intelligent observability with AIOps centralizes data and makes it easier for teams to understand. And when these systems detect incidents, AI-enabled correlation and deduplication minimize the impact of this unplanned work. The downstream effects on DevOps practitioners and SRE teams are significant. These teams can spend less time putting out fires and more time keeping up with the constant demand to innovate and delight customers.

Richard Whitehead is Chief Evangelist at Moogsoft