How AI Can Turbocharge Your Observability Practice

Mimi Shalash
Splunk

AI has transformed technologies, workflows and entire industries, reshaping how people scale performance analysis. Organizations are seeing that AI has the potential to dramatically strengthen innovation and employee productivity by automating manual tasks and quickly extracting valuable insights. This rapid enterprise adoption shows no sign of stopping: global AI tool users are expected to reach 729 million by 2030, up from 314 million in 2024.

AI's Growing Impact on Observability

As AI improves and strengthens product innovations and technology functions, it is also reshaping the observability space. Observability, a practice used by ITOps and engineering teams to improve digital resilience by lowering the cost of unplanned downtime, provides greater visibility across data, workflows and infrastructure as a whole. Just because a server is happy doesn't mean customers are happy. Observability helps translate technical stability into customer satisfaction and business success, and AI amplifies this by driving continuous improvement at scale.

Defining what good looks like can be challenging for customers and requires time and effort. For example, developers often rely on historical data to determine whether an API call should take 10 or 100 milliseconds, then observe performance and set alerts based on manual thresholds. With AI, developers can automate these tasks by analyzing data at scale to detect patterns and predict optimal performance, lifting the burden from teams.
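To make the idea of replacing a manual threshold concrete, here is a minimal sketch that derives an alert limit from historical latency data. It uses a simple robust statistic (median plus a multiple of the median absolute deviation) as a stand-in for the pattern detection an AI-driven platform would perform. The function names, the multiplier `k` and the sample values are assumptions made for illustration and do not come from any specific product.

```python
from statistics import median


def adaptive_threshold(history_ms, k=3.0):
    """Derive a latency alert limit from historical samples.

    Assumption: median + k * scaled MAD is used as a simple, outlier-resistant
    baseline. A production system would likely use a richer model.
    """
    med = median(history_ms)
    # Median absolute deviation: typical distance of a sample from the median.
    mad = median(abs(x - med) for x in history_ms)
    # 1.4826 scales the MAD so it is comparable to a standard deviation
    # for normally distributed data.
    return med + k * 1.4826 * mad


def find_anomalies(history_ms, recent_ms, k=3.0):
    """Return the recent samples that exceed the learned limit."""
    limit = adaptive_threshold(history_ms, k)
    return [x for x in recent_ms if x > limit]


if __name__ == "__main__":
    history = [10, 12, 11, 13, 12, 11, 10, 12]  # hypothetical API latencies in ms
    print(find_anomalies(history, [12, 13, 100]))
```

With the hypothetical history above, only the 100 ms call is flagged. Note that if the history is perfectly flat, the deviation is zero and anything above the median is flagged, so a real deployment would want a minimum spread.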

Reduce Noise Through AIOps

AIOps, or artificial intelligence for IT operations, is a common way that AI is integrated into observability and a natural next step in mature practices. The main goals of AIOps are to accelerate detection, investigation and response times, increasing efficiency and reducing costs. It achieves this by applying machine learning models to intelligently group otherwise noisy alerts from different tools. For example, integrated ML allows teams to identify anomalies across multiple third-party systems and surface potential downstream impacts, such as increased CPU usage and database latency, that might not otherwise have crossed manual alert thresholds.
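The grouping step can be illustrated with a small sketch. It uses a plain rule (same service, close together in time) rather than a machine learning model, so it shows only the shape of the problem, not how an AIOps product solves it. The `Alert` fields, the tool names and the five-minute window are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    source: str       # hypothetical name of the tool that raised the alert
    service: str      # service the alert refers to
    timestamp: float  # seconds since some reference point


def group_alerts(alerts, window_s=300.0):
    """Group alerts for the same service that arrive within window_s of each other.

    Assumption: a rule-based stand-in for the correlation an ML model would learn.
    """
    groups = []
    open_group = {}  # service -> most recent group for that service
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = open_group.get(alert.service)
        if current and alert.timestamp - current[-1].timestamp <= window_s:
            current.append(alert)  # close enough in time: same incident
        else:
            current = [alert]      # otherwise start a new group
            groups.append(current)
            open_group[alert.service] = current
    return groups


if __name__ == "__main__":
    alerts = [
        Alert("apm", "checkout", 0),
        Alert("infra", "checkout", 60),
        Alert("db", "payments", 90),
        Alert("apm", "checkout", 1000),
    ]
    for group in group_alerts(alerts):
        print(group[0].service, [a.source for a in group])
```

In this example, four raw alerts collapse into three groups: the two checkout alerts raised a minute apart by different tools are treated as one incident.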

Surface Insights and Accelerate Investigations Through AI Assistants

Another way organizations can strengthen their observability practice is by incorporating AI assistants. By embedding generative AI into workflows, ITOps and engineering teams can reduce the learning curve for non-expert users and troubleshoot faster. Natural language processing (NLP) addresses key challenges such as the lack of context for troubleshooting and slow root cause analysis that is often delayed by reliance on tribal knowledge. AI assistants, with intuitive commands and a low barrier to entry, can now answer environment-specific questions, ranging from "How many services are running?" to "What was the highest response time on the checkout service at the world's leading T-shirt company yesterday?" This improves accessibility, speeds up troubleshooting and drives more efficient decision-making.

Predict and Mitigate Downtime

AI not only drives time savings but also delivers cost reductions. The impact of unplanned downtime goes beyond immediate financial costs and has a lasting effect on a company's shareholder value, brand reputation, innovation velocity and customer trust. Research has shown that 40% of Chief Marketing Officers (CMOs) say downtime impacts customer lifetime value (CLV) and damages reseller and/or partner relationships.

By leveraging AI, companies can proactively minimize downtime and ultimately protect their bottom line. Organizations rely on digital platforms that handle millions of transactions daily, and performance depends on teams that can adjust resources dynamically, preventing issues before they impact the business.

For example, once recurring patterns of performance degradation linked to high call center volume have been identified, AI models can help forecast when the system is likely to experience strain that could lead to customer frustration and churn. With the right insights at the right time, teams can redistribute workloads or fine-tune application configurations before issues occur.
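As a rough sketch of this kind of forecast, the example below averages historical load by hour of day and reports the hours in which expected load approaches capacity. A simple hourly average stands in for a real forecasting model, and the sample data, the `capacity` value and the 80% `headroom` factor are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def hourly_profile(samples):
    """Average load per hour of day from (hour_of_day, load) samples."""
    buckets = defaultdict(list)
    for hour, load in samples:
        buckets[hour].append(load)
    return {hour: mean(loads) for hour, loads in buckets.items()}


def hours_at_risk(samples, capacity, headroom=0.8):
    """Return the hours whose expected load reaches headroom * capacity.

    Assumption: the recurring pattern is daily, so hour of day is the only
    feature. A real model would also account for weekday, season and trend.
    """
    limit = capacity * headroom
    profile = hourly_profile(samples)
    return sorted(hour for hour, load in profile.items() if load >= limit)


if __name__ == "__main__":
    # Hypothetical (hour_of_day, concurrent_requests) observations.
    samples = [(9, 500), (9, 700), (13, 900), (13, 880), (20, 200)]
    print(hours_at_risk(samples, capacity=1000))
```

Here only the 13:00 hour is reported, which gives a team a window in which to redistribute workloads or adjust configuration before the strain recurs.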

Complement Human Thinking

AI has a profound ability to complement human decision-making by delivering speed and precision at a scale people cannot match. However, it lacks the common sense and nuanced judgment that only human intelligence can provide. For ITOps and engineering teams, a single decision can have a big impact on observability outcomes and ripple into the business. To ensure a strategic approach to decision-making, ITOps and engineering teams can treat AI as a partner: AI accelerates insights while human reasoning ensures those insights are applied with context.

In summary, AI's ability to rapidly analyze vast amounts of data, detect anomalies and automate tasks is transforming not only observability but also the people and processes that make up the practice. While the future holds many possibilities, one thing is clear: as AI becomes a core pillar of observability best practices, it will redefine how we ensure resiliency.

Mimi Shalash is Observability Advisor at Splunk, a Cisco company

