
The observability industry has an evolving relationship with AI. We're not skeptics, but it's clear that trust in AI must be earned. Grafana Labs recently surveyed 1,300+ practitioners around the world for its annual Observability Survey, and 92% said they see real value in AI surfacing anomalies before they cause downtime. Another 91% endorsed AI for forecasting and root cause analysis. The demand is there, but customers need AI to be trustworthy: the survey also found that the practitioners most enthusiastic about AI are the most insistent on explainability.
That's not a contradiction; it's a signal. The people closest to this technology, the ones who actually want to use it, are the ones most loudly demanding that it show its work: 95% of respondents said it's important for AI to explain its reasoning.
We've spent years in observability fighting alert fatigue: the flood of signals with no clear context, no prioritization, no "here's why this matters." Alert fatigue is still the single biggest obstacle to faster incident response, cited by nearly a third of practitioners in our survey. AI that behaves like a black box doesn't solve that problem. It compounds it. You've traded one source of noise for another.
The most common barrier to AI adoption in our survey wasn't cost, and it wasn't technical complexity; it was the amount of required context that has to be entered manually. In other words, practitioners are being asked to do significant work just to make the AI useful. If AI is creating new toil in place of old toil, we haven't made progress; we've just moved the bottleneck.
What practitioners actually want is AI that reduces the cognitive load of on-call work, not AI that adds to it. They want a system that can say: here is the anomaly, here is why I flagged it, here is the likely cause, and here is what I'd recommend, with the reasoning visible at every step. That may not be surprising, but the question of autonomy is where things get interesting: 77% of respondents support AI taking autonomous actions, while 15% don't yet trust AI to act on their behalf and another 8% see no value in it at all. That's a meaningful pocket of resistance, and it deserves to be taken seriously rather than steamrolled by hype.
The path to autonomous AI in observability runs directly through explainability. You cannot ask a team to trust a system that won't explain how it reached its conclusion. Especially not in incident response, where the cost of a wrong call (a missed alert, a misdiagnosed root cause, an automated action that makes things worse) is measured in downtime, revenue, and team trust.
The vendors and teams that get this right won't be the ones with the most sophisticated models. They'll be the ones who treat explainability as a first-class engineering requirement, not an afterthought. The AI that wins in observability will be the AI that practitioners can actually reason about, override when necessary, and learn from over time.