No Escalations ≠ No Work: Why Visibility in DevOps Matters More Now That AI Is Accelerating Everything

April 20, 2026

Alka Malik

Ivanti


The quietest week your engineering team has ever had might also be its best.

No alarms going off. No escalations. No frantic Teams or Slack threads at 2 a.m. Everything humming along exactly as it should. And somewhere in a leadership meeting, someone looks at the metrics dashboard, sees a flat line of incidents and says: "Seems like things are pretty calm over there. Do we really need all those people?"

And there we go. It's the corporate equivalent of, "The medicine is working, so I don't need to take it anymore."

I've spent many years in engineering, and this pattern keeps repeating. The better a DevOps team gets at preventing problems, the more invisible their work becomes. And invisible work is dangerously easy to undervalue.

That was already true before AI entered the picture. Now that 90% of developers report using AI at work — a 14% jump over last year, according to Google's 2025 DORA research — the pace of change is accelerating, the volume of code is increasing and the gap between what teams are doing and what leadership can see is growing wider by the month.

Quiet Systems Are Expensive to Maintain

There's a misconception that a lack of escalations means a lack of effort. The opposite is almost always true. When production is stable, it's because someone spotted a memory leak before it cascaded. Someone else automated a failover that fired at 3 a.m. without waking anyone up. A third person spent two weeks refactoring a deployment pipeline so releases stopped breaking on Fridays.

None of that shows up in an incident report. None of it triggers a heroic war room. And none of it gets the same organizational attention as the DevOps team that spent 72 hours recovering from an outage, even though preventing the outage was harder.

What Makes AI-Assisted Teams Succeed

DORA's 2025 research identifies seven foundational capabilities that determine whether AI adoption helps or hurts an organization. Two stand out for engineering leaders managing the invisible-work problem.

A clear and communicated AI stance

When organizations establish and socialize explicit policies on how developers are expected and permitted to use AI tools, the research found — with high confidence — that AI's positive effect on individual effectiveness and organizational performance is amplified, and friction decreases. Without that clarity, developers either hold back out of fear of overstepping or use AI in ways they shouldn't. Neither is productive.

As I told my engineering leadership team recently: our mandate is non-negotiable. We must accelerate execution and productivity without compromising reliability, scalability, or security. But that mandate only works if every developer knows exactly which tools are sanctioned, what the guardrails look like and where the boundaries are. Ambiguity kills adoption.

A quality internal platform

This is a headline-worthy finding: data shows that the positive effect of AI adoption on organizational performance depends on the quality of the internal platform. When platform quality is low, AI adoption has a negligible effect on organizational performance. When platform quality is high, the effect is strong and positive. Gartner's January 2026 Platform Engineering Maturity Model reinforces this, noting that platform engineering is now foundational for speed, consistency, governance and AI readiness. Their data shows 44% of software engineering leaders report skills gaps specifically in AI, platform engineering and security.

This is why we're investing in treating our platform as a product — with a defined roadmap, developer personas and structured feedback loops — rather than a collection of tools that somebody maintains on the side.

How to Make Invisible Impact Visible

If your best work is prevention — and increasingly, if your best work involves knowing how to deploy AI effectively within a complex delivery system — you need a measurement strategy that captures it. Start with a simple, leader-owned operating rhythm:

1. Weekly: Review a small set of SLI and SLO error-budget signals and the top anomalies so you can ask "what did we catch early, before customers felt it?"

2. Monthly: Inspect trends with engineering and platform leads to connect delivery speed to stability (and agree on one improvement focus).

3. Quarterly: Tie reliability outcomes to customer and business health, then fund the highest-leverage preventative work in the roadmap.

4. Start here: Pick one customer-critical service, define 1–3 SLIs, and publish an SLO dashboard that leadership reviews on a fixed calendar cadence.
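To make the weekly error-budget review concrete, here is a minimal sketch of the arithmetic behind it. It assumes a request-based SLI (good requests divided by total requests); the `SliWindow` shape, the function name and the sample numbers are hypothetical illustrations, not taken from any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class SliWindow:
    """Request counts for one service over one review window (hypothetical shape)."""
    good_events: int   # requests that met the SLI definition, e.g. fast and successful
    total_events: int  # all valid requests in the same window


def error_budget_report(window: SliWindow, slo_target: float) -> dict:
    """Return the measured SLI and the share of the error budget consumed."""
    if window.total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    sli = window.good_events / window.total_events
    # The error budget is the number of bad events the SLO tolerates.
    allowed_bad = (1.0 - slo_target) * window.total_events
    actual_bad = window.total_events - window.good_events
    consumed = actual_bad / allowed_bad
    return {
        "sli": sli,
        "allowed_bad_events": allowed_bad,
        "actual_bad_events": actual_bad,
        "budget_consumed": consumed,
        "budget_remaining": 1.0 - consumed,
    }


if __name__ == "__main__":
    # Illustrative numbers only: one million requests, 300 of them bad.
    week = SliWindow(good_events=999_700, total_events=1_000_000)
    for key, value in error_budget_report(week, slo_target=0.9995).items():
        print(f"{key}: {value:.4f}")
```

A week that consumes well under its budget with no escalations is exactly the "what did we catch early?" conversation: the remaining budget is a number leadership can read without opening a deployment log.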

SLOs and SLIs tied to customer health

Once you have delivery baselines, service level objectives and service level indicators anchor your team's performance to something leadership already cares about: customer experience. When your SLO dashboard shows 99.95% availability over the last quarter, that number reflects hundreds of small interventions that kept it there. Tie SLIs to business metrics wherever possible. Latency on checkout flow. Error rates on API calls from your biggest customers. Response time on authentication. These make proactive work legible to people who don't read deployment logs.
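It can help to translate an availability figure into minutes, since that is the unit leadership tends to feel. The sketch below assumes a time-based availability SLO and a 90-day quarter; both are simplifying assumptions, and the function name is hypothetical.

```python
def allowed_downtime_minutes(slo_target: float, window_days: int) -> float:
    """Minutes of downtime a time-based availability SLO tolerates in a window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


if __name__ == "__main__":
    # Assumption: a quarter is treated as 90 days.
    budget = allowed_downtime_minutes(0.9995, window_days=90)
    print(f"99.95% over 90 days allows about {budget:.1f} minutes of downtime")
```

Under those assumptions, 99.95% over a quarter leaves roughly 64.8 minutes of total downtime, which is a useful way to show how little room for error the team is actually working with.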

Put simply, anomaly detection is the early-warning layer that makes prevention show up in weekly reviews. It surfaces weak signals before customers feel them and turns "nothing happened" into a measurable outcome.
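As a rough illustration of what an early-warning layer does, the sketch below flags points in a metric series that deviate sharply from their recent history using a rolling z-score. This is one simple technique among many, not a description of any specific product; the window size, threshold and latency values are made-up assumptions.

```python
from statistics import mean, stdev


def find_anomalies(values, window=12, threshold=3.0):
    """Flag points more than `threshold` standard deviations away from the
    mean of the preceding `window` points. Returns (index, value, z) tuples."""
    if window < 2:
        raise ValueError("window must be at least 2 to compute a deviation")
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as anomalous
            # instead of dividing by zero.
            if values[i] != mu:
                anomalies.append((i, values[i], float("inf")))
            continue
        z = (values[i] - mu) / sigma
        if abs(z) > threshold:
            anomalies.append((i, values[i], z))
    return anomalies


if __name__ == "__main__":
    # Hypothetical p95 latency samples in milliseconds, with one spike.
    latency_ms = [100, 102, 99, 101, 100, 98, 103, 100, 101, 99, 100, 102, 101, 180, 100]
    for index, value, z in find_anomalies(latency_ms):
        print(f"sample {index}: {value} ms (z-score {z:.1f})")
```

Each flagged point that was investigated and resolved before it became an incident is a candidate line item for the weekly review.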

Fire Prevention > Firefighting

In my experience, engineering culture seems to have a hero problem. We celebrate the person who stayed up all night fixing a production outage. We rarely celebrate the person who spent a quiet Tuesday tuning alerts, so the outage never happened.

This isn't abstract. Gartner research indicates that 87% of businesses experience revenue decreases for every hour of downtime. The proactive work that prevents those hours from happening is worth real money. But if your promotion criteria and performance reviews only capture incident response, you're incentivizing the wrong behavior. You're telling your team: let things break, then fix them heroically.

A few concrete shifts help:

Include prevention metrics in performance reviews. How many incidents did this person's work prevent? What reliability improvements did they drive? How did their automation reduce manual toil for the team?
Make proactive work a first-class citizen in sprint planning. Reliability engineering, observability improvements and documentation shouldn't be "tech debt" items that get deprioritized every cycle. Build them into the plan with the same weight as feature work.
Report on what didn't happen. Sounds counterintuitive, but framing quarterly reviews around "here's what our monitoring caught before it reached customers" is powerful. It puts the invisible work into a context that leadership understands.
Measure AI's actual delivery impact, not just its perceived productivity boost. Given the gap between how productive developers feel when using AI and what the delivery metrics show, track both. Perception data is valuable. So are cycle time, change failure rate and rework rate. If perception and delivery metrics diverge, you have a coaching opportunity (as opposed to a tool problem).
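The last shift above can be sketched as a small comparison between perception data and delivery metrics. Everything here is a hypothetical illustration: the `Change` record, the definitions of cycle time and rework, and the divergence rule are assumptions a team would need to adapt to its own tooling and definitions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Change:
    """One deployed change (hypothetical record shape)."""
    started: datetime     # assumed definition: first commit on the change
    deployed: datetime    # when the change reached production
    caused_failure: bool  # led to an incident, rollback or hotfix
    is_rework: bool       # unplanned fix for a recently shipped change


def delivery_metrics(changes):
    """Compute median cycle time, change failure rate and rework rate."""
    if not changes:
        raise ValueError("at least one change is required")
    hours = [(c.deployed - c.started).total_seconds() / 3600 for c in changes]
    count = len(changes)
    return {
        "median_cycle_time_hours": median(hours),
        "change_failure_rate": sum(c.caused_failure for c in changes) / count,
        "rework_rate": sum(c.is_rework for c in changes) / count,
    }


def perception_diverges(perceived_delta, before, after):
    """True when developers report feeling more productive (positive survey
    delta) while failure rate or rework rate got worse between periods."""
    got_worse = (
        after["change_failure_rate"] > before["change_failure_rate"]
        or after["rework_rate"] > before["rework_rate"]
    )
    return perceived_delta > 0 and got_worse


if __name__ == "__main__":
    t0 = datetime(2026, 1, 5, 9, 0)  # arbitrary illustrative start time

    def change(hours, failed=False, rework=False):
        return Change(t0, t0 + timedelta(hours=hours), failed, rework)

    before = delivery_metrics([change(24), change(48), change(72, failed=True), change(48, rework=True)])
    after = delivery_metrics([change(12, failed=True), change(24, failed=True, rework=True),
                              change(24, rework=True), change(12)])
    print(before)
    print(after)
    print("coaching conversation needed:", perception_diverges(0.2, before, after))
```

In this made-up example, cycle time improves while failure and rework rates double, which is the pattern worth discussing with the team rather than blaming on the tool.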

Your Quiet Dashboard Is Really Shouting at You

A flat incident graph and an empty escalation queue aren't signs that your DevOps team has too little to do. They're evidence that your team is doing exactly the right things — and doing them well.

The engineering leader's job is to make that evidence visible. Not to justify headcount or pad reports, but because the work of prevention deserves the same organizational recognition as the work of response. And as AI accelerates the pace of delivery — generating more code, shipping more changes and creating more surface area for things to go wrong — the value of prevention rises while attribution gets harder. It's so important (and valuable!) for teams to prevent problems before those problems reach customers, and to measure that work in ways leadership can actually see.

Google CEO Sundar Pichai's mentor, the late Bill Campbell, used to ask him one question every week: "What ties did you break this week?" As engineering leaders, maybe we should be asking ourselves a different version of that question: What fires did we prevent this week — and can we prove it?

The answer makes a difference. Make sure it's on record.

Alka Malik is SVP of Engineering at Ivanti


