Downtime in a Downturn Could Mean Customer Churn

Phil Tee

The last year has been challenging for Tech. Everyone in the industry, from IT and DevOps leaders to field technicians, is grappling with recessionary pressures like inflation and rising interest rates in their personal lives. And thanks to a never-ending barrage of stories about high-profile layoffs, they are also keenly aware that Tech is experiencing an aggravated downturn.

For many IT leaders, the well-reasoned response to these stories is to locate cost-cutting opportunities in their organization. Ultimately, an economic softening will encourage managers to audit their ITOps tech stack. This is a reasonable first step since the average engineering team manages more than 16 monitoring tools alone.

However, IT leaders must ensure their tool consolidation process is strategic. After all, many solutions are mission-critical — especially during an economic downturn, when hitting key metrics like revenue and availability becomes necessary for business continuity. The best rule of thumb is to consider which tools provide actionable insights and ROI without wasting technicians' time. This benchmark for success allows leaders to cut ties with superfluous solutions and double down on those that map back to critical KPIs like system performance and operational efficiency.

An array of tools purports to maintain availability; the trick is sorting through the noise to find the right one. Let us discuss why availability is so important and then unpack the ROI of deploying Artificial Intelligence for IT Operations (AIOps) during an economic downturn.

Maintaining Availability Has Become More Important Than Ever

As of 2019, more than half of the world's GDP (60%) was digitized. That means organizations with inadequate digital infrastructure will repeatedly lose out on revenue opportunities. And in a downturn, revenue-generating opportunities are not simply competitive differentiators; they are the difference between sinking and swimming.

True, revenue is a guiding KPI regardless of macroeconomic conditions. But the recent economic softening has refocused efforts from a "growth at all costs" mindset to a "generate revenue efficiently" perspective. Now, organizations are buckling down to the basics — and providing consumers with a reliable online destination to interact with a brand and its products is downright critical.

That is where availability comes in. Availability is the glue that binds all digital interfaces together. Defined by maximum system performance and uptime, availability is achieved through rigorous behind-the-scenes engineering work. AIOps tools are an essential part of this equation because they reduce an organization's mean time to detect (MTTD) and mean time to recover (MTTR) by simplifying, collating and escalating data errors before they create downtime.

Let us use an example to illustrate the importance of reducing these "mean time to" metrics (collectively, MTTX). If a top broadcast network experiences an outage during a major sporting event, it stands to lose millions of viewers and, as a result, millions of dollars in ad revenue. But if that broadcast network has deployed AIOps, it can quickly identify the nature of the error (low MTTD) and resolve it within 30 seconds (low MTTR). Compare that resolution to a network without AIOps, which may experience an outage measured in minutes, not seconds. This extended outage could immediately cost the network millions of dollars, not to mention millions more in lost customer loyalty and damaged brand reputation.
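To make the two metrics concrete, here is a minimal Python sketch of how MTTD and MTTR might be computed from incident records. The `Incident` fields, the timestamps and the choice to measure MTTR from the start of the fault (rather than from detection) are all assumptions made for illustration; teams define these metrics in different ways.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    started: datetime    # when the fault began (hypothetical field)
    detected: datetime   # when it was first noticed (hypothetical field)
    resolved: datetime   # when service was restored (hypothetical field)


def mttd_seconds(incidents):
    """Mean time to detect: average of (detected - started), in seconds."""
    return mean((i.detected - i.started).total_seconds() for i in incidents)


def mttr_seconds(incidents):
    """Mean time to recover: average of (resolved - started), in seconds.

    Assumption: recovery time is measured from the start of the fault.
    """
    return mean((i.resolved - i.started).total_seconds() for i in incidents)


# Two made-up incidents, both starting at the same moment.
t0 = datetime(2023, 1, 1, 20, 0, 0)
incidents = [
    Incident(t0, t0 + timedelta(seconds=5), t0 + timedelta(seconds=30)),
    Incident(t0, t0 + timedelta(seconds=15), t0 + timedelta(seconds=90)),
]

print("MTTD (s):", mttd_seconds(incidents))
print("MTTR (s):", mttr_seconds(incidents))
```

Tracking both numbers over time is one way to check whether a tool actually shortens outages rather than simply adding another dashboard.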

In an economically fraught environment, the losses associated with such an outage are likely to be magnified. Hence, maintaining availability is not a luxury but a necessity.

AIOps Goes Beyond Simple Event Management

Availability, uptime and system performance are leading DevOps concerns. Unsurprisingly, many vendors advertise that their monitoring tool can improve these measures on its own, but this is not so. Monitoring tools are foundational for a tech stack, but they are fundamentally incapable of identifying and escalating data errors across all telemetry points. Only AIOps solutions that ingest disparate data from all devices, networks and tools will provide a complete overview of the incident lifecycle. Furthermore, top AIOps solutions rely on machine learning (ML) to grow with their system and fill contextual gaps.

AIOps tools are superior to point solutions because their AI-based algorithms can parse thousands of incidents to determine which are relevant. Consider that any data state change creates an incident, yet data is inherently ephemeral, and only a select few changes indicate an actual system error. AIOps tools reduce the time technicians spend combing over data by filtering out harmless events and escalating the rest to the appropriate party, all with minimal supervision.
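The filter-and-escalate idea can be sketched in a few lines of Python. This is a deliberately simple, rule-based stand-in and not the ML-driven correlation that AIOps products apply; the `Event` fields, the severity scale and the `min_severity` threshold are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str     # hypothetical: host or service that emitted the event
    check: str      # hypothetical: what was measured
    severity: int   # hypothetical scale: 0 = info ... 3 = critical


def triage(events, min_severity=2):
    """Drop low-severity events and collapse repeats of the same source/check pair."""
    seen = set()
    escalated = []
    for event in events:
        if event.severity < min_severity:
            continue  # treated as harmless noise
        key = (event.source, event.check)
        if key in seen:
            continue  # duplicate of an alert that was already escalated
        seen.add(key)
        escalated.append(event)
    return escalated


events = [
    Event("web-1", "cpu", 0),        # informational, filtered out
    Event("db-1", "disk", 3),        # escalated
    Event("db-1", "disk", 3),        # repeat, suppressed
    Event("web-2", "latency", 2),    # escalated
]

for event in triage(events):
    print(event.source, event.check, event.severity)
```

Even this crude version turns four raw events into two alerts; a production system would presumably learn the thresholds and groupings from data instead of hard-coding them.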

And when technicians need to step in, AIOps-based systems provide them with context-rich event tickets that explain the data issue in detail. That context lets technicians address the problem quickly and return to revenue-generating responsibilities like improving the user experience (UX) and driving down technical debt. During an economic softening, the ROI here is even more apparent, especially given the extended tech talent crunch that continues to leave IT and DevOps teams struggling to fill labor-related gaps.

Of course, budget cuts and hiring freezes are only natural responses to concerns about fluctuations in economic stability. But IT and DevOps leaders should carefully consider the ROI behind each solution they cut — and adopt — during an economic softening.

For example, does a solution of interest simply produce more data to interpret, or does it also understand and act on that data?

Does a solution reduce monotonous labor needs?

And, most importantly, does it provide revenue-generating opportunities like increased uptime and availability?

This line of questioning will ultimately demonstrate that certain tools are unnecessary during an economic downturn while others are more critical than ever. But, in general, leaders should treat availability as their guiding light when auditing their tech stack. Doing so will leave their organization better positioned to excel in the months ahead.
