5 Takeaways from the Observability Forecast for Retail and eCommerce

Nic Benders
New Relic

Seeing is believing, or in this case, seeing is understanding, according to New Relic's 2025 Observability Forecast for Retail and eCommerce report. Retailers who want to provide exceptional customer experiences while improving IT operations efficiency are leaning on observability.

As economic pressures intensify and customer expectations rise, the retail industry is undergoing a reset. To protect margins and deliver seamless omnichannel experiences, retailers must improve the efficiency and reliability of their IT and digital operations while also managing the complexity created by AI. Drawing on insights from 147 retail and eCommerce leaders, this report reveals how retailers use observability and the key benefits they gain from it.

Here are five key takeaways from the report:

1. AI shapes observability priorities

The data shows that 50% of leaders identified AI as the primary driver for deploying observability platforms in the retail and eCommerce industries. As retailers adopt AI to better connect with shoppers and personalize their experiences, their digital estates grow more complex, with new models, data pipelines, and dependencies that must be monitored alongside existing applications.

Beyond AI, retailers also cite governance, risk, compliance, cost management, and customer experience management as key drivers of observability adoption, reflecting the need for end-to-end visibility across increasingly interconnected systems.

2. Outages are a costly business risk

Outages are not just IT incidents; they are a business risk that can damage a brand. The report found that 31% of retail organizations experience high-impact outages weekly. Retailers remain quicker than most industries at detecting outages, with a median time to detection of 30 minutes, yet the damage can still be devastating. The financial impact is substantial: respondents put the median cost of a critical business outage at $1 million per hour.
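As a rough illustration of how those two medians interact, the sketch below multiplies an outage's duration by an hourly cost. The function name `outage_cost` and the linear cost model are assumptions made for illustration, not a method from the report, and combining two separate medians this way gives only a ballpark figure.

```python
def outage_cost(duration_minutes: float, cost_per_hour: float = 1_000_000.0) -> float:
    """Estimate the cost in dollars of an outage, assuming cost accrues linearly.

    The default cost_per_hour is the median figure cited by respondents;
    the linear model itself is an illustrative assumption.
    """
    if duration_minutes < 0:
        raise ValueError("duration_minutes must be non-negative")
    return cost_per_hour * duration_minutes / 60.0


# Cost accrued during the median 30-minute detection window alone,
# before any time spent on diagnosis or resolution.
detection_window_cost = outage_cost(30)
print(f"${detection_window_cost:,.0f}")  # prints "$500,000"
```

Under these assumptions, roughly half a million dollars is lost before a team even knows there is a problem, which is why faster detection matters as much as faster resolution.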

Downtime, however, is only one part of the equation. Nearly 60% of respondents recognized that their engineering teams were losing innovation opportunities due to outages and incident response. Reducing incident frequency and downtime allows teams to redirect efforts toward innovation and business growth.

3. Digital experience monitoring is mission-critical

Monitoring offers insights into the digital customer experience and any issues that could impact it. To support seamless, omnichannel journeys, retail organizations are deploying a range of monitoring capabilities that keep customers engaged across every touchpoint. Specifically, they have prioritized database monitoring (67%), network monitoring (66%), alerts (65%), and dashboards (63%). Security also ranks highly, with 61% indicating they have deployed a security monitoring platform.

That focus on deeper visibility now extends to AI-driven systems, with AI monitoring adoption rising from 35% in 2024 to 55% in 2025.

4. Tool consolidation gains momentum

Retail organizations continue to consolidate observability tools to improve visibility across the software stack, prevent incidents, and increase operational efficiency. The number of tools retail organizations use dropped to 3.9 in 2025, down from 5.9 just three years ago. At the same time, complexity remains a persistent challenge, with 37% of respondents citing complex tool stacks as their primary obstacle to achieving full-stack observability. Having too many tools and the cost of those tools follow closely behind as the next most-cited obstacles. The move toward consolidation reflects a broader push to reduce tool sprawl as retailers manage increasingly distributed, omnichannel environments with fewer resources and tighter margins.

5. Observability makes life better (and provides business value)

For IT decision makers, observability delivers value beyond incident response: 41% of respondents said the technology helps them meet key performance indicators (KPIs), while 36% said it drives business strategy. In-the-trenches practitioners said it increased productivity, enabling them to find and resolve issues faster (55%), and reduced the guesswork associated with complex tech stacks (31%).

In addition, 44% of respondents said observability increases operational efficiency, and 43% reported improvements in system uptime and reliability.

Notably, observability is also delivering clear financial returns. Nearly half (46%) of retailers report an ROI of 2x or higher from their observability spend, reinforcing its role as a core business investment.

Retailers cannot afford business downtime or abandoned shopping carts due to poor customer experiences. As retailers navigate tighter margins, rising customer expectations, and increasingly complex digital environments, observability is proving essential for delivering resilience, efficiency, and business value.

Nic Benders is Chief Technical Strategist at New Relic

