
Data Downtime Nearly Doubled Year Over Year

Data downtime — periods of time when an organization's data is missing, wrong or otherwise inaccurate — nearly doubled year over year (1.89x), according to the State of Data Quality report from Monte Carlo.


The Wakefield Research survey, which was commissioned by Monte Carlo and polled 200 data professionals in March 2023, found that three critical factors contributed to this increase in data downtime. These factors included:

■ A rise in monthly data incidents, from 59 in 2022 to 67 in 2023.

■ 68% of respondents reported an average time to detection for data incidents of four hours or more, up from 62% in 2022.

■ A 166% increase in average time to resolution, rising to an average of 15 hours per incident across respondents.
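The year-over-year figures in the list above can be sanity-checked with a few lines of arithmetic. The sketch below is not from the report; it assumes "a 166% increase" means the 2023 value equals the 2022 value multiplied by 2.66, and the variable names are illustrative only.

```python
# Figures quoted in the survey summary above.
incidents_2022 = 59
incidents_2023 = 67
ttr_2023_hours = 15.0      # average time to resolution in 2023
ttr_increase_pct = 166.0   # reported year-over-year increase

# Relative growth in monthly incidents.
incident_growth_pct = (incidents_2023 - incidents_2022) / incidents_2022 * 100

# Assumption: "166% increase" is read as new = old * (1 + 1.66),
# so the implied 2022 baseline is the 2023 value divided by 2.66.
implied_ttr_2022_hours = ttr_2023_hours / (1 + ttr_increase_pct / 100)

print(f"Incident growth: {incident_growth_pct:.1f}%")                         # 13.6%
print(f"Implied 2022 time to resolution: {implied_ttr_2022_hours:.1f} hours")  # 5.6 hours
```

Under that reading, incident volume grew by roughly 14%, while time to resolution rose from an implied baseline of about 5.6 hours to 15 hours. Of the two figures checked here, the change in resolution time is much the larger.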

More than half of respondents reported that 25% or more of revenue was affected by data quality issues. The average percentage of impacted revenue rose to 31%, up from 26% in 2022. Additionally, an astounding 74% reported that business stakeholders identify issues first "all or most of the time," up from 47% in 2022.

These findings suggest data quality remains among the biggest problems facing data teams, with bad data having more severe repercussions on an organization's revenue and data trust than in years prior.

The survey also suggests data teams are making a tradeoff between data downtime and the amount of time spent on data quality as their datasets grow.

For instance, organizations with fewer tables reported spending less time on data quality than their peers with more tables, but their average time to detection and average time to resolution were comparatively higher. Conversely, organizations with more tables reported lower average time to detection and average time to resolution, but spent a greater percentage of their team's time achieving them.

■ Respondents who spent more than 50% of their time on data quality had more tables (average 2,571) than respondents who spent less than 50% of their time on data quality (average 208).

■ Respondents who took less than 4 hours to detect an issue had more tables (average 1,269) than those who took longer than 4 hours to detect an issue (average 346).

■ Respondents who took less than 4 hours to resolve an issue had more tables (average 1,172) than those who took longer than 4 hours to resolve an issue (average 330).

"These results show teams having to make a lose-lose choice between spending too much time solving for data quality or suffering adverse consequences to their bottom line," said Barr Moses, CEO and co-founder of Monte Carlo. "In this economic climate, it's more urgent than ever for data leaders to turn this lose-lose into a win-win by leveraging data quality solutions that will lower BOTH the amount of time teams spend tackling data downtime and mitigating its consequences. As an industry, we need to prioritize data trust to optimize the potential of our data investments."

The survey revealed additional insights on the state of data quality management, including:

■ 50% of respondents reported data engineering is primarily responsible for data quality, compared to:
- 22% for data analysts
- 9% for software engineering
- 7% for data reliability engineering
- 6% for analytics engineering
- 5% for the data governance team
- 3% for non-technical business stakeholders

■ Respondents averaged 642 tables across their data lake, lakehouse, or warehouse environments.

■ Respondents reported having an average of 24 dbt models, and 41% reported having 25 or more dbt models.

■ Respondents averaged 290 manually written tests across their data pipelines.

■ The number one reason for launching a data quality initiative was that the data organization identified data quality as a need (28%), followed by a migration or modernization of the data platform or systems (23%).

"Data testing remains data engineers' number one defense against data quality issues — and that's clearly not cutting it," said Lior Gavish, Monte Carlo CTO and Co-Founder. "Incidents fall through the cracks, stakeholders are the first to identify problems, and teams fall further behind. Leaning into more robust incident management processes and automated, ML-driven approaches like data observability is the future of data engineering at scale."

