Data Engineers Spend 2 Days Per Week Firefighting Bad Data

Data professionals spend 40% of their time evaluating or checking data quality, and poor data quality impacts 26% of their companies' revenue, according to The State of Data Quality 2022, a report commissioned by Monte Carlo and conducted by Wakefield Research.

The survey found that 75% of participants take four or more hours to detect a data quality incident and about half said it takes an average of nine hours to resolve the issue once identified. Worse, 58% said the total number of incidents has increased somewhat or greatly over the past year, often as a result of more complex pipelines, bigger data teams, greater volumes of data, and other factors.

Today, the average organization experiences about 61 data-related incidents per month, each of which takes an average of 13 hours to identify and resolve. This adds up to an average of about 793 hours per month, per company.
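The arithmetic behind these figures can be checked with a short sketch. The incident count, hours per incident, and 40% share come from the survey as reported here; the five-day work week used to recover the headline's "2 days per week" is an assumption, not something the report states.

```python
# Figures as reported in the survey summary.
incidents_per_month = 61
hours_per_incident = 13   # average time to identify and resolve one incident
percent_time_on_quality = 40

# Assumption (not stated in the report): a five-day work week.
work_days_per_week = 5

total_hours_per_month = incidents_per_month * hours_per_incident
days_per_week = percent_time_on_quality * work_days_per_week / 100

print(f"Hours lost per month: {total_hours_per_month}")
print(f"Days per week on data quality: {days_per_week}")
```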

However, the figure of 61 incidents represents only the incidents known to respondents.

"In the mid-2010s, organizations were shocked to learn that their data scientists were spending about 60% of their time just getting data ready for analysis," said Barr Moses, Monte Carlo CEO and co-founder. "Now, even with more mature data organizations and advanced stacks, data teams are still wasting 40% of their time troubleshooting data downtime. Not only is this wasting valuable engineering time, but it's also costing precious revenue and diverting attention away from initiatives that move the needle for the business. These results validate that data reliability is one of the biggest and most urgent problems facing today's data and analytics leaders."

Nearly half of respondent organizations measure data quality most often by the number of customer complaints their company receives, highlighting the ad hoc, and reputation-damaging, nature of this important element of modern data strategy.

The Cost of Data Downtime

"Garbage in, garbage out" aptly describes the impact data quality has on data analytics and machine learning. If the data is unreliable, so are the insights derived from it.

In fact, on average, respondents said bad data impacts 26% of their revenue. This validates and supplements other industry studies that have uncovered the high cost of bad data. For example, Gartner estimates poor data quality costs organizations an average of $12.9 million every year.

Nearly half said that, most or all of the time, business stakeholders are impacted by issues the data team doesn't catch.

According to the survey, respondents who ran at least three different types of data tests (for distribution, schema, volume, null, or freshness anomalies) at least once a week suffered fewer data incidents on average (46) than respondents with a less rigorous testing regime (61). However, testing alone was insufficient: stronger testing did not correlate significantly with a reduced impact on revenue or stakeholders.
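To make the five test types named in the survey concrete, here is a minimal sketch of what one check of each kind might look like. It is not taken from the report or from any Monte Carlo product; the `run_checks` function, the column names, and every threshold are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical expectations for an illustrative "orders" table.
EXPECTED_COLUMNS = {"order_id", "amount", "updated_at"}
MIN_ROWS = 3                        # volume threshold
MAX_NULL_FRACTION = 0.25            # allowed share of missing amounts
MEAN_AMOUNT_RANGE = (1.0, 1000.0)   # plausible range for the average amount
MAX_AGE = timedelta(hours=24)       # freshness threshold


def run_checks(rows, now):
    """Return a dict mapping each check name to True (pass) or False (fail)."""
    amounts = [r.get("amount") for r in rows if r.get("amount") is not None]
    stamps = [r.get("updated_at") for r in rows if r.get("updated_at") is not None]
    null_fraction = 1 - len(amounts) / len(rows) if rows else 1.0
    low, high = MEAN_AMOUNT_RANGE
    return {
        # Schema: every row has exactly the expected columns.
        "schema": all(set(r) == EXPECTED_COLUMNS for r in rows),
        # Volume: the batch is not suspiciously small.
        "volume": len(rows) >= MIN_ROWS,
        # Null: missing values stay under the allowed fraction.
        "null": null_fraction <= MAX_NULL_FRACTION,
        # Distribution: the mean amount falls in a plausible range.
        "distribution": bool(amounts) and low <= mean(amounts) <= high,
        # Freshness: the newest row is recent enough.
        "freshness": bool(stamps) and now - max(stamps) <= MAX_AGE,
    }


rows = [
    {"order_id": 1, "amount": 20.0, "updated_at": datetime(2022, 9, 1, 8, 0)},
    {"order_id": 2, "amount": None, "updated_at": datetime(2022, 9, 1, 9, 0)},
    {"order_id": 3, "amount": 35.5, "updated_at": datetime(2022, 9, 1, 10, 0)},
    {"order_id": 4, "amount": 12.0, "updated_at": datetime(2022, 9, 1, 11, 0)},
]

fresh = run_checks(rows, now=datetime(2022, 9, 1, 12, 0))
stale = run_checks(rows, now=datetime(2022, 9, 3, 12, 0))
print("Failed checks (same day):", [name for name, ok in fresh.items() if not ok])
print("Failed checks (two days later):", [name for name, ok in stale.items() if not ok])
```

Each check depends on a threshold someone had to anticipate and write down in advance, which is the coverage limitation the survey's authors point to.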

"Testing helps reduce data incidents, but no human being is capable of anticipating and writing a test for every way data pipelines can break. And if they could, it wouldn't be possible to scale across their always changing environment," said Lior Gavish, Monte Carlo CTO and co-founder. "Machine learning-powered anomaly monitoring and alerting through data observability can help teams close these coverage gaps and save data engineers' time."
