2025 DataOps Predictions - Part 1

As part of APMdigest's 2025 Predictions Series, industry experts offer predictions on how DataOps and related technologies will evolve and impact business in 2025.

2025: REAL-TIME DATA IS KEY FOR AI

Real-time data will be a key differentiator for competitive advantage: Industries will increasingly rely on real-time or near-real-time data to maintain a competitive edge. Companies that can integrate up-to-date data into their AI systems will provide superior customer experiences with fewer issues and more personalized solutions. The ability to capture and analyze data in real time will separate industry leaders from those who struggle to modernize their data infrastructure.
Ayman Sayed
CEO, BMC Software

Enterprises Will Augment GenAI with Real-Time Data: The true value of GenAI is realized when integrated into enterprise applications at scale. While enterprises have been cautious with trial deployments, 2025 will be a turning point as they begin to scale GenAI across critical systems like customer support, supply chain, manufacturing, and finance. This will require tools to manage data and track GenAI models, ensuring visibility into data usage. GenAI must be supplemented with specific real-time data, such as vectors and graphs, to maximize effectiveness. In 2025, leading vendors will begin rolling out applications that leverage these advancements.
Lenley Hensarling
Technical Advisor, Aerospike

MULTIMODAL DATA

Multimodal data will be very big, extracting corporate value: Back in 2004, Tim O'Reilly coined the phrase, "Data is the Intel Inside." We don't think quite as much about Intel these days, but Tim was absolutely right about data. We became obsessed with data. We've been talking about data science, being data-driven, and building data-driven organizations ever since. Artificial Intelligence is the current expression of the importance of data.

One problem with being data-driven is that most of any organization's data is locked up in ways that aren't useful. Being data-driven works well if you have nicely structured data in a database. Most companies have that, but they're also sitting on a mountain of unstructured data: PDF files, videos, meeting recordings, real-time data feeds, and more. They aren't even used to thinking of this as data; it's not amenable to SQL and database-centric "business intelligence."

That will change in 2025. It will change because AI will give us the ability to unlock this data as well as the ability to analyze it. It will be able to give structure to the information in PDFs, in videos, in meeting transcripts, and in raw data coming in from sensors. In his Generative AI in the Real World interview, Robert Nishihara asked us to think of the video generated by an autonomous vehicle. Most of that is of limited value — but every now and then, there's a traffic situation that is extremely valuable. Humans aren't going to watch hours of video to extract the value; that's a job for AI. Multimodal AI will help companies to unlock the value of data like this. We're at the start of a new generation of tools for data acquisition, cleaning, and curation that will make this unstructured data accessible.
Laura Baldwin
President, O'Reilly Media

AI DRIVES NEW FOCUS ON DATA QUALITY

AI will renew the focus on data quality, for two reasons: First, high-quality data is required for training and fine-tuning models. Second, AI-powered analytics tools will offer a higher-resolution view of data, revealing previously undetected quality issues.
Ryan Janssen
CEO, Zenlytic

Enterprises that ready their data for AI will pull ahead competitively: In 2025, companies will focus on building an organized, high-quality data ecosystem to maximize AI's effectiveness and to pull ahead of their competition. This includes managing metadata through structured data catalogs, ensuring data accuracy with rigorous cleansing and validation, and establishing robust governance practices to safeguard data privacy and security. By implementing clear, ethical guidelines, organizations will create a trustworthy AI framework, empowering data scientists with easy access to reliable data for generating precise, impactful insights across business functions. Enterprises that do this will be hard to compete with. 
Scott Voigt
CEO and Founder, Fullstory
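
As an illustration of the "cleansing and validation" step mentioned above, here is a minimal sketch in Python. It is not taken from any contributor's product; the function name `validate_records`, the `ValidationReport` type, and the `id` and `email` field names are all hypothetical. It assumes records arrive as plain dictionaries and checks only two things: required fields are present and non-blank, and no `id` repeats.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationReport:
    """Hypothetical container for the outcome of one validation pass."""
    valid: list = field(default_factory=list)
    rejected: list = field(default_factory=list)  # (record, reasons) pairs


def validate_records(records, required_fields):
    """Split records into valid and rejected, noting why each rejection happened."""
    report = ValidationReport()
    seen_ids = set()
    for record in records:
        reasons = []
        # A required field fails if it is absent, None, or a blank string.
        for name in required_fields:
            value = record.get(name)
            if value is None or (isinstance(value, str) and not value.strip()):
                reasons.append(f"missing {name}")
        # A record whose id was already accepted is treated as a duplicate.
        record_id = record.get("id")
        if record_id is not None and record_id in seen_ids:
            reasons.append("duplicate id")
        if reasons:
            report.rejected.append((record, reasons))
        else:
            seen_ids.add(record_id)
            report.valid.append(record)
    return report


if __name__ == "__main__":
    sample = [
        {"id": 1, "email": "a@example.com"},
        {"id": 2, "email": "  "},
        {"id": 1, "email": "b@example.com"},
    ]
    report = validate_records(sample, ["id", "email"])
    for record, reasons in report.rejected:
        print(record, reasons)
```

Keeping the rejection reasons alongside each record, rather than silently dropping bad rows, is one way to make quality issues visible to whoever owns the data catalog.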

AI DRIVES DATA PIPELINE AUTOMATION

GenAI and as-code first technologies drive data pipeline automation: The ubiquitous use of Kubernetes has led to a configuration-first experience in defining data pipelines. It's as simple as selecting a container image and adding configuration. We'll increasingly see GenAI, trained on processing and execution engines, generating this configuration and deploying pipelines automatically through natural language prompts alone. Traditional visual ETL tooling, and even low-code platforms, are now at risk of disruption. What a power user could do in a few days (remember, you still need to learn these platforms), GenAI does in seconds, spitting out configuration for real-time pipelines. This leads to a question: What is the wider future of any UX if my interface is a prompt? Just view data results and metrics? Engineers may as well be going back to a command line!
Andrew Stevenson
CTO, Lenses.io
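
To make "selecting a container image and adding configuration" concrete, the following Python sketch builds a Kubernetes Job manifest from exactly those two inputs. It is an illustration only, not how Lenses.io or any GenAI tool works: the helper `build_pipeline_job`, the image name, and the configuration keys are hypothetical, and passing configuration as environment variables is just one assumed convention. The manifest fields themselves (`apiVersion: batch/v1`, `kind: Job`, the pod template with `containers` and `restartPolicy`) follow the standard Job resource.

```python
import json


def build_pipeline_job(name, image, config):
    """Return a Kubernetes Job manifest (as a dict) for one pipeline step.

    The pipeline's configuration is passed to the container as environment
    variables, sorted by key so the output is stable between runs.
    """
    env = [{"name": key, "value": str(value)} for key, value in sorted(config.items())]
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    "containers": [{"name": name, "image": image, "env": env}],
                    "restartPolicy": "Never",
                }
            }
        },
    }


if __name__ == "__main__":
    manifest = build_pipeline_job(
        name="orders-enrich",                     # hypothetical job name
        image="registry.example.com/enrich:1.0",  # hypothetical image
        config={"SOURCE_TOPIC": "orders", "SINK_TOPIC": "orders-enriched", "BATCH_SIZE": 500},
    )
    print(json.dumps(manifest, indent=2))
```

Because the whole pipeline step reduces to a small structured document, it is plausible for a model to generate it from a prompt; the printed JSON could then be reviewed and applied like any other manifest.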

AI-ENHANCED DATA MANAGEMENT AND GOVERNANCE

AI is changing how companies manage and govern their data. Organizations now use data lakehouses to support data scientists and AI engineers working with large language models (LLMs). These lakehouses simplify data access, helping teams avoid juggling multiple storage systems. AI is also helping to automate manual processes like data cleaning and reconciliation—a pain point for many professionals. As AI continues to scale, automated governance will allow companies to manage data more effectively with less manual work.
Emmanuel Darras
CEO and Co-Founder, Kestra

UNIFIED DATA ACCESS AND FEDERATION

A unified approach to data access is high on the agenda for enterprises that plan to consolidate analytics data into a single, accessible source. Data lakehouses support this by providing federated access, allowing teams across the organization to tap into the same data without duplicating it. This approach is expected to drive cross-functional analytics and reduce latency, making it easier for teams to work together on the same shared data.
Emmanuel Darras
CEO and Co-Founder, Kestra

TRUST IN DATA

Establishing trust in data will become the top priority for leaders: In the AI era, data is no longer just a byproduct of operations; it's the foundation for resilience and innovation. Without strong trust in the data they have and use, businesses will continue to struggle to make informed decisions or leverage emerging technologies like AI. Building this trust will go beyond technology and require leaders to boost data literacy and choose a data strategy that emphasizes both capability and quality.
Daniel Yu
SVP, SAP Data and Analytics

DATA LABELING

Microscopic lens on the source of data labeling: In technical circles, there are constant discussions around how to get the right dataset — and in turn, how to label that dataset. The reality is that this labeling is outsourced on a global scale. In many cases, it's happening internationally, and often in developing countries, with questionable conditions and levels of pay. You may have task-based workers assessing hundreds of thousands of images and being paid for the number accurately sorted. While AI engineers may be highly in demand and paid well above the market rate, there are questions about this subeconomy.
Gordon Van Huizen
SVP of Strategy, Mendix

EXTENSIVE DATA SETS

Retaining Extensive Data Sets Will Become Essential: GenAI depends on a wide range of structured, unstructured, internal, and external data. Its potential relies on a strong data ecosystem that supports training, fine-tuning, and Retrieval-Augmented Generation (RAG). For industry-specific models, organizations must retain large volumes of data over time. As the world changes, relevant data becomes apparent only in hindsight, revealing inefficiencies and opportunities. By retaining historical data and integrating it with real-time insights, businesses can turn AI from an experimental tool into a strategic asset, driving tangible value across the organization.
Lenley Hensarling
Technical Advisor, Aerospike

SMALL DATA

The past few years have seen a rise in data volumes, but 2025 will shift the focus from "big data" to "small data." We're already seeing this mindset shift with large language models giving way to small language models. Organizations are realizing they don't need to bring all their data to solve a problem or complete an initiative; they need to bring the right data. The overwhelming abundance of data, often referred to as the "data swamp," has made it harder to extract meaningful insights. By focusing on more targeted, higher-quality data (the "data pond"), organizations can ensure data trust and precision. This shift towards smaller, more relevant data will help speed up analysis timelines, get more people using data, and drive greater ROI from data investments.
Francois Ajenstat
Chief Product Officer, Amplitude

Go to: 2025 DataOps Predictions - Part 2
