Q&A Part Two: IBM Talks About Predictive Analytics

Pete Goldin
APMdigest

In Part Two of APMdigest's exclusive interview, Matthew Ellis, IBM Vice President of Service Availability and Performance, discusses predictive analytics.

Click here to start with Part One of the Q&A with IBM VP Matthew Ellis.

APM: Why is predictive analytics gaining so much momentum recently, especially with respect to APM?

ME: Analytics is important to all phases of operations. In all areas of business, it is axiomatic that more data enables better decisions, and operations and application management are no exceptions.

Just as important, however, is sorting that data to identify the critical context for decision makers to act on, and this is where analytics comes in.

IBM is investing in analytics very seriously, and from an operations management perspective, we apply analytics in three categories: Simplify Operations Management, Avoid Business Disruption, and Enable Optimization.

Simplify Operations Management is a class of analytics technology that enables our customers to do the work that they do today more easily. This includes historical analysis of data to recommend and establish dynamic thresholds, and trending of performance and capacity data to identify areas that may become bottlenecks based on historical behavior.
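The interview does not spell out how such dynamic thresholds are derived, but the underlying idea is straightforward to sketch. The Python below is a hypothetical illustration, not IBM's implementation; the mean-plus-k-sigma rule and the `dynamic_thresholds` name are assumptions. It computes a separate upper threshold for each hour of the day from historical samples, so the limit tracks the normal workload pattern rather than being one static value:

```python
import numpy as np

def dynamic_thresholds(history, k=3.0):
    """Derive a per-hour upper threshold from historical samples.

    history: dict mapping hour-of-day (0-23) to past metric values
             observed at that hour on previous days.
    k:       number of standard deviations tolerated above the mean.
    """
    thresholds = {}
    for hour, values in history.items():
        mean, std = np.mean(values), np.std(values)
        # The limit tracks each hour's normal workload level instead
        # of applying one static ceiling to the whole day.
        thresholds[hour] = mean + k * std
    return thresholds

# Example: noon traffic is normally high, 3am traffic normally low,
# so each hour gets its own alerting threshold.
history = {12: [950, 1010, 990, 1020], 3: [80, 95, 90, 85]}
print(dynamic_thresholds(history))
```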

Avoid Business Disruption is the key driver for the predictive analytics component. The goal is to identify, early, environmental changes that indicate a significant shift in the behavior of an application or service, and to bring this information to the attention of the operations management team so that problems can be identified and addressed before they ever impact a customer. We have identified emerging problems days before traditional management tools saw signs of trouble and, in some situations, discovered problems in unmonitored resources that were affecting the behavior of critical applications.

Enable Optimization is the ability to mine collected data across multiple dimensions, yielding rich insight that drives optimization of services and applications. It is also known as business analytics.

APM: What specific functionality should an organization look for in predictive analytics technology?

ME: At IBM, we believe there are three key capabilities that any analytics solution must have to provide maximum predictive capability:

1. Algorithms: Multivariate analytic techniques are critical to identifying emerging problems early, while all metric data is still well within its normal range.

The key to this statistical approach is to monitor the relationships among important related metrics and raise an exception when those relationships change in significant ways. Any single metric displays a wide range of variability during a normal day, increasing and decreasing with changing workloads and with daily, weekly and seasonal patterns.

In a healthy system, however, related metrics will generally track each other consistently. Successfully identifying these relationships, and accurately determining when they diverge in an important way, is the key to accurate early identification of problems.

Our algorithms are developed and refined by one of the largest private math departments in the world, the same organization that developed Watson to win at Jeopardy.
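The underlying statistical idea, learning how related metrics normally move together and alerting when that relationship breaks, can be illustrated with a minimal sketch. The Python below is a hypothetical simplification, not IBM's algorithm: it learns a single linear relationship between two metrics from healthy-period data and flags an exception when a new observation falls too far from it (the function names and the 4-sigma cutoff are illustrative assumptions):

```python
import numpy as np

def fit_relationship(x, y):
    """Fit a linear relationship y ~ a*x + b from healthy-period data."""
    a, b = np.polyfit(x, y, 1)
    residuals = y - (a * x + b)
    return a, b, np.std(residuals)

def relationship_diverged(a, b, resid_std, x_now, y_now, k=4.0):
    """Flag an exception when an observed pair breaks the learned
    relationship, even if both metrics are individually in range."""
    residual = y_now - (a * x_now + b)
    return abs(residual) > k * resid_std

# Example: requests/sec and CPU% normally move together.
rps = np.array([100, 200, 300, 400, 500], dtype=float)
cpu = np.array([11, 21, 29, 41, 50], dtype=float)
a, b, s = fit_relationship(rps, cpu)
# CPU of 80% at 300 rps is within CPU's normal daily range,
# but far off the learned rps/CPU relationship -> early warning.
print(relationship_diverged(a, b, s, 300.0, 80.0))  # True
```

Here a CPU reading of 80% is well within that metric's normal daily range, yet far off the learned requests-to-CPU relationship, which is exactly the kind of early signal described above.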

2. Scalability: Analytics solutions work better when they have more data on which to base their conclusions. The IBM analytics solutions directly leverage proven data collection technologies that have been in use for the better part of a decade and have been continually refined. This technology has been proven to collect millions of data points per second and deliver them to the analytics engine with very low latency, enabling real-time evaluation of very large data streams. We believe the data collection technology we are using is the most scalable and highest-performing in the industry.

3. Breadth of Monitored Resources: One of our design requirements was to deliver an easily extensible mediation capability allowing customers (or our services teams) to connect any data source to our data collection solution in a matter of hours or days.

During our pilot, we worked with many products from non-IBM vendors, and our team found that almost all data integration work can be done in a very short time without ever requiring a visit to the customer site, saving time and money while maximizing data availability for analysis.
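The mediation layer itself is not described in detail here, but a plausible shape for such an extensible capability is a small adapter interface that every data source implements. The sketch below is purely illustrative; the `DataSourceAdapter` and `CsvExportAdapter` names and the sample format are assumptions, not IBM's API:

```python
import csv
from abc import ABC, abstractmethod
from typing import Iterator, Tuple

class DataSourceAdapter(ABC):
    """Interface every mediation adapter implements, so any source,
    IBM or third-party, plugs into a single collection pipeline."""

    @abstractmethod
    def poll(self) -> Iterator[Tuple[str, float, float]]:
        """Yield (metric_name, timestamp, value) samples."""

class CsvExportAdapter(DataSourceAdapter):
    """Hypothetical adapter that reads metrics from a CSV export
    produced by a third-party monitoring tool."""

    def __init__(self, path: str):
        self.path = path

    def poll(self) -> Iterator[Tuple[str, float, float]]:
        with open(self.path) as f:
            for row in csv.DictReader(f):
                yield row["metric"], float(row["ts"]), float(row["value"])

def collect(adapters):
    """Fan samples from all registered adapters into one stream
    feeding the analytics engine."""
    for adapter in adapters:
        yield from adapter.poll()
```

Under this shape, supporting a new vendor's data source means writing one small adapter class rather than reworking the collection pipeline, which is consistent with the hours-or-days integration times described above.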

APM: How do you see predictive analytics evolving over the next few years?

ME: IBM expects that analytics tools, and the organizations that use them, will evolve rapidly over the next few years. We are investing heavily in providing highly scalable, flexible, and robust systems that identify emerging problems as early as possible.

We expect analytics to evolve along multiple dimensions:

1. Improved data exchange between analytics learning solutions and existing application and service discovery, topology, and CMDB data, combining the strengths of traditional IT tools with analytics learning. This will accelerate the statistical learning process and allow the learned relationships to be built back into the visible topology of the environment.

2. Application of analytics to additional IT management domains, including Smarter Infrastructures, improved detection of security problems, asset management, and maintenance scheduling, among other problem domains.

3. Tighter feedback and integration between operations processes and learning technologies, process optimization, and analytics in general.

About Matthew Ellis

Matthew Ellis is the Vice President of Development for Tivoli's Service Availability & Performance Management product portfolio at IBM. This product suite enables monitoring and modeling the utilization, performance, capacity and energy use of distributed, mainframe and virtualized platforms and associated application software. Ellis joined IBM in 2006 through the Micromuse acquisition, where he was the Vice President of Software Development.

Click here to read Part One of the Q&A with IBM VP Matthew Ellis.
