
xMatters Releases Platform Advancements

xMatters announced new adaptive incident management feature advancements that provide increased automation across each stage of the incident management lifecycle – diagnosis and collaboration, resolution and post-incident learning.

New features give DevOps and SRE teams the ability to collaborate across the enterprise, streamlining and automating issue resolution. Technology teams can also benefit from continuous improvement throughout the incident management lifecycle to prevent incident recurrences – leaving more time for product innovation.

“Incident response automation and incident management is quickly evolving to keep pace with digital operations teams. Enterprises need a better way to assess, prevent, respond, solve and learn from technical issues and interruptions,” said Troy McAlpin, xMatters CEO. “Easy-to-build and use automation is a precedent step toward an adaptive approach to incident management. Continuous improvement of traditional ITIL or ITSM practices toward SRE, data-driven and automated approaches are much more effective. If you want to deliver an ‘always on’ customer experience you first have to deliver automation with continuous learning to avoid, prevent and resolve impacts.”

Adaptive incident management solves the challenge of responding to service interruptions across teams, cultures and systems by allowing teams to scale up and down based on changing conditions to efficiently manage incidents of all sizes, impacts and severities. New xMatters platform advancements further enable this adaptive approach by applying increased automation to the phases within the incident management lifecycle.

The following innovations address incident management challenges that technology teams face:

- Diagnosis and Collaboration: New ChatOps integrations offer time-saving ways to engage with incidents and collaborate across teams. With the app for Slack and the new xMatters capabilities in Microsoft Teams, DevOps and SRE teams can remain in their app of choice when declaring an incident and also access incident details for easier collaboration and seamless incident resolution. For on-the-go teams, Android and iOS mobile apps make it possible to manage incidents regardless of location.

- Resolution: To facilitate automated incident response, reduce downtime and improve the overall customer experience, new xMatters Incident Resolution Templates and walk-through guides pre-package workflows for easy customization. Additionally, the new merge step functionality within xMatters Flow Designer reduces the effort and complexity of creating workflows. Teams can now reuse parts of their workflows without losing the valuable context that each unique path provides.

- Post-incident Learning: New purpose-built processes guide teams to create an informed postmortem and thorough record of an incident. With a single press of a button, the new Post-incident Report Builder lets teams review the incident’s impact and timeline, the root cause and contributing factors, and the actions taken to mitigate and resolve it. Teams can also document and assign follow-up tasks, and create reports that drive learning, continuous improvement and the prevention of recurrences.

These advancements are powered by the architecture of the xMatters platform. This includes Flow Designer for workflow automation; application-agnostic integrations that can be used to build powerful toolchains; robust data capture capabilities that allow teams to create comprehensive post-incident reports; sophisticated on-call management, reporting and group definitions; and configurable dashboards for visual incident and team performance tracking.

