EMA has just completed research titled Unifying IT for Digital War Room Performance. The research was partly inspired by current debates about the role of the "war room" and how it is or is not evolving. Some seem lost in fantasy: "the war room will absolutely disappear." For others, basic incident handling is just emerging, and a more defined and effective war room team remains a hope for the distant future.
The Industry Debate
As with so much in our industry, a lot of this debate depends on meaning and definition — or in this case how you do or don't define "war room." War rooms are often defined as disastrous assemblages of finger-pointing adults caught up with siloed versions of "the truth" — all at least as interested in proving that their teams are not guilty, as they are in actually solving the problem at hand.
However, for our research we took a much more open-ended approach. Our goal was to find out how teams are being formed and optimized to handle major incidents and problems that require cross-domain insights. This included, by the way, proactive cross-domain teams for managing issues before they become the IT equivalent of life-threatening. Our war rooms could be either physical or virtual. Highly automated or not. Made up of consistent, well-defined teams, or not. But what made them war rooms was the need for collaborative decision making across silos, and the need for urgency in taking effective action.
War Room Processes
Throughout the research, EMA examined the most critical processes logically relevant to war room performance. These included:
Initial awareness: alerting the relevant stakeholders that something is, or is about to be, a problem
Response team engagement: making sure relevant stakeholders have an informed context for working together to resolve the problem
Triage and diagnostics: finding out what's really wrong in a clear service-impact context
Remediation: actually fixing the problem, ideally with built-in levels of automation to support the fix
Validation: ensuring that the "fix" really is a fix
Ideally, also, a history has been kept so that IT can move to prevent the problem in the future, or at least bring it to ever speedier resolution. We asked respondents about this in the context of auditing war room performance.
The War Room's Multiple Dimensions
We also looked at cloud to see if public and private cloud initiatives were making things easier or harder in the war room and why. (What we saw is a little bit of both.)
And then there's DevOps and agile. One of the industry hallucinations seems to be that DevOps and agile are making the war room disappear. What we found is just the opposite in the vast majority of cases (well over 80%). We looked, as well, at how development is working as an integrated part of the digital war room phenomenon, and the impact of in-house applications on war room processes.
And then of course there's security. Or maybe security should come first. In fact, security information and event management (SIEM) was right at the top of digital war room technology priorities, along with advanced IT analytics. The growing need for handshakes between operations, security and ITSM teams in the digital war room was evident throughout our data.
Looking at all of the above, you might say that incidents and problems are increasingly non-denominational in how they occur. In other words, digital war rooms are no longer (if they ever were) just about operations in a vacuum.
Technologies, Metrics and Success
As mentioned above, analytics and security were the big winners when we looked at digital war room technology priorities. In fact, the top-ranking five were:
1. Advanced IT analytics or AIOps
2. Security information and event management (SIEM)
3. Security threat intelligence analysis
4. Endpoint instrumentation and analytics
5. IT process automation
The top two technical metrics were performance latencies and end user experience management.
And the top three obstacles to digital war room success were security-related issues, inconsistent data, and data fragmentation.
Overall, we saw that the digital war room is becoming more, not less, important: it is growing in size, becoming more proactive, and becoming fundamentally more strategic.
To get a lot more insight, please watch my on-demand EMA webinar.
Read my next blog, Organization and Process (Or Lack Thereof) in the Digital War Room