It is common knowledge that data is extremely important in Automation and AIOps implementations: good data yields good insights. Yet when the effort for an AIOps implementation is planned, the effort needed for some aspects of data is often completely overlooked. The most commonly overlooked aspects are:
■ Data source: identifying where the data will come from
■ Data volume: availability of the volume of data needed for the implementation
■ Data access: getting access to the data
■ Data storage: storage options for the data
It is somehow assumed that these are never an issue. In this write-up, we discuss why these aspects of data are critical and offer some guidelines on how to budget effort for them in Automation and AIOps implementations.
Fig1: Commonly overlooked aspects of Data in AIOps implementation
The Fallacy That Data Source Identification Is Never an Issue
This might seem like making a mountain out of a molehill, but it is often the reason that Automation and AIOps engagements stall.
Where are we going to get the kind of data the project needs to deliver its value?
Are all the data sources really identified?
Which departments in the organization own the data that the project needs?
These are some of the fundamental questions about data sources that should be asked at the start of an Automation and AIOps project. In one proof of concept for a Process Mining tool, getting transactional, time-stamped data was the key to the success of the engagement. We were never able to identify the source of such data, nor the owner of that data source in the organization. As a result, the entire proof of concept stalled.
But how often do we factor in the effort to search for data sources in an Automation and AIOps project?
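One lightweight way to keep these questions visible is a data source inventory that is reviewed at project kickoff. The sketch below is illustrative only: the `DataSource` fields, the example entries and the `unresolved_sources` helper are hypothetical names, not part of any AIOps tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSource:
    """One data need of the project (all fields are hypothetical)."""
    need: str                     # what the project needs the data for
    system: Optional[str] = None  # where the data lives, if known
    owner: Optional[str] = None   # owning department, if known


def unresolved_sources(inventory: list[DataSource]) -> list[str]:
    """Return the needs whose source system or owner is still unknown."""
    return [s.need for s in inventory if s.system is None or s.owner is None]


# Made-up entries for illustration.
inventory = [
    DataSource("incident tickets", system="ITSM tool", owner="Service Desk"),
    DataSource("time-stamped transaction logs"),
]
print(unresolved_sources(inventory))  # ['time-stamped transaction logs']
```

Any entry that comes back from such a check is a candidate for an explicit search task in the plan.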
The Myth of Tons of Data Buried Waiting to Be Excavated
A good volume of data is essential for any Automation and AIOps project, as the insights from the data are key to the success of the implementation. But it also means that the organization should have planned for that volume of data, retained it, or put in place data practices that keep the data updated periodically.
Often, this is not the case. If there is no specific regulation or guideline on data retention, the data is simply not available. Now combine this with a project that depends on that volume of data, and we have a catastrophe in waiting. In one proof of concept that I was involved in, for a Service Desk text analytics platform, the interaction transcripts were a critical input, and they were not readily available.
There was, however, a workaround to get part of the data we needed. It was a herculean task to have the Service Desk interaction transcripts specifically extracted from the text interaction tool so that they were usable in the project.
We could get only part of the data we wanted, as the rest was not available. Getting through all this took time, effort, and numerous confusing discussions. We had never anticipated this situation and hence had not planned for this effort as part of the AIOps project.
Seek and Ye Shall Get the Access to the Data — Conditions Apply!
Access to data, or restriction of it, is a function of organizational policies, security guidelines, and/or the regulations governing the data in the region where it originates. Even simple incident ticket data is subject to reasonable access restrictions. Bulk download access is a luxury in most Automation and AIOps projects. Where download access is available, getting the amount of data needed may still be a cumbersome manual exercise; where it is not, there is the struggle of identifying who can grant access.
Ironically, in one automation project we had to manually download hundreds of data points one by one, while also ensuring that the metadata was downloaded correctly with them. Access to data therefore cannot be taken for granted. There has to be careful planning around the nature of the data, the region where it originates, where it has to be accessed from, how it has to be accessed, and whether bulk access is available. Effort for these aspects must be woven into the project plan.
Put the Data in a Secure Place
It matters to customers and their contracts whether their data is stored in a private cloud, a public cloud, or on-premises infrastructure. If the customer is an organization subject to several regulations, it is unlikely that you can get away with storing the data just anywhere, even for a simple proof of concept of an automation tool. Ensure that data storage complies with what the regulations mandate and is located in regions where storage is permitted. If storage must be on-premises, procurement of the infrastructure, installation, and security validation must all be planned. The project should budget effort for this.
Bringing It All Together
It should be clear by now that the effort for these aspects of data (source, access, volume and storage) cannot be overlooked and must be budgeted for in the Automation and AIOps project plan. One of the best ways to do that is to plan a stream of tasks called "Data" alongside the other streams in the project (such as commercial, legal, process and people) and include these aspects of data as sub-streams.
Note that the effort required for these aspects of data may depend on one or more of the following parameters:
■ Location spread of the data in scope: The wider the location spread, the more likely it is that the data source owners are spread out too. It may be prudent to factor additional effort for data source and owner identification into the plan.
■ Data generated from legacy or in-house products: Where the data is generated by legacy tools or in-house developed tools, it may be good to budget some effort for ensuring that data access and the needed volume of data are available.
■ Data source ownership spread across vendors/providers: More effort must be budgeted for discussions with the vendors/providers who own the data sources and for the activities needed to get access to the data.
■ Regulations that the organization is subject to: The more heavily regulated the organization, the more focus data storage and data access will need.
■ Organization's maturity in the automation journey: If the organization is new to its automation journey, sufficient effort must be budgeted for identifying the right use cases and building the business case for them.
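The dependencies listed above can be sketched as a small lookup that flags which areas of the "Data" stream deserve extra effort. This is only an illustrative sketch: the `ProjectProfile` fields, the area labels and the `substreams_needing_extra_effort` function are hypothetical, and the mapping is one reading of the bullets above that deliberately assigns no effort figures.

```python
from dataclasses import dataclass


@dataclass
class ProjectProfile:
    """Hypothetical yes/no answers to the parameters listed above."""
    wide_location_spread: bool = False
    legacy_or_inhouse_tools: bool = False
    vendor_owned_sources: bool = False
    heavily_regulated: bool = False
    new_to_automation: bool = False


# Which areas each parameter touches, as one reading of the bullets above.
# The labels are illustrative, not a standard taxonomy.
PARAMETER_TO_AREAS = {
    "wide_location_spread": ["source"],
    "legacy_or_inhouse_tools": ["access", "volume"],
    "vendor_owned_sources": ["source", "access"],
    "heavily_regulated": ["storage", "access"],
    "new_to_automation": ["use cases and business case"],
}


def substreams_needing_extra_effort(profile: ProjectProfile) -> list[str]:
    """Return the sorted, de-duplicated areas to budget extra effort for."""
    flagged = set()
    for parameter, areas in PARAMETER_TO_AREAS.items():
        if getattr(profile, parameter):
            flagged.update(areas)
    return sorted(flagged)


profile = ProjectProfile(legacy_or_inhouse_tools=True, heavily_regulated=True)
print(substreams_needing_extra_effort(profile))  # ['access', 'storage', 'volume']
```

How much extra effort each flagged area needs is a judgment call for the project team; the sketch only makes the dependencies explicit.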
The table below offers sample guidance based on my experience of working on Automation and AIOps projects.