The ability to ensure that business services meet customer needs has never been more critical or more challenging. End-users have ever higher expectations, as well as more visibility into failure, thanks to social media and technology adoption.
The Data Analysis Challenge
The IT that supports critical business services has grown tremendously in size and complexity as new technology is adopted to meet changing business needs. Many IT organizations are no longer wholly responsible for all the components that business services rely on and employ third-party services and content providers that reside outside their firewall. In fact, a study of critical business services for 3,000 enterprises shows that the average service depends on data from more than ten different hosts.
Additionally, applications are becoming increasingly dynamic. Outsourced components and services might be interchanged as part of the normal course of a day. Our study shows that over the course of 24 hours, 42 percent of transactions will depend on services emanating from at least 6 data centers, all invoked directly from the client or consumption point. In 8 percent of transactions, services will be delivered from 30 different data centers or more.
Managing business services and their infrastructures is more difficult than ever. Processing is distributed, occurring within the data center in physical, virtual and hybrid environments; in shared third-party environments delivering specialized outsourced components; and on increasingly powerful end-user clients. Cloud computing, which promises improved IT efficiency and flexibility as well as simplified service provisioning, also increases IT service complexity.
Traditionally, the approach to business service management has been to leverage a discovery process to populate a configuration management database, which is then used to group various IT components by the business services they support. Data from disparate monitoring tools, typically alert data, is then correlated to help understand how those IT systems support the business service.
However, this approach is fundamentally flawed in modern IT environments. These techniques are not designed to address the constant change that occurs across the entire service delivery chain and are less useful in cases of highly shared infrastructure.
In today’s dynamic IT environments, setting thresholds for the various monitoring points in the infrastructure becomes practically impossible. When thresholds are set manually, they will be either too generous to pick up performance issues or so stringent that the monitoring solutions fire a sea of alerts. A new approach is required to ensure that IT can meet constantly changing business needs.
Bringing Metrics and Business Services Together
Most IT environments have more monitoring data than they know what to do with, but few if any of these metrics can report on what really matters: how the core business services are being supported. Ultimately, stakeholders need to have enough relevant information to be able to take action before the business is impacted. The key is identifying irregular patterns and abnormal behavior of the overall business service or its underlying components.
Relevant metrics should be tied to how business success (or failure) is measured. Examples of measurable business outcomes include the number of impacted users, up-to-the-minute revenue, conversion rates, number of orders, and number of page views.
More importantly, these metrics should not be viewed in isolation. They need to be viewed in the context of all of the more technical IT metrics so that ‘leading indicators’ can be identified – internal conditions and combinations of factors that may lead to a later business impact if not corrected.
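One way to make the idea of a leading indicator concrete is to check whether a technical metric moves ahead of a business metric by some number of samples. The sketch below is only an illustration under assumed inputs: the function names, the choice of Pearson correlation and the lag search are not taken from any particular product, and a strong correlation is a hint to investigate rather than proof of cause.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def best_lead(it_metric, business_metric, max_lag):
    """Find the lag (in samples) at which the IT metric best anticipates the business metric.

    Both series are assumed to be sampled at the same interval and to have the
    same length. Returns (lag, correlation) for the strongest absolute correlation.
    """
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        xs = it_metric[:len(it_metric) - lag]   # earlier IT samples
        ys = business_metric[lag:]              # later business samples
        if len(xs) < 2:
            break  # not enough overlap left to correlate
        r = pearson(xs, ys)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best
```

With made-up data in which database latency spikes two samples before orders per minute drop, `best_lead` would report a lag of 2 with a strongly negative correlation, which marks latency as a candidate leading indicator for lost orders.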
Understanding performance and usage patterns and establishing a "normal" behavior pattern or profile is essential in detecting subtle anomalies. Predictive analytics provides insight into which conditions in a highly complex IT environment should be considered normal and acceptable and, in contrast, which events and conditions may lead to service level degradation. It is also vital that these metrics be source-agnostic, meaning that they can be collected from existing monitoring tools and leveraged in the context of end user performance.
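A "normal" profile can be as simple as a per-hour mean and standard deviation learned from history, with new readings flagged when they stray too far from the value expected for that hour. The following is a minimal sketch under that assumption; the function names and the three-sigma rule are illustrative choices, and real predictive analytics would typically use richer models.

```python
from collections import defaultdict
from statistics import mean, stdev


def build_profile(history):
    """Learn a baseline from (hour_of_day, value) samples.

    Returns {hour: (mean, standard deviation)} for hours with at least two samples.
    """
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)
    return {h: (mean(v), stdev(v)) for h, v in by_hour.items() if len(v) >= 2}


def is_anomalous(profile, hour, value, k=3.0):
    """True when the value is more than k standard deviations from that hour's mean."""
    if hour not in profile:
        return False  # no baseline yet, so nothing to compare against
    mu, sigma = profile[hour]
    return abs(value - mu) > k * max(sigma, 1e-9)
```

The point of the profile is that the same reading can be normal at one time and abnormal at another: a request rate that is routine at 14:00 may be a clear anomaly at 03:00, something a single static threshold cannot express.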
“What-if” scenarios can help organizations identify areas where IT resources can be used to address abnormal situations or improve the business service. Predictive analytics capabilities can be made even more powerful by leveraging the aggregate performance data of an entire customer base. This insight, which we call “Collective Intelligence,” can feed real-time health and performance data to a supplier catalog.
This information allows an organization to look beyond its walls by gauging the overall performance of a third-party supplier that it shares with other customers and quickly identify whether the fault lies with the supplier.
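The comparison described above can be sketched as a simple rule: if both the organization and the supplier's other customers see degraded performance, the supplier is the likely source; if only the organization does, the problem is probably local. The function name, the use of a median, and the 1.5x tolerance are assumptions made for illustration, not a description of how any supplier catalog actually works.

```python
from statistics import median


def locate_fault(my_latency_ms, peer_latencies_ms, supplier_baseline_ms, tolerance=1.5):
    """Guess where a slowdown originates for one shared third-party supplier.

    my_latency_ms        -- latency this organization currently observes
    peer_latencies_ms    -- latencies observed by other customers of the supplier
    supplier_baseline_ms -- the supplier's usual latency
    """
    limit = tolerance * supplier_baseline_ms
    i_am_slow = my_latency_ms > limit
    peers_slow = median(peer_latencies_ms) > limit
    if i_am_slow and peers_slow:
        return "supplier"  # everyone is affected
    if i_am_slow:
        return "local"     # only this organization is affected
    return "healthy"
```

Using the median of peer observations keeps one badly placed peer from skewing the verdict.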
These capabilities can be further extended to perform ‘what-if’ scenarios such as:
What if I change my supplier mix?
What if I move IT services to the cloud?
What if I get an unexpected surge in traffic?
Organizations can leverage analytics as well as a supplier catalog to make intelligent decisions on how to optimize the entire application delivery chain. This can include changes to components that are under the enterprise’s control (e.g. improving resources on a particular VM), as well as leveraging the supplier catalog and price/performance comparisons to ensure an optimal solution. For example, the mix of content delivery networks could be adjusted based on factors such as geographic location, traffic volumes, performance and cost of the service.
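As a rough illustration of the content delivery network example, the sketch below picks, for each region, the cheapest supplier that meets a latency target and falls back to the fastest one when none does. The `Offer` fields, the supplier names and the selection rule are all hypothetical; a real decision would also weigh traffic volumes and contract terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    supplier: str       # hypothetical supplier name from the catalog
    region: str
    latency_ms: float   # measured performance in that region
    cost_per_gb: float


def pick_mix(offers, max_latency_ms):
    """Choose one supplier per region: cheapest within the latency target, else fastest."""
    by_region = {}
    for offer in offers:
        by_region.setdefault(offer.region, []).append(offer)

    mix = {}
    for region, candidates in by_region.items():
        within_target = [o for o in candidates if o.latency_ms <= max_latency_ms]
        if within_target:
            chosen = min(within_target, key=lambda o: o.cost_per_gb)
        else:
            chosen = min(candidates, key=lambda o: o.latency_ms)
        mix[region] = chosen.supplier
    return mix
```

Rerunning `pick_mix` with a different latency target or with updated catalog measurements is one small way to explore a "what if I change my supplier mix?" question.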
If organizations truly want to support key business processes with IT services, they need to first understand how these systems support business needs and then optimize the entire service delivery chain to support these business outcomes. An approach that starts with business outcomes and works back to correlate how all the IT metrics relate to meeting that outcome will bring success. It is also no longer good enough to be fast at fixing problems – it is now vital to be able to prevent them as well.
About Imad Mouline
Imad Mouline is Chief Technology Officer (CTO) of Compuware's APM Solution. He is a veteran of software architecture and R&D and a recognized expert in web application architecture, development and performance management. His areas of expertise include Cloud Computing, Software-as-a-Service, and mobile applications. As Compuware's CTO of APM, Mouline leads the expansion of the company's product portfolio and market presence. Imad is a frequent speaker at various user conferences and technology events (e.g., Velocity, All About the Cloud, Interop Las Vegas and Think Tank). He has also participated in executive conferences such as the InfoWorld CTO Forum and serves on the advisory board for the Cloud Connect conference.