Virtualization has answered the call to control the rising tide of IT infrastructure costs, allowing IT to make changes and deploy new applications or updates more easily and quickly than ever before. Today, hardware savings and more efficient management of pre-production and production environments are broadening virtualization's applicability and driving its widespread adoption.
Yet the benefits of pervasive virtualization and tiered virtual infrastructure come with some harsh realities. As the number of virtual servers explodes relative to physical servers, today's IT infrastructures are increasingly complex, costly and cumbersome to manage. It seems everyone is putting everything on its own server. While this saves the hardware cost of each of those servers, the savings can be eroded by the cost of managing all of these VMs, a whole new effort spawned by virtualization. In every virtualization implementation, you still need to manage, monitor and make configuration changes to the physical environment, the virtualization layer and the content of the guest systems.
Poorly managed virtual environments can put the performance and availability of business services at risk, particularly in the area of change management and configuration management.
The following are four major challenges to virtualization:
1. Complexity due to added virtualization layer
Anyone who works in the "virtual data center" realizes the complexity of virtual environments. Not only do they include all the traditional complexities familiar from managing physical data centers, but they also have the added dimensions of mobility, increased volume of servers and a wider mix of configurations.
In virtualization implementations, the number of virtual machines (VMs) is expected to grow rapidly, from around 20 per physical server in a typical implementation today to 100 by 2016. Given this growth, running multiple VMs and the hypervisor that provides VM functions on a single physical server makes the overall operating environment more complex, and makes it much more difficult to identify the cause of problems when performance drops or an availability issue arises. IT operations is confronted with the question of where to look for problems: in the physical infrastructure, in the virtual infrastructure, or in the environment running inside the VMs?
Virtualization technology adds more elements to the environment stack. When incidents occur, root cause analysis takes more time and effort because it has to penetrate these layers.
Because VMs can be switched in and out of operation or moved between physical servers, it is challenging to monitor and analyze performance accurately. Issues also arise inside guest systems that IT teams see only as a black box, which makes identifying them a bottleneck.
2. Limited VM Content Visibility
For IT infrastructure management, lack of visibility is a cause for concern because it undermines the understanding of IT service performance. Performance issues generally arise from a combination of ineffective or incorrect handling of guest system configuration and changes or failures in the underlying virtual infrastructure. The sheer volume of environmental information hides the real issues. Because virtual machines encapsulate application and infrastructure configurations, the virtualized infrastructure becomes a hard-to-penetrate 'black box', and IT teams struggle to identify, track and validate the configuration and changes of each virtual machine.
In many deployments, the lack of visibility into virtual data center performance may not become apparent until it's too late, for example when a major performance problem hits a mission-critical business application. It often takes a crisis before IT departments realize the consequences of this lack of visibility. Without the required visibility into the virtual infrastructure, it is impossible to detect and investigate problems, take corrective action and resolve issues. As a result, incidents become more difficult to diagnose and resolve, and environment drift becomes inevitable.
With VMs deployed through virtualized image management, managers rely on a library of standardized images. However, standardization can only realistically be achieved by careful management of the library of virtual images based on a thorough understanding of the content of each virtual image.
While a common operating system image is still easy to provision, relying on the image library becomes more complicated and less dependable when provisioning more complex entities such as application servers.
Because images are depersonalized and the location- and environment-specific information is added automatically during deployment, virtual images are portable across various configurations in the enterprise. This portability exposes image management to many more costly challenges. Without knowing how an image is configured to work with other images, IT can miss key information about the interrelationships between the various pieces in heterogeneous setups, and the image drifts over time.
This is evident in critical areas of IT management:
- Consistency Management - With setup carried out from an image, many believe that there can’t be any inconsistency issues. Yet without an understanding of image content, virtualization can create gaps in consistency, and without a means for automatically identifying such gaps, consistency management is further challenged.
- Continuity Management - Automated configuration and deployment solutions make it easy for IT and business users to create servers on the fly. However, the downside is increased continuity challenges. IT organizations need to bridge the gap between continuity requirements and the underlying virtual infrastructure. Virtualization has driven up the rate of change in the data center. Even the disaster recovery image may not be reliable: when the DR environment is created as a straight copy, the DR manager does not know which changes actually made it into the DR environment.
- Release Management - Virtualization may give IT confidence in managing releases, suggesting that they only need to check the image in a performance testing environment. Nevertheless, problems still turn up in production, because in virtualization the released image is not replaced all at once but is released incrementally. An inherent disparity arises between the virtual image in testing and what was deployed to production, meaning the established virtual images can no longer be relied on for recovery. Release validation is critical for keeping virtual environments performing: it compares and verifies what was checked and changed, and identifies any changes that occurred after release.
Tracking and keeping up with changes helps organizations understand how best to take advantage of virtualization's full capabilities, as well as its potential limitations.
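As an illustration of the release validation idea described above, the following is a minimal sketch that compares a configuration snapshot of the image checked in testing with a snapshot of the VM running in production. The snapshot format (a flat mapping of setting names to values), the function name diff_snapshots and all the sample settings are assumptions made for this example; they do not describe any particular product.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot format: flat mapping of setting name -> value.
Snapshot = Dict[str, str]


def diff_snapshots(tested: Snapshot, deployed: Snapshot) -> List[Tuple[str, str, str, str]]:
    """Return (kind, key, tested_value, deployed_value) for every difference."""
    changes = []
    for key in sorted(set(tested) | set(deployed)):
        if key not in deployed:
            changes.append(("removed", key, tested[key], ""))
        elif key not in tested:
            changes.append(("added", key, "", deployed[key]))
        elif tested[key] != deployed[key]:
            changes.append(("changed", key, tested[key], deployed[key]))
    return changes


# Illustrative data only: the image validated in testing vs. the VM in production.
tested_image = {"os.kernel": "3.10.0", "jvm.heap_max": "2g", "app.version": "4.2.0"}
production_vm = {
    "os.kernel": "3.10.0",
    "jvm.heap_max": "1g",       # changed after release
    "app.version": "4.2.0",
    "ntp.server": "10.0.0.5",   # added after release
}

drift = diff_snapshots(tested_image, production_vm)
for kind, key, old, new in drift:
    print(f"{kind:8} {key}: {old!r} -> {new!r}")
```

The same comparison could be run between a production VM and its disaster recovery copy to see which changes actually reached the DR environment.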
3. Limited Business and Software Infrastructure Perspective
Most existing tools that manage virtualized environments focus on infrastructure management at the level of virtual machines and down, monitoring their availability, performance and configuration.
Move a VM from one physical server to another, and the network port profile, VLANs, security settings, etc., have to be reconfigured. Many IT organizations haven't taken the critical step of gaining complete visibility into and control of the VM lifecycle from a business service/application perspective.
Risk grows when the state and status of business applications and software infrastructure distributed across numerous virtual machines are not evaluated, or are evaluated only in a separate toolset. A disconnect between the two points of view, virtualization management and business service management, can create misalignment, introducing potential issues and complicating management processes.
4. Heterogeneous World - Physical + Virtual + Cloud
Many critical applications still have key components running on physical servers for one reason or another, while other components have already moved to the cloud, making heterogeneous data centers more common and more widely accepted among IT pros. Management of a heterogeneous setup needs to take into account differences in operations processes across physical, virtual and cloud environments, as well as differences in controls and responsibilities over these environments. Frequently, different teams administer, maintain and update components within the data center and the cloud. Different sets of tools are applied, and even changes are processed and deployed differently.
Given such diversity, there needs to be a single view into the state of the business systems and their infrastructure across the heterogeneous environments. A cloud team should be aware of changes made in the physical data center, and a business service manager should be aware of changes made in a virtualized environment hosting some of their service components.
Lack of transparency across the heterogeneous environments can lead to misalignment of different types of business system components, causing issues that will be difficult to isolate and effort-intensive to resolve.
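One way to picture such a single view is a change timeline that merges the change records kept by the separate physical, virtual and cloud teams and filters them by business service. The sketch below is only illustrative: the Change record, the unified_timeline function, and the component and service names are all hypothetical, and real change feeds would come from whatever tools each team already uses.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class Change:
    when: datetime
    environment: str   # "physical", "virtual" or "cloud"
    component: str
    description: str
    service: str       # business service the component belongs to


def unified_timeline(*feeds: Iterable[Change], service: str) -> List[Change]:
    """Merge per-environment change feeds into one time-ordered view of a service."""
    merged = [c for feed in feeds for c in feed if c.service == service]
    return sorted(merged, key=lambda c: c.when)


# Illustrative records, one feed per team.
physical_log = [
    Change(datetime(2021, 1, 5, 9, 30), "physical", "db-host-01", "firmware update", "billing"),
]
virtual_log = [
    Change(datetime(2021, 1, 5, 8, 15), "virtual", "app-vm-07", "JVM heap reduced", "billing"),
    Change(datetime(2021, 1, 5, 10, 0), "virtual", "web-vm-02", "patch applied", "portal"),
]
cloud_log = [
    Change(datetime(2021, 1, 5, 11, 45), "cloud", "queue-service", "endpoint changed", "billing"),
]

timeline = unified_timeline(physical_log, virtual_log, cloud_log, service="billing")
for c in timeline:
    print(f"{c.when:%H:%M} [{c.environment}] {c.component}: {c.description}")
```

Keying every record by business service is the design choice that matters here: it lets a service manager see changes from all three environments in one ordered list, regardless of which team made them.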
The Promise of Virtualization
Transition to a virtualized environment holds great promise for IT. With applications hosted in virtualized data centers, organizations can save the capital expense of updating hardware, cut maintenance costs, use less power, and even free up floor space.
But as outlined here, there are a number of barriers to fully adopting virtualization and enjoying its rewards. It is well worth addressing these hurdles properly when 'going virtual'; otherwise, down the line you may face hard questions about when all the promised savings will arrive and why business system availability and performance issues still persist.
Managing a complex virtualized environment is a tough job. Good management tools that automate and control activities in the virtual layer are essential for successful virtualization. However, without full visibility across the entire business application stack, it is difficult to deliver the high-quality service that business customers expect and demand.
ABOUT Sasha Gilenson
Sasha Gilenson is CEO of Evolven Software, the innovator in IT Operations Analytics. Prior to Evolven, Sasha spent 13 years at Mercury Interactive, where he participated in establishing Mercury's SaaS and BTO strategy. He studied at the London Business School and has more than 15 years of experience in IT operations. You can reach him on LinkedIn or follow his tweets at: @sgilenson