“I think everything we knew about BSM before, in the traditional sense, is changing dramatically,” says Olivier Thierry, CMO of Zenoss. “Virtualization changes everything. The landscape of tools and processes is now in a complete state of flux.”
Just when companies started to get a handle on Business Service Management (BSM), virtualization is forcing everyone to look at availability and performance in a whole new way. It has become clear that virtualization is not hype – it is a practical and now essential factor in IT that will only become more important as time goes on. This means that we all have to learn to live in the virtual world.
“Virtualization is a double-edged sword,” notes Troy DuMoulin, ITIL Service Manager, AVP Product Strategy, Pink Elephant. “It produces cost savings and reduces physical management, but it also increases complexity.”
Too Many Tools, Too Little Time
While virtualization automates some processes with technologies like VMotion, the IT operations team faces several new challenges in the virtual environment. First, the complexity of virtualization can produce the need for an entirely new layer of tooling – a totally different set of tools from the ones used to manage the traditional physical environment.
“So now I am practicing swivel chair management,” Thierry says. “I have eyeballs on the old tools and eyeballs on the new tools, and they are not necessarily integrated. They are different sets of tools, which means that I have specialists, a new silo growing for virtualization management.”
“That is a big hurdle,” Thierry continues. “After working so hard to de-silo the organization and give it a direct line of sight to the customer, from a service perspective, the last thing you want to do is create silos again. But in effect you have yet another silo with virtualization.”
“We suffer from too-many-tools syndrome,” he adds. “The problem is the user can’t see the forest for the trees. They have to sift through too much data. For example, if you had a UCS box with 480 blades and you had 15 VMs per blade, just think about how much data that one box would generate. It’s just too much data.”
According to Thierry, the real trick is to point the user to the right data. It is very easy to collect mountains of data from across the enterprise, but presenting that data in a relevant, meaningful way at the right moment in time for the right problem in context is a challenge for many of today’s tools. Virtualization makes the problem even worse.
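To put rough numbers on Thierry's UCS example, the sketch below multiplies it out. Only the 480 blades and 15 VMs per blade come from the article. The metric count per VM and the polling interval are assumptions chosen for illustration, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_daily_samples(blades, vms_per_blade, metrics_per_vm, interval_seconds):
    """Return (total VMs, metric samples collected per day)."""
    total_vms = blades * vms_per_blade
    polls_per_day = SECONDS_PER_DAY // interval_seconds
    return total_vms, total_vms * metrics_per_vm * polls_per_day


# 480 blades and 15 VMs per blade are from the article.
# 20 metrics per VM polled every 60 seconds is an assumed workload.
total_vms, daily_samples = estimate_daily_samples(
    blades=480, vms_per_blade=15, metrics_per_vm=20, interval_seconds=60
)
print(f"{total_vms} VMs, {daily_samples:,} samples per day")
```

Under those assumptions a single box hosts 7,200 VMs and produces more than 200 million samples a day, which is why filtering to the relevant data matters more than collecting it.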
VM sprawl is another common challenge that many IT admins are already familiar with.
“As soon as you remove the physical obstacle from the provisioning equation, as soon as you make it easy for people to spin up what used to be complete physical systems, as soon as you make it as easy as clicking a button on VMware Lab Manager, you have greatly accelerated the pace at which new servers are going to be put into service,” says Javier Soltero, Chief Technology Officer for Management Products for SpringSource, a division of VMware.
“You end up creating a lot of VMs,” Thierry agrees. “So you end up creating a whole bunch of new servers – some of which you lose track of, you forget that they exist – and this eats up a lot of disk space and creates a lot of network stress.”
All of these new servers have to be monitored and all the associated problems must be solved.
Where is My Application Right Now?
Another challenge companies face is figuring out where everything is in this dynamic new virtual world.
“Virtualization has created a challenge to manage applications,” says Jack Probst, Principal Consultant, Pink Elephant. “In times past, if I wanted to know where a specific application was running, my tools would tell me it is on a specific server. Today, that application could be in a VM that spans multiple physical devices, and the VM may move.”
One of the challenges that Probst hears from organizations, as they start to move into the virtual space, is having access to the right tools that give them insight into where the VM is at the moment.
“We get customers that say: I am running the application that I used to run but I can’t figure out what server it is running on because things are moving with VMotion,” Thierry says. “I am having trouble defining and delivering service to my constituents.”
“Virtualization makes end-user monitoring more complex because you have more layers to worry about,” Thierry continues. “A user could say the application is slow. When was it slow? Over that period of time the configuration might have changed 50 times with virtualization. Now you have exacerbated the problem because the configuration is in motion. This guest was in this ESX server running at this moment in time across this network node. It becomes much more difficult to troubleshoot.”
The specific challenge of figuring out where a given application is at any particular moment is one of the greatest drivers behind BSM tool evolution today.
Mapping the New World
“People think virtualization means they don’t have to worry about the health of the internals of any of these VMs because they can just restart it, if it dies,” Soltero explains. “Or they can just clone it if they need a few more. That is not necessarily going to help you solve the problem. It may make things worse.”
Thierry concurs, explaining that if a business application has a critical failure based on a certain configuration that IT operations can’t reproduce, the common reaction is to simply reprovision.
“At some point you are going to need to fix that,” Thierry warns. “You need to understand what the configuration looks like at a moment in time, and understand the dependencies, and fix the problem. Just reprovisioning, just adding more resources, is not going to solve some of the service level issues you might have with an application.”
“How do I know where my business app is running?” Thierry asks. “Which exact server is that workload running on? Which exact guest? And how is it performing at that moment in time? These questions are critically important to the delivery of a business service.”
According to Thierry, the best solution is to maintain a dependency map between the application and the various layers of virtualization, all the way down to the hardware. As objects move around, that dependency map must be kept up to date in near real time, so IT operations always has the latest configuration to troubleshoot against.
“That’s the only way we think you can actually solve the problem,” says Thierry. “Otherwise you are shooting in the dark.”
“You can’t go back to your CMDB that’s two days old,” he continues. “That is not going to help you figure out why something didn’t perform. There are too many objects in motion. If you really start to push the boundaries of virtualization and make it truly dynamic, your configuration might be changing on an hourly basis. You need a tool that keeps up with that change.”
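One way to picture the dependency map Thierry describes is a structure that records every guest move with a timestamp, so that operations can ask both "where is it now?" and "where was it when the user saw the slowdown?". The following is a minimal sketch under that assumption, not how Zenoss or any other product implements it. The class, method names, and host and guest names are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


class DependencyMap:
    """Tracks which host each guest ran on over time (illustrative only)."""

    def __init__(self):
        # guest name -> list of (timestamp, host), kept in time order
        self._placements = defaultdict(list)
        # application name -> guest name
        self._app_guest = {}

    def assign_app(self, app, guest):
        self._app_guest[app] = guest

    def record_move(self, guest, host, timestamp):
        history = self._placements[guest]
        if history and timestamp < history[-1][0]:
            raise ValueError("moves must be recorded in time order")
        history.append((timestamp, host))

    def host_at(self, app, timestamp):
        """Return the host the app's guest was on at `timestamp`, or None."""
        history = self._placements.get(self._app_guest.get(app), [])
        times = [t for t, _ in history]
        index = bisect_right(times, timestamp)
        return history[index - 1][1] if index else None

    def current_host(self, app):
        history = self._placements.get(self._app_guest.get(app), [])
        return history[-1][1] if history else None


dep_map = DependencyMap()
dep_map.assign_app("billing", "guest-01")
dep_map.record_move("guest-01", "esx-a", timestamp=100)
dep_map.record_move("guest-01", "esx-b", timestamp=200)  # e.g. a live migration

print(dep_map.host_at("billing", 150))   # placement when the slowdown was reported
print(dep_map.current_host("billing"))   # placement right now
```

Keeping the full history rather than only the latest placement is the design choice that matters here: a snapshot that is days old cannot answer a question about a specific moment in the past.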
The complexity of the virtual world has caused many companies to continue to rely only on traditional physical monitoring. Most experts see physical monitoring continuing to be important, especially in the short term, because most organizations utilizing virtualization are only virtualizing a percentage of their servers. In this “hybrid” environment, monitoring the physical components is still critical, and that requirement may never go away. On the other hand, performance monitoring must take place from a virtual perspective as well, if BSM is to succeed in the virtual environment.
“Because many organizations still don’t have a good handle on what runs on what, they look at performance only from the physical layer,” DuMoulin explains. “So they make their decisions based on a technical perspective but not in the business context, and then they run into business context issues. You need to know what runs on what to make business-oriented decisions of what to group together in the virtual environment on physical blades. Not many organizations have that context.”
“You not only need to look at the performance of your physical environment and the performance of your virtual environment, but you need to bring them together as a whole,” Thierry recommends. “You can’t bring good performance to that application if you are not looking inside the guest operating system, to see how the application is performing inside the guest. You have to be able to look at physical to virtual and all the dependencies, and keep track of the dynamic changes as those objects start moving from one box to another, and keep that map of those dependencies updated.”
“The first and most important point is to provide visibility into the guest operating system and into its applications,” Soltero reiterates.
Living with Virtualization
“Virtualization is a fact of life,” DuMoulin states. “It is going to become more and more relevant.”
“Virtualization is here to stay,” Thierry agrees. “But there is no ITIL book for virtualization. There is no best practices framework yet for how to do this. Everyone is still at the beginning. We are going to get to a point where there will be best practices that say how many VMs can be deployed on a server and how you do dependency mapping – all will grow in time.”
Meanwhile, companies cannot simply sit back and wait. The advantages of virtualization are too compelling. Thierry advises companies to “dip their toes in the water” by virtualizing a portion of the infrastructure.
“Be careful not to go full throttle into virtualization until you have figured out exactly how you are going to manage service in that context,” Thierry says. “Going up to 30% virtualization is very manageable. If the executives then see all those cost savings and say let’s virtualize the planet across the entire company, that will cause tremendous chaos, because you will lose control and visibility into delivering service to your applications.”
“I think you need to get in there right away, but as you move to the next layers, you have to think very clearly about how you are going to make sure you don’t lose control,” he concludes. “You have to look very carefully at your tool sets. You have to integrate your processes. You might not want to turn on the dynamic VMotion until you are sure you can deliver service in a more static way on your virtualized servers. Ramp that throttle up slowly.”