As enterprises grow more dependent on application-based business services, the availability and performance of those applications matter more than ever. Enterprises are also virtualizing more of their IT infrastructure, so more applications now run on virtual machines, and that brings application performance to the fore. According to an industry survey by Gridstore, improving application performance (51%) was the top business priority for virtualized environments in mid-market organizations.
This was followed by the need to reduce I/O bottlenecks between virtual machines (VMs) and storage (34%), the need for increased VM density (34%), the need to decrease storage costs (27%), and the need for improved manageability of virtualized systems (24%).
IT teams often struggle to find the cause of slow or unresponsive applications. If the issue lies in the application itself, application performance monitoring (APM) tools can surface it. But when the problem traces down to the virtual infrastructure hosting the app, the source becomes harder to pinpoint and troubleshooting becomes more difficult.
Combining APM with virtualization monitoring is key to gaining visibility across the application stack: from the app to the VM to the underlying host resources, including storage. This tight integration between the two technologies is what systems, virtualization, storage and data center admins are looking for.
The following covers some important use cases of application performance in a virtualized environment, and what you need to monitor to avoid performance issues and application downtime.
Why is the Application Slow?
Well, it could be slow because of an application performance issue, because of a slow VM, or because the host does not have enough storage resources for its VMs.
Application Issue: APM tools will help you monitor the application metrics and diagnose the cause of an application slowdown. Whether it is application downtime or latency, you will be able to drill down into the various performance components and identify the problem.
VM Issue: If all is well with the application, the VM workload could be slowing down the application running on it. You need access to VM performance metrics to identify any resource contention on the host so that you can troubleshoot application and workload issues in the context of discovered virtual dependencies. Typically, depletion of the CPU, memory or disk associated with the host causes bottlenecks and results in degraded VM performance.
Datastore Issue: If the VM seems to be in good shape, it could be a datastore issue. You will want to take a closer look at the datastore IOPS and latency, and at whether any datastore is running low on disk space.
This path runs from the app to the VM to the datastore. The reverse direction should also be possible: starting from the host, you need to be able to identify the VM, then look at the datastore data, which may also show the applications related to that datastore along with their status.
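The top-down check described above (application first, then VM, then datastore) can be sketched in a few lines of Python. This is a minimal illustration, not any monitoring product's API: the `StackSnapshot` fields, the `triage` function, and every threshold value are assumptions chosen for the example, and real limits depend on your environment.

```python
from dataclasses import dataclass

# Illustrative thresholds only (assumptions, not vendor recommendations).
APP_ERROR_RATE_PCT_MAX = 1.0
VM_CPU_READY_PCT_MAX = 5.0
VM_MEM_USED_PCT_MAX = 90.0
DS_LATENCY_MS_MAX = 20.0
DS_FREE_PCT_MIN = 10.0


@dataclass
class StackSnapshot:
    """One point-in-time reading for an app, its VM, and the VM's datastore."""
    app_error_rate_pct: float
    vm_cpu_ready_pct: float
    vm_mem_used_pct: float
    ds_latency_ms: float
    ds_free_pct: float


def triage(s: StackSnapshot) -> str:
    """Return the first layer that looks unhealthy, checking from the top down."""
    if s.app_error_rate_pct > APP_ERROR_RATE_PCT_MAX:
        return "application"
    if (s.vm_cpu_ready_pct > VM_CPU_READY_PCT_MAX
            or s.vm_mem_used_pct > VM_MEM_USED_PCT_MAX):
        return "vm"
    if s.ds_latency_ms > DS_LATENCY_MS_MAX or s.ds_free_pct < DS_FREE_PCT_MIN:
        return "datastore"
    return "undetermined"


# Healthy app and VM, but high datastore latency.
snapshot = StackSnapshot(
    app_error_rate_pct=0.1,
    vm_cpu_ready_pct=1.0,
    vm_mem_used_pct=50.0,
    ds_latency_ms=45.0,
    ds_free_pct=30.0,
)
print(triage(snapshot))
```

The function stops at the first unhealthy layer, which mirrors the order of the checks above; a real tool would more likely report every layer's status at once.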
What is Causing the VM to Slow Down?
VM Performance Issue: If a VM running your business apps is slow, you need visibility into VM performance metrics such as CPU, memory, IOPS and network throughput. If resource contention from competing workloads is choking the VM, you should get alerts that pinpoint what happened and when. This will help you identify the cause of a VM slowdown.
Datastore Bottleneck: If the VM metrics all look good, explore the storage data for a bottleneck. Placing more VMs on a datastore than it can support causes resource contention and makes the VMs sluggish. You also need to be able to find out which VMs are using the most storage or are about to run out of storage.
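As a rough sketch of the datastore check, the snippet below groups VMs by datastore, flags datastores that host more VMs than a chosen limit, and lists each flagged datastore's VMs with the busiest first. The `VmSample` type, the `crowded_datastores` function, the sample names, and the limit of three VMs are all hypothetical; a sensible limit depends on the storage behind each datastore.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class VmSample(NamedTuple):
    """A VM, the datastore it lives on, and its recent average IOPS."""
    name: str
    datastore: str
    iops: float


def crowded_datastores(
    samples: Iterable[VmSample], max_vms_per_datastore: int = 3
) -> Dict[str, List[str]]:
    """Map each over-populated datastore to its VM names, busiest first."""
    by_datastore = defaultdict(list)
    for sample in samples:
        by_datastore[sample.datastore].append(sample)
    return {
        ds: [vm.name for vm in sorted(vms, key=lambda v: v.iops, reverse=True)]
        for ds, vms in by_datastore.items()
        if len(vms) > max_vms_per_datastore
    }


samples = [
    VmSample("web-01", "ds-a", 120.0),
    VmSample("web-02", "ds-a", 80.0),
    VmSample("db-01", "ds-a", 900.0),
    VmSample("batch-01", "ds-a", 300.0),
    VmSample("mail-01", "ds-b", 150.0),
]
print(crowded_datastores(samples))
```

Sorting by IOPS puts the likeliest source of contention at the front of each list, so it is the first candidate to move to a less loaded datastore.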
Storage Capacity, Usage and Performance
A datastore performance analysis is a useful way to drill down into VM storage resources and see what the storage usage has been, how much more load your VM environment can sustain, and where performance issues occur. By exploring the data for your VMware datastores, and for local storage and cluster shared volumes (CSV) on Hyper-V hosts, you can see storage capacity and performance across the entire virtual environment and diagnose where you are having issues. You can easily identify:
Usage: Based on busiest datastores – drill down to identify which VM or application is causing the load.
Performance: Based on busiest VMs – drill down to the datastore and see what other VMs are affected.
Capacity: Which datastores and VMs are low on storage and which ones will run out first.
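The capacity question, which datastores will run out first, can be approximated with simple arithmetic: free space divided by daily growth. The sketch below assumes linear growth and a `growth_gb_per_day` figure derived from historical samples; the `DatastoreUsage` type, the function names, and the sample data are illustrative, not taken from any monitoring product.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class DatastoreUsage:
    name: str
    capacity_gb: float
    used_gb: float
    growth_gb_per_day: float  # assumed to be derived from historical usage


def days_until_full(ds: DatastoreUsage) -> float:
    """Estimate days until the datastore fills, assuming linear growth."""
    free_gb = ds.capacity_gb - ds.used_gb
    if free_gb <= 0:
        return 0.0
    if ds.growth_gb_per_day <= 0:
        return math.inf  # flat or shrinking usage never fills the datastore
    return free_gb / ds.growth_gb_per_day


def rank_by_exhaustion(datastores: Iterable[DatastoreUsage]) -> List[DatastoreUsage]:
    """Order datastores so the one expected to fill first comes first."""
    return sorted(datastores, key=days_until_full)


datastores = [
    DatastoreUsage("ds-archive", capacity_gb=200, used_gb=100, growth_gb_per_day=0),
    DatastoreUsage("ds-prod", capacity_gb=1000, used_gb=900, growth_gb_per_day=10),
    DatastoreUsage("ds-test", capacity_gb=500, used_gb=450, growth_gb_per_day=1),
]
for ds in rank_by_exhaustion(datastores):
    print(ds.name, days_until_full(ds))
```

Linear extrapolation is the simplest possible forecast. Workloads with bursts or seasonal patterns would need a longer history and a better model.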
Storage does not stop at the VM datastores. It extends to external storage arrays if you maintain a SAN environment for storage and backup. Performance issues in the storage LUNs can lead to a VM slowdown, which in turn affects application performance. You should be able to identify which VMs are mapped to which storage arrays so that you can resolve VM issues by troubleshooting and fixing the storage environment.
Monitor your entire application stack from the app to the VM to the host to the datastore and external storage. Don’t let an unnoticed bottleneck throttle your application performance and availability.
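To illustrate the VM-to-array mapping, the sketch below takes a LUN that is showing performance issues and returns the VMs that depend on it. It assumes you can export two lookup tables, VM to datastore and datastore to LUN, from your environment. The dictionaries, the names in them, and the `vms_behind_lun` function are hypothetical.

```python
from typing import Dict, List


def vms_behind_lun(
    lun: str,
    lun_of_datastore: Dict[str, str],
    datastore_of_vm: Dict[str, str],
) -> List[str]:
    """Return the sorted names of VMs whose datastore is backed by the given LUN."""
    affected_datastores = {
        ds for ds, backing_lun in lun_of_datastore.items() if backing_lun == lun
    }
    return sorted(
        vm for vm, ds in datastore_of_vm.items() if ds in affected_datastores
    )


# Hypothetical inventory data.
lun_of_datastore = {"ds-a": "array1-lun-07", "ds-b": "array1-lun-07", "ds-c": "array2-lun-02"}
datastore_of_vm = {"web-01": "ds-a", "db-01": "ds-b", "mail-01": "ds-c"}

print(vms_behind_lun("array1-lun-07", lun_of_datastore, datastore_of_vm))
```

The same two tables can be walked in the other direction to answer the reverse question: which LUN sits behind a given slow VM.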