Using Machine Learning Analytics to Help Meet SLAs
October 11, 2016

Jerry Melnick
SIOS Technology

The first post in this two-part series introduced machine learning analytics (MLA) as a new way to find and fix the root cause of performance problems to help meet SLAs. This post explains three ways MLA can be used to make better use of resources for optimal performance.

The first way MLA helps deliver the needed performance while using resources optimally is by giving IT the accurate information it needs to tune VM configuration settings. IT managers today have little insight into the causes of poor application performance. To be safe, they often throw extra hardware at the problem to avoid any possibility of starving the applications.

In many cases, applications can be overprovisioned by as much as 80 percent. Underprovisioning VMs is less common but equally problematic and can lead to very poor performance. Traditional processes for right-sizing VMs are time-consuming, error-prone and inaccurate. IT administrators need the skill, time, and tools to run multiple reports, and must then manually assemble their findings to approximate the right settings.

In contrast, MLA continuously and automatically observes resource utilization patterns using real-time data from the environment to identify over- and undersized VMs and then recommends precise configuration settings to right-size the VM for performance. And if usage changes, MLA will dynamically update recommendations.
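To make the idea of right-sizing from observed utilization concrete, here is a minimal sketch in Python. It is not how any particular MLA product works: it uses a simple percentile rule where a real system would learn patterns over time. The names `right_size`, `Recommendation`, the 95th-percentile choice and the 70 percent target utilization are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Recommendation:
    vm: str
    current_vcpus: int
    recommended_vcpus: int
    verdict: str


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def right_size(vm, vcpus, cpu_util_pct, target_util=70.0, pct=95):
    """Suggest a vCPU count that puts the observed peak near target_util.

    cpu_util_pct holds CPU utilization samples (0-100) for the VM.
    target_util and pct are illustrative assumptions, not vendor defaults.
    """
    peak = percentile(cpu_util_pct, pct)
    # Scale the current allocation so the peak load lands at the target.
    recommended = max(1, math.ceil(vcpus * peak / target_util))
    if recommended < vcpus:
        verdict = "oversized"
    elif recommended > vcpus:
        verdict = "undersized"
    else:
        verdict = "right-sized"
    return Recommendation(vm, vcpus, recommended, verdict)


if __name__ == "__main__":
    # Hypothetical samples for a lightly loaded 8-vCPU VM.
    print(right_size("app-01", 8, [10, 12, 15, 9, 11, 14, 13, 10, 12, 20]))
```

Because utilization cannot exceed 100 percent, a rule like this can only hint at undersizing. That is one reason a single metric is not enough and why recommendations should be refreshed as usage changes.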

The second way MLA helps improve utilization and save money is by finding unused or wasted resources. Among the many advantages of virtualization is the ease with which VMs can be set up and torn down and storage can be dynamically allocated. But when unused VMs or storage snapshots are left to languish, they waste precious resources. These situations can be extremely difficult to identify, because some VMs and snapshots appear unused when they are in fact in use. Removing these in error could be disastrous, so IT leaves them in place.

MLA solves this by observing patterns of behavior over time and across multiple dimensions to identify which VMs are truly inactive and which storage snapshots can safely be freed. It then recommends precisely how to recover the waste, once again eliminating the guesswork.
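The value of looking at several dimensions over a long enough window can be illustrated with a small Python sketch. It is a fixed-threshold stand-in for what an MLA system would learn from the environment. The `DailySample` fields, the thresholds and the 30-day window are hypothetical choices, not values from any product.

```python
from dataclasses import dataclass


@dataclass
class DailySample:
    cpu_pct: float     # average CPU utilization for the day
    net_kbps: float    # average network throughput
    disk_iops: float   # average disk I/O operations per second
    logins: int        # interactive or service logins seen that day


def is_idle_day(s, cpu_max=2.0, net_max=5.0, iops_max=10.0):
    """A day counts as idle only if every dimension is quiet."""
    return (
        s.cpu_pct <= cpu_max
        and s.net_kbps <= net_max
        and s.disk_iops <= iops_max
        and s.logins == 0
    )


def classify_vm(history, min_days=30):
    """Classify a VM from its daily history, oldest sample first."""
    if len(history) < min_days:
        return "insufficient data"
    recent = history[-min_days:]
    if all(is_idle_day(s) for s in recent):
        return "candidate for removal"
    return "in use"


if __name__ == "__main__":
    quiet = DailySample(cpu_pct=0.5, net_kbps=1.0, disk_iops=2.0, logins=0)
    month_end_job = DailySample(cpu_pct=0.8, net_kbps=1.0, disk_iops=400.0, logins=0)
    print(classify_vm([quiet] * 30))
    print(classify_vm([quiet] * 29 + [month_end_job]))
```

The second VM looks idle on CPU alone, but its monthly burst of disk activity keeps it off the removal list. That is the "seemingly unused but actually in use" case described above.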

Some MLA systems also provide a complete summary of the savings that could be achieved by removing wasted resources and right-sizing VMs. Their comprehensive reports include not only the savings in hardware resources, but also the savings in software licensing that can be achieved by reducing the number of hosts and VMs.

The third way machine learning analytics helps optimize resource allocations for peak performance is by identifying those applications that would benefit the most from storage acceleration through the use of all-flash arrays or host-based caching (HBC). Storage acceleration delivers substantial improvements in throughput by increasing I/O operations per second (IOPS). But to be successful, IT managers need to verify that a) the root cause of their performance issue is related to storage performance and b) they have chosen the right VMs and configured the storage acceleration optimally. Today, most rely on trial and error and best guesses, usually based on simple, single-dimension measurements from storage tools.

Machine learning is ideal for delivering the information needed to decide which VMs need acceleration and how best to configure them. Some MLA systems are also able to perform a simulation to estimate the likely increase in IOPS, which enables the IT department to prioritize the implementation effort.
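As a rough illustration of what such an estimate involves, the Python sketch below uses a simple back-of-the-envelope model rather than a learned simulation. It assumes a fixed number of outstanding I/Os, so that IOPS is approximately the queue depth divided by average latency (Little's law), and it blends cache and backend latency by an assumed cache hit ratio. The function names, latencies and hit ratios are hypothetical.

```python
def estimated_iops(queue_depth, hit_ratio, cache_latency_ms, backend_latency_ms):
    """Estimate IOPS for a VM with a read cache in front of backend storage.

    Assumes a constant number of outstanding I/Os (queue_depth), so
    throughput is roughly queue_depth / average latency.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    avg_latency_ms = (
        hit_ratio * cache_latency_ms + (1.0 - hit_ratio) * backend_latency_ms
    )
    return queue_depth * 1000.0 / avg_latency_ms


def rank_by_gain(vms, cache_latency_ms=1.0):
    """Order VMs by estimated IOPS gain from caching, largest first.

    vms maps a VM name to (queue_depth, expected_hit_ratio, backend_latency_ms).
    """
    gains = {}
    for name, (depth, hit_ratio, backend_ms) in vms.items():
        before = estimated_iops(depth, 0.0, cache_latency_ms, backend_ms)
        after = estimated_iops(depth, hit_ratio, cache_latency_ms, backend_ms)
        gains[name] = after - before
    return sorted(gains, key=gains.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical workloads: name -> (queue depth, hit ratio, backend latency in ms)
    candidates = {"sql-01": (8, 0.5, 9.0), "file-01": (8, 0.1, 9.0)}
    print(rank_by_gain(candidates))
```

A ranking like this is only as good as the hit ratios fed into it, which is where observed workload patterns matter far more than a single storage metric.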

Machine learning analytics brings machine-derived intelligence to the task of optimally configuring the infrastructure, taking the guesswork out of many aspects of meeting SLAs more efficiently and cost-effectively. And with the technology advancing rapidly, its future holds tremendous potential for many new and even more powerful capabilities.

Jerry Melnick is President and CEO of SIOS Technology.