Using Machine Learning Analytics to Help Meet SLAs
October 11, 2016

Jerry Melnick
SIOS Technology


The first post in this two-part series introduced machine learning analytics as a new way to find and fix the root cause of performance problems to help meet SLAs. This post explains three ways MLA can be used to better utilize resources for optimal performance.

The first way MLA helps ensure that the needed performance is delivered while resources are used optimally is by providing the accurate information IT needs to tune VM configuration settings. IT managers today have poor insight into the causes of poor application performance. To be extra careful, they often throw a lot of hardware at the problem in an attempt to avoid the possibility of starving the applications.

In many cases applications can be overprovisioned by as much as 80 percent. Underprovisioning VMs is less common but equally problematic and can lead to very poor performance. Traditional processes for right-sizing VMs are time-consuming, error-prone and inaccurate. IT administrators need the skill, time, and tools to run multiple reports, and then manually assemble their findings to approximate the right settings.

In contrast, MLA continuously and automatically observes resource utilization patterns using real-time data from the environment to identify oversized and undersized VMs, and then recommends precise configuration settings to right-size each VM for performance. And if usage changes, MLA dynamically updates its recommendations.
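To make the idea of right-sizing from observed utilization concrete, here is a minimal sketch in Python. It is not how any particular MLA product works: it swaps in a simple percentile heuristic for the machine learning step, and the function names, the 95th-percentile choice and the 70 percent target utilization are all assumptions made for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_vcpus(cpu_util_samples, current_vcpus, target_util=0.7, pct=95):
    """Suggest a vCPU count (hypothetical heuristic, not a vendor algorithm).

    cpu_util_samples: CPU utilization as a fraction (0..1) of the current
    allocation. The suggestion sizes the VM so that the pct-th percentile
    of observed demand lands at about target_util of the new allocation.
    """
    peak_util = percentile(cpu_util_samples, pct)
    peak_demand = peak_util * current_vcpus  # demand expressed in vCPUs
    return max(1, math.ceil(peak_demand / target_util))


# A VM with 8 vCPUs that almost never exceeds 10 percent utilization.
oversized = [0.1] * 95 + [0.2] * 5
print(recommend_vcpus(oversized, current_vcpus=8))

# A VM with 2 vCPUs that is pinned near its limit.
undersized = [0.95] * 10
print(recommend_vcpus(undersized, current_vcpus=2))
```

Rerunning the same function on a sliding window of recent samples is one simple way to keep the recommendation current as usage changes.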

The second way MLA helps improve utilization and save money is by finding unused or wasted resources. Among the many advantages of virtualization is the ease with which VMs can be set up and torn down and how storage can be dynamically allocated. But when unused VMs or storage snapshots are left to languish, they waste precious resources. These situations can be extremely difficult to identify, because some resources that appear to be unused are in fact still in use. Removing them in error could be disastrous, so IT leaves them there.

MLA solves this by observing patterns of behavior over time and across multiple dimensions to identify which VMs are truly inactive and which storage snapshots are safe to free up. It then recommends precisely how to recover the waste, once again eliminating the guesswork.
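The value of looking at several dimensions over time can be illustrated with a small Python sketch. It is a deliberately simple rule-based stand-in for the pattern analysis described above, and the `Sample` fields, thresholds and minimum history length are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    cpu_util: float   # fraction of allocated CPU in use, 0..1
    disk_iops: float  # disk I/O operations per second
    net_kbps: float   # network traffic in kilobits per second


def is_idle(samples, min_samples=4, cpu_max=0.02, iops_max=5.0, net_max=10.0):
    """Return True only if every dimension stays quiet for the whole window.

    All thresholds are illustrative assumptions. With too little history
    the function returns False, erring on the side of keeping the VM.
    """
    if len(samples) < min_samples:
        return False
    return all(
        s.cpu_util <= cpu_max and s.disk_iops <= iops_max and s.net_kbps <= net_max
        for s in samples
    )


quiet = [Sample(0.01, 1.0, 2.0)] * 6
# Looks idle on CPU and disk, but one network burst shows it is still in use.
bursty = quiet[:5] + [Sample(0.01, 1.0, 500.0)]

print(is_idle(quiet))   # True
print(is_idle(bursty))  # False
```

A check on CPU alone would flag the second VM as unused, which is exactly the kind of mistake that makes administrators reluctant to remove anything.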

Some MLA systems also provide a complete summary of the savings that could be achieved by removing wasted resources and right-sizing VMs. They provide comprehensive reports that include not only the savings in hardware resources, but also the savings in software licensing that can be achieved by reducing the number of hosts and VMs.

The third way machine learning analytics helps optimize resource allocations for peak performance is by identifying those applications that would benefit the most from storage acceleration through the use of all-flash arrays or host-based caching (HBC). Storage acceleration delivers substantial improvements in throughput by increasing I/O operations per second (IOPS). But to be successful, IT managers need to verify that a) the root cause of their performance issue is related to storage performance and b) they have chosen the right VMs and configured the storage acceleration optimally. Today, most rely on trial and error and a best guess, usually based on simple single-dimension measurements from storage tools.

Machine learning is ideal for delivering the right information to decide which VMs need acceleration and how best to configure them. Some MLA systems are also able to perform a simulation to estimate the likely increase in IOPS, which enables the IT department to prioritize the implementation effort.
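As a rough illustration of what such an estimate might look like, the Python sketch below uses a textbook average-latency model rather than any vendor's simulation. It assumes that a known fraction of I/O requests would be served from a fast cache, that the number of outstanding requests stays fixed, and that IOPS is approximately the queue depth divided by the average latency. The function names, latency figures and hit ratios are hypothetical.

```python
def estimated_iops(hit_ratio, cache_latency_ms, backend_latency_ms, queue_depth=1):
    """Estimate IOPS from a weighted-average latency (simplified model).

    hit_ratio: assumed fraction (0..1) of requests served by the cache.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    avg_latency_ms = hit_ratio * cache_latency_ms + (1.0 - hit_ratio) * backend_latency_ms
    return queue_depth * 1000.0 / avg_latency_ms


def rank_by_speedup(vms, cache_latency_ms=0.2):
    """Rank VMs by estimated IOPS gain from caching, largest gain first.

    vms: iterable of (name, expected_hit_ratio, backend_latency_ms) tuples.
    Returns a list of (name, speedup) pairs.
    """
    ranked = []
    for name, hit_ratio, backend_ms in vms:
        before = estimated_iops(0.0, cache_latency_ms, backend_ms)
        after = estimated_iops(hit_ratio, cache_latency_ms, backend_ms)
        ranked.append((name, after / before))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


candidates = [("web", 0.2, 5.0), ("db", 0.9, 5.0)]
for name, speedup in rank_by_speedup(candidates):
    print(f"{name}: about {speedup:.1f}x estimated IOPS")
```

Ranking candidates by estimated gain is one way to decide where to apply acceleration first, though real workloads will deviate from a model this simple.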

Machine learning analytics brings machine-derived intelligence to the task of optimally configuring the infrastructure, taking the guesswork out of many aspects of meeting SLAs more efficiently and cost-effectively. And with the technology advancing rapidly, its future holds tremendous potential for many new and even more powerful capabilities.

Jerry Melnick is President and CEO of SIOS Technology.

