Application Performance Management (APM), as defined by the industry, is focused on monitoring, because you can’t manage what you can’t see. But there are other functions involved in managing application performance.
For instance, this month we saw news that Outlook.com’s outage was due to a failed firmware update. Monitoring is a key element of ensuring application performance — however, other functions, such as patch management, are necessary to proactively prevent service failures. Below are a few practical considerations when delving into managing application performance.
Measuring Application Performance — What Should You Care About?
Before you start to monitor anything, you need to understand the expectations from the application’s end-users. This will help you focus on the metrics that really matter and prioritize the type of monitoring solution that is required.
For instance, is up/down monitoring adequate? Is an agentless solution sufficient? Or is something more robust needed to collect log files and so on? It’s your duty to weigh the needs of the business (i.e. what’s the impact if monitoring is not in place?) against the cost of the monitoring solution.
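To make the simplest option concrete, here is a minimal sketch of an agentless up/down check in Python. It only tests whether a TCP port accepts connections, which is an assumption about what "up" means for your application; the function name is hypothetical, and a real monitoring product does far more than this.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This is an agentless check: nothing is installed on the monitored host.
    It says nothing about whether the application behind the port is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

If a check like this is all the business needs, a lightweight solution may be enough. If you also need log files, process details, or per-query metrics, that points toward something more robust.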
Having the end-user conversation will also help you understand the resource requirements for an application. Oftentimes, applications are deployed with more resources than are actually needed to meet performance objectives.
Time to Measure and Monitor — How Do You Know Application Performance is Out of Whack?
Let’s first answer this question by understanding some of the things that can go wrong:
Resources are constrained. This could happen because there is an influx of demand on the application (more users/customers). Some apps simply use more memory the longer they run. Processes can get out of control. Resource constraints can also occur if resources are shared between applications (e.g. in a virtual environment where too many VMs share the same server or the same SAN capacity).
Services stop. This can be caused by a fatal exception, etc. These things happen unexpectedly, so it’s good to have monitoring in place to alert you when a service has stopped so you can restart it immediately.
Hardware fails. Power supplies go kaput, fans break, temperature spikes, and hard drives fail. These hardware failures can and do happen, so you need advance warning to find them and fix them quickly.
Someone changed something and it broke. Oftentimes, configuration changes can lead to performance problems. Did the Web team update the site? Was there a software update outside of a change request? Keep these peripheral factors in mind.
You’ve been hacked. According to a recent study by Ponemon Institute, survey participants experienced almost two cyber-attacks per week, many of which were DDoS attacks like the one recently launched against Brian Krebs’ website.
Software requires updating. Most often, software needs to be updated because of vulnerabilities; however, many updates fix functional bugs. As in the Outlook.com example mentioned above, some updates can cause service outages if they are not applied correctly and on time.
From the first step, you have an idea of where to focus your effort and how much of it to spend. Taking it to the next step is a little tricky. For example, suppose your application owner needs the application to be available Monday through Friday between the hours of 8 a.m. and 5 p.m., expects no more than 1,000 users at once, and expects users to be able to process a transaction in three minutes.
With this information, you know critical alerts should fire during these business hours, it’s acceptable to perform software/firmware updates on the weekends or in the evening, and you have a baseline of acceptable performance from the end-user.
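As a rough illustration, the example requirements above could be encoded as an alerting rule like the following Python sketch. The constants come from the example; the function names and the choice to downgrade off-hours breaches to a warning are assumptions made for illustration, not a recommendation from any particular tool.

```python
from datetime import datetime, time

# Values taken from the application owner's example requirements.
BUSINESS_DAYS = range(0, 5)          # Monday=0 through Friday=4
BUSINESS_START = time(8, 0)          # 8 a.m.
BUSINESS_END = time(17, 0)           # 5 p.m.
MAX_CONCURRENT_USERS = 1000
MAX_TRANSACTION_SECONDS = 3 * 60     # three minutes


def in_business_hours(now: datetime) -> bool:
    """True on Monday-Friday from 8 a.m. up to (not including) 5 p.m."""
    return (now.weekday() in BUSINESS_DAYS
            and BUSINESS_START <= now.time() < BUSINESS_END)


def alert_severity(now: datetime, concurrent_users: int,
                   transaction_seconds: float) -> str:
    """Classify a measurement as 'ok', 'warning', or 'critical'.

    Assumption: a breach outside business hours is only a warning.
    """
    breached = (concurrent_users > MAX_CONCURRENT_USERS
                or transaction_seconds > MAX_TRANSACTION_SECONDS)
    if not breached:
        return "ok"
    return "critical" if in_business_hours(now) else "warning"


def maintenance_allowed(now: datetime) -> bool:
    """Updates are acceptable in the evening or on weekends."""
    return not in_business_hours(now)
```

For instance, a four-minute transaction at 10 a.m. on a Monday would be classified as critical, while the same measurement on a Saturday would only raise a warning.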
This application is composed of several different components, including a Web server, application server, database and underlying hardware, storage, and networking elements. The SysAdmin is a jack of all trades who knows a little about a lot. What does it mean to monitor the SQL database? How does the SysAdmin monitor slow queries or table locks? What is a good value or a bad value? What should the threshold be?
Luckily, there are tools that can automate a lot of the guessing and manual reporting when it comes to application performance. Tools these days should provide intelligence about what should be monitored, historical data for benchmarks/troubleshooting, and also the ability to get to the necessary details quickly.
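One common way to turn historical data into a threshold is to flag values that sit well above the recent average. The Python sketch below uses the mean plus three standard deviations; that rule, the function names, and the sample query times are all assumptions for illustration, and commercial tools may derive baselines differently.

```python
from statistics import mean, stdev


def baseline_threshold(history: list, sigmas: float = 3.0) -> float:
    """Return mean + sigmas * sample standard deviation of past values."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    return mean(history) + sigmas * stdev(history)


def is_anomalous(value: float, history: list, sigmas: float = 3.0) -> bool:
    """True if value exceeds the threshold derived from history."""
    return value > baseline_threshold(history, sigmas)


# Hypothetical query durations in milliseconds from previous weeks.
query_ms_history = [100, 110, 90, 105, 95]
print(is_anomalous(120, query_ms_history))  # within the normal range
print(is_anomalous(130, query_ms_history))  # above the derived threshold
```

This answers the question of what counts as a bad value relative to how the application normally behaves, rather than relying on a fixed number chosen by guesswork.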
What to Look for in Tools that Help Manage Application Performance
Application and server monitoring tools should be able to monitor across multiple components of the application, including server hardware, virtual machines, processes, services and performance metrics specific to a particular application. Tools should also provide thresholds based on best practices, which can then be adjusted as needed using historical insight.
Patch management tools should provide information on which systems are out of compliance, be able to patch systems at discrete times, and inform IT when patches fail.
Configuration change management tools should identify and repair unauthorized configuration changes.
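To illustrate the detection half of configuration change management, the Python sketch below compares the current contents of each configuration against a fingerprint recorded when the change was approved. The function names and data layout are hypothetical, and the sketch only identifies drift; it does not repair anything.

```python
import hashlib


def fingerprint(config_text: str) -> str:
    """Return the SHA-256 hex digest of a configuration's contents."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


def find_drift(approved: dict, current: dict) -> list:
    """Return names of configs that are missing or differ from the baseline.

    approved maps config name -> fingerprint recorded at approval time.
    current maps config name -> the configuration text as it is now.
    """
    drifted = []
    for name, approved_hash in approved.items():
        text = current.get(name)
        if text is None or fingerprint(text) != approved_hash:
            drifted.append(name)
    return sorted(drifted)


# Hypothetical baseline and current state.
approved = {
    "web.conf": fingerprint("max_clients=1000\n"),
    "db.conf": fingerprint("pool_size=20\n"),
}
current = {
    "web.conf": "max_clients=1000\n",
    "db.conf": "pool_size=5\n",   # changed outside a change request
}
print(find_drift(approved, current))  # ['db.conf']
```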
The time and cost associated with implementing APM tools should certainly be outweighed by the cost of application degradation or outage, and by the IT labor costs of manually finding and fixing the problem.
ABOUT Jennifer Kuvlesky
Jennifer Kuvlesky is a Product Marketing Manager for SolarWinds, specializing in systems management. She has made her home in Austin, the high-tech capital of Texas, for more than 15 years, specializing in product management, strategy and marketing with solid knowledge of the systems, application, and virtualization management market segments. Connect with Jennifer Kuvlesky on Twitter @jenniferkuvlesk.