Application Performance Management (APM), as defined by the industry, is focused on monitoring, because you can’t manage what you can’t see. But monitoring is only one of the functions involved in managing application performance.
For instance, this month we saw news that Outlook.com’s outage was due to a failed firmware update. Monitoring is a key element of ensuring application performance — however, other functions, such as patch management, are necessary to proactively prevent service failures. Below are a few practical considerations when delving into managing application performance.
Measuring Application Performance — What Should You Care About?
Before you start to monitor anything, you need to understand what the application’s end users expect. This will help you focus on the metrics that really matter and prioritize the type of monitoring solution that is required.
For instance, is up/down monitoring adequate? Is an agentless solution sufficient? Or is something more robust needed to collect log files and so on? It’s your duty to weigh the needs of the business (i.e. what’s the impact if monitoring is not in place?) against the cost of the monitoring solution.
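To make the simplest option concrete, here is a minimal sketch of an agentless up/down check: it only tests whether a TCP port accepts connections. The function name `is_up` and the default timeout are assumptions made for illustration, not part of any particular monitoring product.

```python
import socket


def is_up(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This is the most basic kind of up/down check. It says nothing about
    response times, errors inside the application, or log contents.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

If a check like this answers every question the business cares about, a heavier solution may not be worth its cost. If it does not, that gap is a reasonable starting point for deciding what else to collect.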
Having the end-user conversation will also help you understand the resource requirements for an application. Oftentimes, applications are deployed with more resources than are actually needed to meet performance objectives.
Time to Measure and Monitor — How Do You Know Application Performance is Out of Whack?
Let’s first answer this question by understanding some of the things that can go wrong:
Resources are constrained. This could happen because there is an influx of demand on the application (more users/customers). Some apps simply use more memory the longer they run. Processes can get out of control. Resource constraints can also occur if resources are shared between applications (e.g. in a virtual environment where too many VMs run on the same server or compete for the same SAN capacity).
Services stop. This can be caused by a fatal exception, etc. These things happen unexpectedly, so it’s good to have monitoring in place to alert you when a service has stopped so you can restart it immediately.
Hardware fails. Power supplies go kaput, fans break, temperature spikes, and hard drives fail. These hardware failures can and do happen, so you need advance warning to find them and fix them quickly.
Someone changed something and it broke. Oftentimes, configuration changes can lead to performance problems. Did the Web team update the site? Was there a software update outside of a change request? Keep these peripheral factors in mind.
You’ve been hacked. According to a recent study by Ponemon Institute, survey participants experienced almost two cyber-attacks per week, many of which are DDoS attacks, as witnessed recently by Brian Krebs’ website.
Software requires updating. Most often, software needs to be updated due to vulnerabilities; however, many updates fix functional bugs. As the Outlook.com example above shows, updates can cause service outages if they are not applied promptly and correctly.
From step 1, you have an idea of where to focus your effort and how much of it to spend. Taking it to the next step is a little tricky. For example, suppose your application owner needs the application to be available Monday through Friday between 8 a.m. and 5 p.m., expects no more than 1,000 users at once, and expects users to be able to process a transaction in three minutes.
With this information, you know critical alerts should fire during these business hours, it’s acceptable to perform software/firmware updates on the weekends or in the evening, and you have a baseline of acceptable performance for the end user.
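The example requirements above can be written down as a small policy. The sketch below is only an illustration: the class and function names are hypothetical, and downgrading off-hours breaches to a "warning" is an assumption, since the example only says that critical alerts belong in business hours.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class ServiceExpectation:
    """The application owner's stated expectations from the example."""
    open_at: time = time(8, 0)             # 8 a.m.
    close_at: time = time(17, 0)           # 5 p.m.
    max_concurrent_users: int = 1000
    max_transaction_seconds: float = 180.0  # three minutes


def in_business_hours(when: datetime, exp: ServiceExpectation) -> bool:
    # datetime.weekday(): Monday is 0 and Sunday is 6.
    return when.weekday() < 5 and exp.open_at <= when.time() < exp.close_at


def alert_severity(when: datetime, users: int, transaction_seconds: float,
                   exp: ServiceExpectation = ServiceExpectation()) -> str:
    """Classify a measurement against the baseline."""
    breached = (users > exp.max_concurrent_users
                or transaction_seconds > exp.max_transaction_seconds)
    if not breached:
        return "ok"
    # Assumption: breaches outside business hours are only warnings.
    return "critical" if in_business_hours(when, exp) else "warning"


def maintenance_allowed(when: datetime,
                        exp: ServiceExpectation = ServiceExpectation()) -> bool:
    """Updates are acceptable on evenings and weekends."""
    return not in_business_hours(when, exp)
```

For instance, 1,200 concurrent users at 10 a.m. on a Monday would be classified as critical, while the same load on a Saturday would not.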
This application consists of several different components, including a Web server, application server, database and underlying hardware, storage, and networking elements. The SysAdmin is a jack of all trades who knows a little about a lot. What does it mean to monitor the SQL database? How does the SysAdmin monitor slow queries or table locks? What is a good value or a bad value? What should the threshold be?
Luckily, there are tools that can automate a lot of the guessing and manual reporting when it comes to application performance. Tools these days should provide intelligence about what should be monitored, historical data for benchmarks and troubleshooting, and the ability to get to the necessary details quickly.
What to Look for in Tools that Help Manage Application Performance
Application and server monitoring tools should be able to monitor across multiple components of the application, including server hardware, virtual machines, processes, services and performance metrics specific to a particular application. Tools should also provide thresholds based on best practices that can be adjusted as needed using historical insight.
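One way to picture a threshold that starts from a default and is later adjusted from history is sketched below. This is an assumption about how such tuning could work, not a description of any specific tool: it keeps the default until enough samples exist, then uses the mean plus `k` standard deviations. The function name, the value of `k`, and the minimum sample count are all hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def suggest_threshold(history: Sequence[float], default: float,
                      k: float = 3.0, min_samples: int = 30) -> float:
    """Suggest an alert threshold for a metric such as query time in ms.

    With too little history, fall back to the best-practice default.
    Otherwise flag values more than k sample standard deviations above
    the historical mean.
    """
    if len(history) < min_samples:
        return default
    return mean(history) + k * stdev(history)
```

A rule like this suits metrics that are fairly stable. Metrics with strong daily or weekly cycles would likely need a baseline per time window instead.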
Patch management tools should provide information on which systems are out of compliance, be able to patch systems at discrete times, and inform IT when patches fail.
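As a rough illustration of the reporting side of that requirement, the sketch below summarizes which systems are out of compliance and where patches failed. The `PatchStatus` record and its fields are hypothetical, and real tools would gather this data from the systems themselves.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PatchStatus:
    """Hypothetical per-host record of patch state."""
    host: str
    missing: List[str] = field(default_factory=list)  # patches not yet applied
    failed: List[str] = field(default_factory=list)   # patches that failed to apply


def compliance_report(statuses: List[PatchStatus]) -> Dict[str, object]:
    """Summarize what IT needs to know after a patch run."""
    out_of_compliance = sorted(s.host for s in statuses if s.missing)
    failed = {s.host: s.failed for s in statuses if s.failed}
    return {"out_of_compliance": out_of_compliance, "failed": failed}
```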
Configuration change management tools should identify and repair unauthorized configuration changes.
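The identification half of that requirement can be sketched with file fingerprints: record a hash of each configuration file when it is in a known-good state, then compare later. This is a simplified assumption about one possible approach. The function names are hypothetical, and the sketch does not attempt the repair step or decide whether a change was authorized.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def fingerprint(path: Path) -> str:
    """SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(paths: Iterable[str]) -> Dict[str, str]:
    """Record a baseline of known-good configuration files."""
    return {str(p): fingerprint(Path(p)) for p in paths}


def detect_drift(baseline: Dict[str, str]) -> List[str]:
    """Return the files that were changed or removed since the baseline."""
    changed = []
    for path, digest in baseline.items():
        p = Path(path)
        if not p.exists() or fingerprint(p) != digest:
            changed.append(path)
    return sorted(changed)
```

Any file that `detect_drift` reports can then be checked against approved change requests to see whether the change was expected.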
The time and cost associated with implementing APM tools should certainly be outweighed by the cost of application degradation or outage and the IT labor costs of manually finding and fixing the problem.
ABOUT Jennifer Kuvlesky
Jennifer Kuvlesky is a Product Marketing Manager for SolarWinds, specializing in systems management. She has made her home in Austin, the high-tech capital of Texas, for more than 15 years, specializing in product management, strategy and marketing with solid knowledge of the systems, application, and virtualization management market segments. Connect with Jennifer Kuvlesky on Twitter @jenniferkuvlesk.