In the world of Application Performance Management (APM), it is always better to enlist more than one entity to help solve the mystery of performance problems.
Diagnosing them without help is like arriving at a crime scene on foreign soil, being blindfolded, shoved out the door, and then asked to solve the case without any insight. All you can do is begin by asking people in the vicinity, provided you speak their language, what they have seen (i.e., the end-user experience).
Gathering the facts of a crime is essential, and it can be likened to using an APM solution to solve application performance problems. The more information you can obtain about an application’s behavior, and the better you understand its idiosyncrasies within the environment, the more likely you are to pinpoint the root causes of performance issues.
The Three People You Need
Wouldn't it be helpful if there were an eyewitness you could interview, a watchman who was on duty at the time of the incident, and an agent you could hire to translate the native tongue and provide insight into the culture?
In much the same way, a smart APM strategy enlists the help of these three entities: the Witness, the Watchman, and the Agent. You start by listening to the testimony of the eyewitness (a.k.a. wire data), collecting the observations of the watchman (a.k.a. web robots), and analyzing the details from the agent (a.k.a. code-level instrumentation).
Passive monitoring - wire-data analytics
The Witness reports what she sees within her field of vision (a.k.a. passive monitoring or wire-data analytics). She watches everything in her purview and sees things as they happen, which corresponds to what is coming across "the wire" in front of her.
The Witness will tell you how many people were involved, whether anyone was injured, and what time the event occurred (e.g., user names, packet loss, timelines). She can tell you which doors people went through, how wide the aisles were, and how fast people were traveling (e.g., listening network ports, realized bandwidth, round-trip time).
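To make the Witness's testimony concrete, here is a minimal Python sketch that rolls passively observed flow records up into the measurements listed above. The `FlowRecord` schema and the `summarize` function are hypothetical names invented for this illustration, not the output format of any particular wire-data product, and retransmissions are used only as a rough stand-in for packet loss.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class FlowRecord:
    """One observed conversation, as a passive probe might summarize it (hypothetical schema)."""
    user: str
    server_port: int
    start_s: float          # capture timestamp of the first packet, in seconds
    end_s: float            # capture timestamp of the last packet, in seconds
    bytes_transferred: int
    packets: int
    retransmissions: int    # assumption: a rough proxy for packet loss
    rtt_ms: float           # round-trip time observed on the wire


def summarize(flows):
    """Roll per-flow observations up into the Witness's testimony."""
    if not flows:
        return None
    total_packets = sum(f.packets for f in flows)
    total_seconds = sum(f.end_s - f.start_s for f in flows)
    total_retx = sum(f.retransmissions for f in flows)
    total_bytes = sum(f.bytes_transferred for f in flows)
    return {
        "users": sorted({f.user for f in flows}),          # who was involved
        "ports": sorted({f.server_port for f in flows}),   # which doors were used
        "first_seen_s": min(f.start_s for f in flows),     # timeline start
        "last_seen_s": max(f.end_s for f in flows),        # timeline end
        "retransmit_pct": 100.0 * total_retx / total_packets if total_packets else 0.0,
        "avg_rtt_ms": mean(f.rtt_ms for f in flows),
        # Bits moved divided by the summed duration of all flows.
        "avg_flow_throughput_bps": 8 * total_bytes / total_seconds if total_seconds else 0.0,
    }
```

The point of the sketch is that nothing here touches the application itself: every field is derived from traffic that was already crossing the wire.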
Active monitoring - synthetic transactions
The Watchman (a.k.a. web robot) is always on patrol, actively checking and methodically taking the same path every time. He will tell you which doors are locked and monitor the ones that are open, measuring along the way how long it takes to complete his rounds (i.e., synthetic transactions).
The Watchman will report the status of the rooms and buildings on his patrol and will note if anything happens to him along the way (e.g., application availability, transaction errors, timeouts).
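The Watchman's patrol can be sketched in a few lines of Python: a fixed, ordered list of steps is executed the same way every time, and each step is timed and classified as ok, timeout, or error. The names `http_check`, `run_patrol`, and `is_available` are hypothetical helpers written for this illustration; a real synthetic-monitoring product would add scheduling, multiple locations, and alerting.

```python
import time
import urllib.request


def http_check(url, timeout_s=5.0):
    """Build a patrol step that fetches `url` and fails on a non-2xx status."""
    def step():
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            if not 200 <= response.status < 300:
                raise RuntimeError(f"unexpected status {response.status}")
    return step


def run_patrol(steps, clock=time.perf_counter):
    """Walk the same ordered (name, callable) steps every time and time each one."""
    report = []
    for name, step in steps:
        started = clock()
        try:
            step()
            status = "ok"
        except TimeoutError:
            # Note: urllib can wrap connection timeouts in URLError,
            # which this sketch reports as "error" rather than "timeout".
            status = "timeout"
        except Exception:
            status = "error"
        report.append({"step": name, "status": status, "seconds": clock() - started})
    return report


def is_available(report):
    """The application counts as available only if every step succeeded."""
    return all(entry["status"] == "ok" for entry in report)
```

A patrol might be defined as `run_patrol([("home", http_check(home_url)), ("login", http_check(login_url))])`, where `home_url` and `login_url` are placeholders for the pages you care about. Because the path never varies, a change in the timings points to a change in the application rather than in the user.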
Application code instrumentation
The Agent you hire is critical for solving the crime within the territory you're operating in. The Agent will watch activity from specific vantage points throughout the environment and report back his findings. It's crucial that he speaks the local language (e.g., Java, .NET, PHP) and can easily translate for you.
His approach will be to deploy probes on rooftops and inside the buildings to monitor all conversations and actions in the environment (a.k.a. application code instrumentation). He will also tap the communication systems (i.e., script injection) when appropriate, capturing and recording specific measurements from each conversation.
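As a rough illustration of what one of the Agent's probes does, the Python sketch below wraps a function so that every call records its duration and any exception it raised. Commercial agents typically instrument code automatically at load time rather than through hand-placed decorators, so treat this as a simplified stand-in for the idea; `instrument`, `MEASUREMENTS`, and `lookup_order` are hypothetical names used only for this example.

```python
import functools
import time
from collections import defaultdict

# In-memory store for the sketch: function name -> list of (seconds, error type or None).
MEASUREMENTS = defaultdict(list)


def instrument(func):
    """Wrap `func` so every call records its duration and any exception."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        error = None
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            error = type(exc).__name__
            raise  # the probe observes the failure but never hides it
        finally:
            elapsed = time.perf_counter() - started
            MEASUREMENTS[func.__qualname__].append((elapsed, error))
    return wrapper


@instrument
def lookup_order(order_id):
    """Hypothetical business method standing in for real application code."""
    if order_id < 0:
        raise ValueError("order_id must be non-negative")
    return {"order_id": order_id, "status": "shipped"}
```

After a few calls, `MEASUREMENTS["lookup_order"]` holds one `(seconds, error)` pair per invocation, which is the kind of per-conversation record the Agent hands back for analysis.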
Going from Red to Green
Identifying an application that has gone catatonic is one thing, but assessing and fixing the insidiously slow performance of a complex multi-tiered application can be very time-consuming and costly. Enlisting all three entities described above is a thoughtful strategy for any IT leader to consider.
Based on the eyewitness testimony, the forensics collected, and the conversations recorded, you will be well on your way to providing an accurate account of what has transpired and why (i.e., root cause analysis).
Remember, the end user is the supreme judge in this case, and if performance is chronically slow, your sentence could be harsh. Users may punish you directly, by inundating you with complaints that create bad press, or indirectly, by abandoning your site in favor of one that is much faster and more intuitive to use.
Embracing a smart but simple APM methodology within your environment may be the only thing that exonerates you when the verdict for your slow application is "guilty as charged."