This is Part 2 of a blog series on how to find the root cause of the most common application experience issues.
Start with: Assuring Exceptional Experiences with Applications Requires Assuring Network Performance - Part 1
Responsiveness Issues
This type of issue is often reported as "the application is too slow." When the network is responsible for unacceptable responsiveness, a likely root cause is an overloaded network (i.e., the capacity of the network is insufficient to handle the current traffic). If a network is overloaded, its DNS server may also be overloaded and respond very slowly or not at all. Traffic bursts, especially microbursts, observed with detailed metrics are another indicator of an overloaded network and a cause of irregular latencies. If any of these is the root cause, then traffic must be shaped accordingly and/or capacity must be added.
When resolving these issues, IT teams analyze network, application, and protocol latency using observed metrics such as DNS and HTTP latency, one-way latency, round-trip time, and Zero-Window activity. Additional observed behaviors and metrics reveal which specific problem is the culprit. These metrics include throughput measured in gigabits per second (Gbps), the number of connections per second, and the number of concurrent connections. Network packet and flow data provide the insights and context to identify the root cause. Packet data captured with high fidelity using high-performance monitoring can detect and characterize traffic bursts and the number of connections per second. Flow data reveals top talkers and the number of packets transmitted per second.
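To illustrate why high-fidelity packet data matters for spotting microbursts, here is a minimal Python sketch. It assumes packets are already available as (timestamp in microseconds, size in bytes) pairs; the function name find_microbursts, the 1 ms window, and the 80% utilization threshold are illustrative assumptions, not values from any particular monitoring tool.

```python
from collections import defaultdict

def find_microbursts(packets, link_bps, window_us=1000, threshold=0.8):
    """Return (window_start_us, utilization) for each window at or above threshold.

    packets:   iterable of (timestamp_us, size_bytes) tuples (assumed input format)
    link_bps:  link capacity in bits per second
    window_us: measurement window in microseconds (1000 us = 1 ms, an assumption)
    threshold: fraction of link capacity that counts as a burst (an assumption)
    """
    # Sum the bits seen in each fixed-size time window.
    bits_per_window = defaultdict(int)
    for timestamp_us, size_bytes in packets:
        bits_per_window[timestamp_us // window_us] += size_bytes * 8

    # Bits the link can carry in one window.
    capacity_bits = link_bps * window_us / 1_000_000

    bursts = []
    for index in sorted(bits_per_window):
        utilization = bits_per_window[index] / capacity_bits
        if utilization >= threshold:
            bursts.append((index * window_us, utilization))
    return bursts

# 100 packets of 1250 bytes inside the first millisecond, then one stray packet.
packets = [(i * 10, 1250) for i in range(100)] + [(5000, 1250)]
bursts = find_microbursts(packets, link_bps=1_000_000_000)
```

In this example the first millisecond fills a 1 Gbps link, yet the same traffic averaged over a full second would be roughly 0.1% utilization, which is why coarse per-second counters tend to miss microbursts.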
Streaming Issues
Communications and streaming applications that use Voice over IP (VoIP), videoconferencing, and other streaming services are increasingly in use for entertainment, education and collaboration, especially in the COVID-19 era. Experiences with these applications are directly impacted by network performance.
Choppy and freezing video, unsynchronized audio and video, audio dropout, and other noticeable types of distortion are the typical issues that result in unsatisfactory experiences. These issues result from streaming errors and packet loss, and users readily notice them, complain about them, and report them to IT and customer support help desks.
To diagnose the root causes and assure exceptional streaming experiences, IT needs to monitor and observe jitter, sequence errors, retransmissions, and Maximum Transmission Unit (MTU) fragmentation. Excessive jitter and sequence errors result from various streaming errors, while retransmissions and fragmentation point to packet loss as the culprit. It is necessary to dig further to determine whether these problems are caused by routing problems or MTU fragmentation. High MTU values mean that larger packets are transmitted, which take relatively longer to process and retransmit and hence inhibit a smooth flow of digitized voice and video streams.
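As a rough illustration of two of these metrics, the Python sketch below computes a smoothed interarrival jitter using the estimator described in RFC 3550 (the RTP specification) and counts sequence errors. The function names and input formats are assumptions made for this example, and sequence-number wraparound is deliberately ignored to keep the sketch short.

```python
def interarrival_jitter(packets):
    """Smoothed interarrival jitter, following the RFC 3550 estimator.

    packets: list of (send_ts, recv_ts) pairs in arrival order, both in the
             same time unit (assumed input format). Only differences in
             transit time are used, so a constant clock offset cancels out.
    """
    jitter = 0.0
    previous_transit = None
    for send_ts, recv_ts in packets:
        transit = recv_ts - send_ts
        if previous_transit is not None:
            difference = abs(transit - previous_transit)
            # Move the estimate 1/16 of the way toward the new sample.
            jitter += (difference - jitter) / 16.0
        previous_transit = transit
    return jitter

def sequence_errors(sequence_numbers):
    """Return (missing, out_of_order) counts for sequence numbers in arrival order.

    A late packet is counted as missing when it is skipped and again as
    out-of-order when it finally arrives. Wraparound is not handled.
    """
    missing = 0
    out_of_order = 0
    expected = None
    for seq in sequence_numbers:
        if expected is None or seq == expected:
            expected = seq + 1
        elif seq > expected:
            missing += seq - expected   # gap: packets skipped so far
            expected = seq + 1
        else:
            out_of_order += 1           # arrived after a later packet
    return missing, out_of_order

steady = interarrival_jitter([(0, 10), (20, 30), (40, 50)])
uneven = interarrival_jitter([(0, 10), (20, 46)])
errors = sequence_errors([1, 2, 4, 3, 5])
```

A stream with constant transit time yields zero jitter, while a rising jitter value or a growing count of gaps suggests digging into routing or fragmentation as described above.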
Other Performance Issues
Applications that rely on streaming services, such as high-frequency trading and high-performance computing, increasingly require higher throughput, which is driving the use of 100 Gbps connectivity. Timing tolerances, latencies, and all other performance metrics become finer as data rates increase. This necessitates higher-fidelity monitoring to provide the visibility and observability needed to ensure the best possible SLEs and MTTR. As an example, detecting gaps in high-frequency trading streams requires observing microbursts and latencies with sub-millisecond resolution. Therefore, it is essential to have a clearly defined SLE, especially for high-performance applications and the underlying infrastructure, and then to match to it the metrics to observe and the tools and resolution needed to do so.
Experiences impact organizations in many ways, which is why delivering exceptional experiences is a critical success factor. Experiences with applications depend on network performance. As a result, effectively and efficiently assuring experiences requires visibility and observability into both network and application behaviors and metrics. Network Performance Management and Diagnostics driven by monitoring is therefore a necessary complement to Application Performance Management in all environments.