
The Importance of Network Observability for Tech Companies

Nadeem Zahid
cPacket Networks

Tech companies tend to be the earliest adopters of IT and digital transformation trends, for obvious reasons. These companies have already embraced a cloud-first mentality, and are well into migrating business-critical workloads to the cloud. However, that "tip of the spear" position in cloud adoption puts these companies at considerable risk of losing visibility into application workloads, leaving them struggling to detect performance issues and potential threats.

The challenge is that cloud monitoring and visibility are hard, especially in public clouds, which tend to be a black box when it comes to observability. Balancing enthusiastic cloud adoption against consistent and complete visibility is crucial for big tech to get right, for two reasons.

First, the heavy reliance on SaaS-based apps (both as a product offering and for internal usage) and cloud data means that IT teams must maintain network performance and troubleshoot rapidly in hybrid cloud environments. A few seconds (or even milliseconds) of added latency can lead to frustrated employees and customers.

Second, tech companies are prime targets for attackers. The financial and reputational damage of a security breach, especially for high-value targets such as large fintech companies, can easily ruin a company's image and operation. Security teams need both a real-time, reliable feed of packet data for their NDR and firewall tools, and a store of packet data going back weeks for forensic investigations.

Building the visibility infrastructure to make these cloud networks observable is a complex technical challenge. But with careful planning and a few strategic decisions, it's possible to appropriately design, set up and manage visibility solutions for the cloud.

Observability Challenges for Security and NPM

One of the key mandates for IT teams is ensuring consistent performance, which makes network performance monitoring (NPM) a high priority. If there's a problem, IT needs to be able to trace it quickly to a specific application, and then on to specific nodes or parts of the public/private cloud infrastructure, in order to solve it.

If the cloud provider is at fault, then IT will need detailed packet data to prove an SLA is being violated. Without that data, troubleshooting can quickly devolve into useless finger-pointing. Yet if a cloud provider's built-in traffic mirroring is turned on only after a problem appears, setting it up and then investigating the performance issue can take weeks. To be useful, visibility must be in place before the issue arises.

Unfortunately, you can't just flip a switch to get access to packet data through traffic mirroring. In particular, managing the "fire hose" of cloud data in real time for these mirroring scenarios is technically challenging.

Security is the other side of the observability coin and (at the risk of stretching the metaphor to breaking) it has two sides. The first is getting access to real-time packet data; this is similar to the performance monitoring challenge above, but with unique nuances. The second issue is the ability to save packets for forensic investigation.

For security purposes, real-time packet data feeds must go to security tools like NDR and firewalls. It is crucial that none of these packets are missed, which makes an inline packet solution ideal for cloud environments. That said, security tools can often only ingest packets at 10G speeds, so faster connections will require a packet broker that can handle both 10G and 40/100G traffic. In terms of the packets themselves, traffic that traverses environments, either between an application and the open internet or between the data center and the cloud, is often of particular interest to security teams, as these boundaries are likely entry points for an intruder. Unfortunately, this traffic can be particularly difficult to monitor.
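To make the speed mismatch concrete, here is a minimal sketch of the two calculations a broker has to get right when a fast link feeds slower tools: how many tool ports are needed, and which port each packet goes to. It is an illustration under stated assumptions, not any vendor's implementation. The function names, the peak_utilization parameter and the CRC32-based flow hash are all hypothetical choices made for this example.

```python
import math
import zlib


def tools_needed(link_gbps: float, tool_gbps: float, peak_utilization: float = 1.0) -> int:
    """Minimum number of tool ports needed to absorb the link's peak load.

    peak_utilization is an assumed fraction (0.0 to 1.0) of the link rate
    that is actually in use at peak.
    """
    return math.ceil(link_gbps * peak_utilization / tool_gbps)


def pick_tool(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
              protocol: str, n_tools: int) -> int:
    """Choose a tool port for a packet from a hash of its flow.

    The two endpoints are sorted before hashing so that both directions of
    a conversation land on the same tool, which lets that tool see the
    whole session.
    """
    endpoints = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{endpoints[0]}|{endpoints[1]}|{protocol}".encode()
    return zlib.crc32(key) % n_tools
```

For example, tools_needed(100, 10) says that a fully loaded 100G link needs ten 10G tool ports, while the same call with peak_utilization=0.5 on a 40G link needs two. Hashing on the flow rather than spraying packets round-robin is a design choice: it keeps each session intact on one tool, at the cost of uneven load when a few flows dominate the link.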

For forensic analysis, security team investigations will require packet data that covers days or weeks of traffic between critical nodes. This means observability plans need to cover not just packet access, but capture and storage as well.
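A rough sizing calculation shows why storage has to be part of the plan. The sketch below is a back-of-the-envelope estimate, not a capacity-planning tool: the function name and the capture_fraction parameter are hypothetical, and it ignores per-packet capture overhead, indexing and compression, all of which change the real number.

```python
def capture_storage_tb(avg_gbps: float, days: float, capture_fraction: float = 1.0) -> float:
    """Estimate the storage, in decimal terabytes, needed to retain packets.

    avg_gbps is the average (not peak) traffic rate on the monitored links.
    capture_fraction is the assumed share of traffic that is kept after
    filtering or packet slicing (1.0 means every byte is stored).
    """
    bytes_per_second = avg_gbps * 1e9 / 8
    seconds = days * 24 * 60 * 60
    return bytes_per_second * seconds * capture_fraction / 1e12
```

Under these assumptions, a link averaging 1 Gbps captured in full for seven days works out to roughly 75.6 TB, and retaining only a quarter of that traffic brings it to about 18.9 TB. Numbers of this size are why filtering at the broker, before packets reach the capture appliance, matters for multi-week retention.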

When setting up the monitoring infrastructure, several factors must also be weighed. At a basic level, the brokers, taps, capture devices, etc. all take up valuable rack space; consolidation, density and adequate topology planning are all critical. If data that's being monitored is sensitive or subject to privacy regulations, access to the visibility system and data must be controlled. The monitoring itself also creates a technical load on the network that must be accounted for (you don't want the monitoring itself to be the cause of performance issues).

Bridging the Visibility Gap

The appropriate monitoring infrastructure should be built around a subnet composed of a load balancer, virtual packet broker and storage appliance, with equipment placed throughout the network at key points. One strategy to conserve space, save money and maximize resources is to use brokers as the "power strip" that distributes packets to firewalls and other security or NPM tools at the correct speeds. The subnet can further connect packet capture and storage to forensic tools for investigation, and feed NPM tools with real-time data to quickly triangulate network issues like latency, allowing IT teams to determine fault and, if necessary, negotiate with the cloud provider.

As mentioned, access to packet data in the public cloud is particularly difficult. The hyperscale providers all recognize the problems this lack of visibility causes, and each has taken a different path to solving it. AWS and GCP use similar mirroring approaches (VPC Traffic Mirroring on AWS, Packet Mirroring on GCP). In basic terms, this traffic/packet mirroring duplicates network traffic to and from the client's applications and forwards it to cloud-native performance and security monitoring tool sets for assessment, and to capture devices for later analysis. This eliminates the need to deploy ad-hoc forwarding agents or sensors in each VPC instance for every monitoring tool. The raw data itself is not ready for analysis, and requires a virtual or cloud packet broker to ensure the right data gets to the right monitoring or security tools. That said, combining these mirroring options with virtual packet brokers can ultimately reduce cost, as a single stream only has to be mirrored once for the broker (as opposed to once per NPM or security tool).
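The "mirror once, distribute many" idea can be sketched in a few lines. This is a simplified model for illustration only: it does not use or represent the AWS or GCP mirroring APIs, and the Rule fields, tool names and function names are all hypothetical. It counts mirrored copies with and without a broker, then shows the kind of match-and-forward rule a broker applies so that each tool receives only the traffic it needs.

```python
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    """One broker forwarding rule; a field left as None matches anything."""
    tool: str
    protocol: Optional[str] = None
    dst_port: Optional[int] = None


def mirror_copies(n_sources: int, n_tools: int, via_broker: bool) -> int:
    """Count the mirrored streams leaving the monitored sources.

    With a broker, each source is mirrored once and the broker fans out.
    Without one, each source is mirrored separately for every tool.
    """
    return n_sources if via_broker else n_sources * n_tools


def route(packet: dict, rules: List[Rule]) -> List[str]:
    """Return the tools that should receive this packet, in rule order."""
    return [
        rule.tool
        for rule in rules
        if (rule.protocol is None or rule.protocol == packet["protocol"])
        and (rule.dst_port is None or rule.dst_port == packet["dst_port"])
    ]
```

With 20 monitored sources and three tools, mirror_copies returns 20 with a broker and 60 without one. A rule set such as [Rule("ndr"), Rule("npm", "tcp", 443), Rule("dns-analytics", "udp", 53)] would send every packet to the NDR tool, only TCP port 443 traffic to the NPM tool, and only UDP port 53 traffic to the DNS analytics tool.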

Solving the visibility challenge with Azure is different, and requires using what's known as "inline mode" on certain virtual packet brokers. In this mode, the packet broker itself monitors subnet ingress and egress traffic so that it can capture, pre-process, and deliver packet data in real time to security, performance management, analytics, capture and other solutions.

Developing this visibility topology is complex; many companies may not have the necessary in-house staff to handle it, and may need to work with service providers or vendors on the design and set-up. But whether handled in-house or outsourced, keep tool and infrastructure sprawl in mind: a mixture of virtual and physical devices can save rack space in data centers, and leveraging the cloud for a consolidated management view of all packet broker and capture solutions can save considerable time.

Tech companies often take the slings and arrows that come with early adoption. But paying careful attention to visibility and monitoring allows organizations to better weather these issues by staying on top of threats and ensuring the network is operating according to plan.

Nadeem Zahid is VP of Product Management & Marketing at cPacket Networks

