100G is Increasingly Popular, and It's Creating a Host of Management Challenges
November 02, 2020

Nadeem Zahid
cPacket Networks

Name virtually any technology trend — digital transformation, cloud-first operations, datacenter consolidation, mobility, streaming data, AI/ML, the application explosion, etc. — they all have one thing in common: an insatiable need for higher bandwidth (and often, low latency). The result is a steady push to move 10Gbps and 25Gbps network infrastructure toward the edge, and increasing adoption of 100Gbps in enterprise core, datacenter and service provider networks.

Initial deployments focused on backbone interconnects (historically a dual-ring failover topology; more recently mesh connectivity), primarily driven by north-south traffic. Data center adoption has followed, generally in a spine-leaf architecture to handle increases in east-west traffic.

Beyond a hunger for bandwidth, 100G is having a moment for several reasons: a commodity-derived drop in cost, increasing availability of 100G-enabled components, and the derivative ability to easily break 100G into 10/25G line rates. In light of these trends, analyst firm Dell'Oro expects 100G adoption to hit its stride this year and remain strong over the next five years.

Nobody in their right mind disputes the notion that enterprises and service providers will continue to adopt ever-faster networks. However, the same thing that makes 100G desirable — speed — conspires to create a host of challenges when trying to manage and monitor the infrastructure. The simple truth is that the faster the network, the more quickly things can go wrong. That makes monitoring for things like regulatory compliance, load balancing, incident response/forensics, capacity planning, etc., more important than ever.

At 10G, a minimum-size Ethernet frame (counting preamble and inter-frame gap) occupies the wire for about 67 nanoseconds; at 100G the packet rate increases tenfold, with the same frame flying by in about 6.7 nanoseconds. And therein lies the problem: when it comes to 100G, traditional management and monitoring infrastructure can't keep up.
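The 67 ns and 6.7 ns figures can be reproduced with a little arithmetic. The sketch below assumes standard Ethernet framing: a 64-byte minimum frame plus 8 bytes of preamble and start-of-frame delimiter and a 12-byte inter-frame gap. The function names are illustrative and not taken from any product.

```python
# Wire time for one Ethernet frame, including per-frame overhead.
# Assumption: standard Ethernet framing (8-byte preamble/SFD, 12-byte gap).
PREAMBLE_SFD_BYTES = 8     # preamble + start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12  # minimum idle gap between frames, in byte times


def wire_time_ns(frame_bytes: int, link_gbps: float) -> float:
    """Nanoseconds one frame occupies the link, overhead included."""
    bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return bits / link_gbps  # bits / (Gbit/s) gives nanoseconds


def max_packets_per_second(frame_bytes: int, link_gbps: float) -> float:
    """Worst-case packet rate when the link carries back-to-back frames."""
    return 1e9 / wire_time_ns(frame_bytes, link_gbps)


if __name__ == "__main__":
    for gbps in (10, 25, 100):
        ns = wire_time_ns(64, gbps)
        mpps = max_packets_per_second(64, gbps) / 1e6
        print(f"{gbps:>3}G: {ns:6.2f} ns per 64-byte frame, {mpps:7.2f} Mpps")
```

At 100G this works out to roughly 149 million minimum-size packets per second on a single port, which is the worst case a monitoring device has to be sized for. Real traffic mixes larger frames, so typical packet rates are lower.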

The line-rate requirement varies based on where infrastructure sits in the monitoring stack. Network TAPs must be capable of mirroring data at 100G line speeds to packet brokers and tools. Packet brokers must handle that 100G traffic simultaneously on multiple ports, and process and forward each packet at line rate to the tool rail. Capture devices need to sustain 100G bursts in the capture-to-disk process. And any analysis layer must ingest information at 100G speeds to allow correlation, analysis and visualization.
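To put the capture-to-disk requirement in perspective, here is a rough sizing sketch. It assumes decimal units (1 TB = 1000 GB), treats the link rate as stored payload, and ignores capture-file metadata and any compression or slicing; the function names and the utilization parameter are illustrative.

```python
# Rough sizing of capture-to-disk write rate and storage for one link.
# Assumptions: decimal units, no capture metadata, no slicing or compression.

def capture_rate_gbytes_per_s(link_gbps: float, utilization: float = 1.0) -> float:
    """Sustained write rate in GB/s needed to record the link losslessly."""
    return link_gbps * utilization / 8  # 8 bits per byte


def storage_tb(link_gbps: float, seconds: float, utilization: float = 1.0) -> float:
    """Storage in TB consumed by recording the link for `seconds`."""
    return capture_rate_gbytes_per_s(link_gbps, utilization) * seconds / 1000


if __name__ == "__main__":
    print(capture_rate_gbytes_per_s(100))        # write rate for a full 100G burst
    print(storage_tb(100, 3600))                 # one hour at full line rate
    print(storage_tb(100, 3600, utilization=0.3))  # one hour at 30% utilization
```

A full-rate 100G burst needs about 12.5 GB/s of sustained write bandwidth, and an hour at that rate fills roughly 45 TB, which is why capture devices are usually sized for bursts rather than continuous line rate.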

Complicating matters are various "smart" features, each of which demands additional processing resources. As an example, packet brokers might include filtering, slicing and deduplication capabilities. If the system is already struggling with the line rate, any increased processing load degrades performance further.

For any infrastructure not designed with 100G in mind, the failure mode is inevitably the same: lost or dropped packets. That, in turn, results in network blind spots. When visibility is the goal, blind spots are — at the risk of oversimplification — bad. The impact can be incorrect calculations, slower time-to-resolution or incident response, longer malware dwell time, greater application performance fluctuation, compliance or SLA challenges and more.

Lossless monitoring requires that every part of the visibility stack is designed around 100G line speeds. Packet brokers in particular, given their central role in visibility infrastructure, are a critical chokepoint. Where possible, a two-tier monitoring architecture is recommended, with a high-density 10/25/100G aggregation layer to aggregate TAPs and tools, and a high-performance 100G core packet broker to process and service the packets. Upgrading existing equipment is possible, but it adds cost and may still fall short of true 100G line rate when smart features share centralized processing at the core. Newer systems with a distributed/dedicated per-port processing architecture (versus shared central processing) are specifically designed to accommodate 100G line rates and eliminate these bottlenecks.
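One simple check when planning a two-tier design is the oversubscription ratio between the aggregation layer and the core. The sketch below is a back-of-the-envelope calculation, not a vendor sizing method: the port counts are hypothetical, and each entry represents one monitored traffic stream at its nominal rate.

```python
# Oversubscription check for an aggregation layer feeding a 100G core.
# Assumption: every monitored link can burst to its nominal rate at once.
from typing import Sequence


def oversubscription(tap_links_gbps: Sequence[float],
                     uplinks_gbps: Sequence[float]) -> float:
    """Ratio of worst-case monitored traffic to uplink capacity."""
    return sum(tap_links_gbps) / sum(uplinks_gbps)


if __name__ == "__main__":
    # Hypothetical aggregation node: 24 x 10G and 8 x 25G monitored links
    # forwarded to the core packet broker over 4 x 100G uplinks.
    taps = [10] * 24 + [25] * 8
    uplinks = [100] * 4
    ratio = oversubscription(taps, uplinks)
    print(f"oversubscription ratio: {ratio:.2f}")
    if ratio > 1.0:
        print("worst-case bursts exceed uplink capacity; packets may drop")
```

A ratio above 1.0 does not guarantee loss, since links rarely peak together, but it marks a point where dropped packets, and therefore blind spots, become possible.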

The overarching point is that desire for 100G performance cannot override the need for 100G visibility, or the entire network can suffer as a result. The visibility infrastructure needs to match the forwarding infrastructure. While 100G line rates are certainly possible with the latest monitoring equipment and software, IT teams must not assume that existing network visibility systems can keep up with the new load.

Nadeem Zahid is VP of Product Management & Marketing at cPacket Networks