The challenge today for network operations (NetOps) is how to maintain and evolve the network while demand for network services continues to grow. Software-Defined Networking (SDN) promises to make the network more agile and adaptable. Various solutions exist, yet most are missing a layer to orchestrate new features and policies in a standardized, automated and replicable manner while providing sufficient customization to meet enterprise-level requirements.
NetOps often works with wide area networks (WANs) that are geographically dispersed, use a plethora of technologies from different service providers and are feeling the strain from increasing use of video and cloud application services. Hybrid WAN architectures with advanced application-level traffic routing are of particular interest. They combine the reliability of private lines for critical business applications with the cost-effectiveness of broadband/Internet connectivity for non-critical traffic.
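As a rough illustration of application-level routing in a hybrid WAN, the sketch below steers critical applications to a private line and everything else to broadband. The application names, the link names and the fallback rule (use any link that is up when the preferred kind is unavailable) are assumptions made for this example, not a description of any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    name: str
    kind: str        # "private" or "broadband"
    up: bool = True


# Hypothetical set of business-critical applications.
CRITICAL_APPS = {"voip", "erp"}


def select_link(app, links):
    """Pick a WAN link for an application, or None if no link is up."""
    preferred = "private" if app in CRITICAL_APPS else "broadband"
    candidates = [link for link in links if link.up]
    if not candidates:
        return None
    for link in candidates:
        if link.kind == preferred:
            return link
    # Assumed fallback: use whatever link is still up.
    return candidates[0]


links = [Link("mpls-1", "private"), Link("dsl-1", "broadband")]
voip_link = select_link("voip", links)
video_link = select_link("video-streaming", links)
```

With both links up, `voip_link` is the private line and `video_link` is the broadband link. Real deployments would typically also weigh measured latency, loss and cost.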
Here's the issue: many of the network management tools available today are insufficient to deploy such architectures at scale over the existing network. Most of them still apply blocks of configuration data to network devices to enable features that in turn enable an overall network policy. To accommodate differences in hardware and OS/firmware levels, these configuration scripts use "wildcards" that stand in for device-specific configuration data. The scripts are heavily tested, carefully curated and subject to stringent change management procedures. The tiniest mistake can bring a network down, resulting in potentially disastrous business losses.
NetOps teams are seeing first-hand how inadequate this approach is as they deploy hybrid WAN architectures and application-specific routing. Even if the existing hardware already supports all the required functionality, existing network configurations that reflect past user requirements are rarely well understood. Because each business unit asks for specific requirements to ensure that its applications run optimally on the network, networks need to be continuously updated and optimized. Such tasks range from a simple adjustment of configuration parameters to more complex changes of the underlying network architecture, such as removing and installing upgraded circuits, replacing hardware or even deploying new network architectures.
In these instances, organizations rely heavily on senior network architects to assess the risk of unintended consequences for the existing network, but waiting for the next maintenance window may no longer be an acceptable option. Businesses are not concerned with the details; they want the networks to simply "work."
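The wildcard approach can be pictured with a minimal sketch that uses Python's standard `string.Template`. The configuration lines, device names and values below are invented for illustration and are not tied to a specific vendor or tool; the point is only that one missing value is enough to break a rollout.

```python
from string import Template

# Hypothetical configuration block; the $-prefixed names are the "wildcards".
QOS_TEMPLATE = Template(
    "interface $wan_interface\n"
    " bandwidth $bandwidth_kbps\n"
    " service-policy output $policy_name\n"
)

# Hypothetical per-device values that differ by hardware and OS level.
DEVICES = {
    "branch-01": {
        "wan_interface": "GigabitEthernet0/0",
        "bandwidth_kbps": 50000,
        "policy_name": "WAN-EDGE",
    },
    # policy_name was forgotten for this device.
    "branch-02": {"wan_interface": "Serial0/0/0", "bandwidth_kbps": 1544},
}


def render_all(template, devices):
    """Render the template per device and collect devices with missing values."""
    rendered, errors = {}, {}
    for name, values in devices.items():
        try:
            # substitute() raises KeyError when a wildcard has no value.
            rendered[name] = template.substitute(values)
        except KeyError as exc:
            errors[name] = f"missing value for wildcard {exc}"
    return rendered, errors


rendered, errors = render_all(QOS_TEMPLATE, DEVICES)
```

Here `branch-01` renders cleanly while `branch-02` lands in `errors`. Note what this check does not cover: it says nothing about whether the rendered features are consistent with each other or with what is already running on the device.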
Moving Forward: the Ideal vs. the Real
What needs to happen in order for the network to simply work? Traditional network management tools are mature and well understood. Network architects and implementation teams are familiar with them, including all of their limitations and difficulties, and any proposed change of tooling is immediately weighed against the additional learning curve and the potential benefits in managing the network.
An ideal situation would be one in which the network policies are defined independently of implementation or operational concerns. It starts with mapping of the required functionality into a logical model, assembling these models into one overall network policy, verifying interdependencies and inconsistencies, and deploying and maintaining them consistently throughout the network life cycle.
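To make the idea of a logical model more concrete, here is a minimal sketch of features assembled into a policy and checked for interdependencies and inconsistencies before anything is deployed. The `Feature` class, the feature names and the `wan_mtu` parameter are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A logical network feature, described independently of device syntax."""
    name: str
    requires: set = field(default_factory=set)    # features this one depends on
    settings: dict = field(default_factory=dict)  # abstract parameters it sets


def check_policy(features):
    """Return a list of human-readable problems in the assembled policy."""
    problems = []
    names = {f.name for f in features}

    # Interdependencies: every required feature must be part of the policy.
    for f in features:
        for dep in sorted(f.requires - names):
            problems.append(f"{f.name} requires missing feature {dep}")

    # Inconsistencies: two features must not set one parameter differently.
    seen = {}
    for f in features:
        for key, value in f.settings.items():
            if key in seen and seen[key][1] != value:
                owner, other = seen[key]
                problems.append(
                    f"{f.name} and {owner} disagree on {key}: {value!r} vs {other!r}"
                )
            seen.setdefault(key, (f.name, value))
    return problems


policy = [
    Feature("app-routing", requires={"qos", "tunnel-overlay"},
            settings={"wan_mtu": 1400}),
    Feature("qos", settings={"wan_mtu": 1500}),
]
problems = check_policy(policy)
```

For this policy, `problems` reports that `app-routing` depends on a `tunnel-overlay` feature that is not in the policy, and that `qos` and `app-routing` disagree on `wan_mtu`. A production-grade model would also need to cover device capabilities and ordering, which this sketch leaves out.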
The current situation is less than ideal, though. The industry has launched a variety of activities to improve network management, but those initiatives are still maturing. For example, YANG is a data modeling language for the NETCONF network configuration protocol. OpenStack Networking (Neutron) is providing an extensible framework to manage networks and IP addresses within the larger realm of cloud computing, focusing on network services such as intrusion detection systems (IDS), load balancing, firewalls and virtual private networks (VPN) to enable multi-tenancy and massive scalability. But neither approach can proactively detect interdependencies or inconsistencies, and both require network engineers to dive into programming, for example, to manage data entry and storage.
It makes sense, then, that some vendors are offering fully integrated solutions, built on appliances managed through a proprietary network management tool. This model allows businesses to deploy solutions quickly, at the cost of additional training, limited capability for customization and new hardware purchases.
In order for transformation to occur, the focus of new network management capabilities needs to be on assembling complete network policies from individual device-specific features, detecting inconsistencies and dependencies, and allowing deployment and ongoing network management. Simply updating wildcards in custom configuration templates and deploying them onto devices is no longer sufficient.
As needs and technologies shift and evolve, network architectures or routing protocols may need to be changed on live production networks. Managing such changes at large scale is difficult or even infeasible. This is especially true in large organizations, where any change must first be validated by other teams, such as security, which creates unacceptable delays for implementation.
To find out more about solving these network operations challenges, read Best Practices for Modeling and Managing Today's Network - Part 2.
Dr. Stefan Dietrich is VP of Product Strategy at Glue Networks.