Why Are NetOps Teams Struggling to Deliver on Their Network Automation Strategy?

Song Pang
NetBrain Technologies

Network automation remains a top challenge for enterprise IT departments. Despite years of effort from vendors and IT professionals to develop tools to reduce manual network management, results have been mixed. A recent report by Enterprise Management Associates (EMA) reveals that nearly 95% of organizations use a combination of do-it-yourself (DIY) and vendor solutions for network automation, yet only 28% believe they have successfully implemented their automation strategy.

Why is this mixed approach so popular if many engineers feel that their overall program is not successful?

The short answer is that each type of automation has different strengths and weaknesses. DIY automation, in which engineers write their own scripts for specific tasks or use open-source tools like Ansible, offers customization and cost-effectiveness, but it is hard to manage and scale and relies almost completely on individual engineers' skill sets. Commercial network automation products, on the other hand, are often expensive but provide stability and scalability and are easier to use.

So, where's the disconnect?

Why are NetOps teams struggling to deliver on their network automation strategy?

Should teams go all in on either a DIY or vendor solution?

Let's take a closer look.

First, a quick note: a successful network automation strategy depends on many factors, but for the sake of time we will focus here on DIY vs. vendor solutions and related issues.

Benefits of DIY:

Capabilities align with the organization's specific network. With homegrown solutions, tools are tailor-made to fit the unique needs of a network environment. Vendor solutions can't ever be that customized. For organizations with unusual network architectures, this can be important.

Security and compliance requirements. DIY solutions can be designed to follow the particular security and compliance requirements for the business, such as GDPR, HIPAA, and PCI-DSS.

Cost savings. With DIY tools, you get exactly what you need for little to no cost (other than your engineers' time). When this works well, it means better operational efficiency and more streamlined complex processes.

Benefits of Using Vendor Solutions:

Scale. Vendor solutions are built to cover an entire network, handle large data loads, and integrate with other tools and data sources.

Security and compliance requirements. Hey, wait a second, wasn't this one of the key drivers for using DIY? Yes, but it's a benefit here as well. Vendor products often ship already compliant with certain security standards, in cases where making a DIY tool compliant would take too much work. This is one reason network teams managing complex environments often use commercial tools for particular needs and DIY tools for other tasks.

Platform requirements. Commercial solutions are more scalable and stable than DIY tools. While a homegrown automation solution might handle a few dozen changes really well, it will likely struggle to scale to thousands of changes.

Breadth of functionality. Vendor tools generally provide a broader range of features than DIY solutions, often addressing multiple issues from the get-go.

Despite all the benefits, each approach has its drawbacks. DIY solutions often struggle to scale beyond the scenario they were originally written for, and extending them by hand takes considerable time and effort. They can also be slower than commercial tools and will lack multi-vendor support unless the creator builds it in. You also need network engineers who know enough scripting to write and maintain these tools. If no one on staff has that skill set, or the person who does leaves the company, you're out of luck.

Drawbacks of vendor solutions include high upfront costs, limited customization, and the training expenses associated with learning a new system. Cost and budget matter: the EMA report found a strong correlation between network automation success and significant budget investment. Of the organizations that reported being entirely successful, 80% had well-funded projects, compared with only 57% of partially successful and 29% of partially failed organizations.

Many organizations ultimately use each type of automation where it's needed. Rather than picking one, they use both. Commercial network automation products have room for improvement, particularly in their customizability. The more they can adapt to fit each unique customer network, the more useful they will be. But the products aren't the real problem. The more important roadblocks I see, the ones keeping the percentage of successful automation programs so low, are IT leadership problems. These include difficulties gaining buy-in, establishing direction and ensuring commitment, as well as skill gaps, staff turnover and budget constraints.

Looking ahead, the future of automation involves an ecosystem of tools and products that must integrate seamlessly to create an effective solution for each unique environment. Organizations must maintain a repository of network intent and network state data to ensure adherence to design standards and security policies.
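To make the idea of comparing network intent against network state concrete, here is a minimal sketch in Python. It is illustrative only and not taken from any particular product: the device names, setting names, and the find_drift helper are all hypothetical, and a real repository would hold far richer data than two small dictionaries.

```python
from typing import Dict, List, NamedTuple, Optional


class Drift(NamedTuple):
    """One setting whose observed value does not match the intended value."""
    device: str
    setting: str
    intended: str
    observed: Optional[str]  # None means the setting was not found on the device


def find_drift(
    intent: Dict[str, Dict[str, str]],
    state: Dict[str, Dict[str, str]],
) -> List[Drift]:
    """Return every intended setting whose observed value differs or is missing."""
    drifts = []
    for device, wanted in sorted(intent.items()):
        observed = state.get(device, {})
        for setting, value in sorted(wanted.items()):
            actual = observed.get(setting)
            if actual != value:
                drifts.append(Drift(device, setting, value, actual))
    return drifts


# Hypothetical intent (design standard) and state (what was last collected).
intent = {
    "core-sw1": {"ntp_server": "10.0.0.1", "ssh_version": "2"},
    "edge-rtr1": {"ntp_server": "10.0.0.1", "ssh_version": "2"},
}
state = {
    "core-sw1": {"ntp_server": "10.0.0.1", "ssh_version": "2"},
    "edge-rtr1": {"ntp_server": "10.0.0.9"},
}

for d in find_drift(intent, state):
    print(f"{d.device}: {d.setting} intended={d.intended} observed={d.observed}")
```

The point of keeping intent and state as separate records is that the same comparison can run regardless of whether the state was gathered by a DIY script or a commercial tool.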

Song Pang is CTO at NetBrain Technologies
