There is a huge emphasis in the networking community on automation and validation. Network automation builds on the work done for server automation. The solutions are more mature, and the terminology describing the solutions and tasks is well defined. Terms like “idempotent,” “task-based,” “state-based,” and “agentless” are well understood.
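To make two of those terms concrete, here is a minimal Python sketch of a state-based, idempotent operation. The function and data shapes below are invented for illustration and do not come from any particular automation tool: the caller declares the desired state, the function computes the delta against actual state, and re-running it changes nothing.

```python
# Illustrative sketch of an idempotent, state-based operation.
# Desired state declares *what* should exist; the function converges
# the device toward it, so running it a second time is a no-op.

def ensure_vlans(device: dict, desired_vlans: set) -> set:
    """Idempotently ensure `desired_vlans` exist on `device`.

    Returns the set of VLANs actually added, so a second run
    returns an empty set (the idempotence property).
    """
    current = device.setdefault("vlans", set())
    to_add = desired_vlans - current  # state-based diff: apply only the delta
    current |= to_add
    return to_add

device = {"hostname": "leaf1", "vlans": {10}}
print(ensure_vlans(device, {10, 20, 30}))  # first run adds the missing VLANs
print(ensure_vlans(device, {10, 20, 30}))  # second run is a no-op: set()
```

Contrast this with a task-based approach ("add VLAN 20"), which would fail or duplicate work if VLAN 20 already existed; the state-based version is safe to re-run.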

Network validation, however, does not have a nuanced vocabulary. The general term “network validation” gets used to refer to a number of disparate activities, and specific terms get used by different engineers to mean different things. This lack of nuance hinders the communication and collaboration required to advance network validation technology. That, in turn, harms the adoption of network automation. It is too risky to use automation without effective validation; a single typo can bring down the entire network within seconds.

In this post, we outline different dimensions of network validation and hope to start a conversation about developing a precise vocabulary. We will discuss the what, when and how of network validation.

A few decades ago, car odometers were designed to roll over to zero after 99,999 miles because it was rare for cars to last that long. But today cars come with a warranty for 100,000 miles because it is rare for cars to not last that long. This massive reliability improvement has come about despite the significantly higher complexity of modern cars. Cars have followed the arc of many engineering artifacts, where human ingenuity brought them to their initial working form and then robust engineering techniques made them work well.

The computer hardware and software domains have also invested heavily in robust engineering techniques to improve reliability. One domain where reliability improvements have lagged is computer networking, where outages and security breaches that disrupt millions of users and critical services are all too common. While there are many underlying causes for these incidents, studies have consistently shown that the vast majority are caused by errors in the configuration of network devices. Yet engineers continue to manually reason about the correctness of network configurations. While the original Internet was an academic curiosity, today’s networks are too critical for businesses and society, and also too complex—they span the globe and connect billions of endpoints—for their correctness to be left (solely) to human reasoning.

When you compare software and network engineering trends at a high level, the contrast is striking. Application development has become remarkably agile, robust and responsive, while the networks that carry those apps have not. They continue to be slow to evolve and prone to error. The difference is tools.

Software engineers have leveraged a suite of tools to rapidly respond to changing business needs, accelerate development and improve reliability. Network engineers need to follow suit. The tools they need are now available.

The growing scale and complexity of today’s networks has outpaced network engineers’ ability to reason about their correct operation. As a consequence, misconfigurations that lead to downtime and security breaches have become all too common.

Network-wide specification languages help bridge the abstraction gap between the intended high-level policies of a network and its low-level configuration. A compiler automatically generates the corresponding low-level configurations. This approach is analogous to the trend in software engineering over the last several decades, which has led to ever-higher levels of abstraction and has been a huge boon for the software industry: Imagine writing today's complex software in machine code!
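As a toy illustration of that compiler analogy, a high-level reachability intent can be mechanically expanded into per-device, vendor-style configuration lines. The policy schema and output syntax below are invented for this sketch; they are not from Propane or any real specification language.

```python
# Toy "compiler" from a high-level intent to low-level config lines.
# The intent names subnets abstractly; the output mimics ACL syntax.
# Both the intent schema and the rendering are hypothetical.

SUBNETS = {"web": "10.0.1.0/24", "db": "10.0.2.0/24"}

def compile_policy(intents):
    """Expand (action, src, dst, port) intents into ACL-style lines."""
    lines = []
    for action, src, dst, port in intents:
        lines.append(
            f"access-list 100 {action} tcp {SUBNETS[src]} {SUBNETS[dst]} eq {port}"
        )
    lines.append("access-list 100 deny ip any any")  # close with default-deny
    return lines

policy = [("permit", "web", "db", 5432)]
for line in compile_policy(policy):
    print(line)
```

The point is the division of labor: the operator states intent once at the policy level, and the low-level lines for every device are derived, not hand-typed.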

In this post we will discuss the various attempts in industry and academia to define a higher-level specification language for networks, while diving deeper into Propane, an intra- and inter-domain routing policy framework.

The inherent complexity of today's networks means humans are simply incapable of reasoning about their correctness. Yet network engineers are asked to do so on a daily basis. It is no surprise, then, that we consistently see headlines such as “Comcast Suffers Outage Due to Significant Level 3 BGP Route Leak” or “Google accidentally broke Japan's Internet”. Fortunately, recent advances in network validation, specifically control plane validation, can provide strong guarantees on the correctness of network configuration and prevent such errors.

Using network validation tools like Batfish, network engineers can make configuration changes without taking down the Internet, making headlines like those above a thing of the past.
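To give a flavor of what such a check looks like, here is a self-contained toy, not Batfish's actual API (which analyzes real vendor configurations): it flags BGP sessions that are configured on one side but not the other, the kind of asymmetry a single typo can introduce.

```python
# Toy control-plane check: every configured BGP session should have a
# matching session declared on the peer. The device data is invented.

def find_half_open_sessions(configs):
    """Return (device, neighbor_ip) pairs with no matching reverse session."""
    # Collect every (local_ip, neighbor_ip) session each device declares.
    sessions = set()
    for dev in configs:
        for neigh in dev["bgp_neighbors"]:
            sessions.add((dev["loopback"], neigh))
    # A session is half-open if the reverse direction is never declared.
    return sorted(
        (dev["hostname"], neigh)
        for dev in configs
        for neigh in dev["bgp_neighbors"]
        if (neigh, dev["loopback"]) not in sessions
    )

configs = [
    {"hostname": "r1", "loopback": "1.1.1.1", "bgp_neighbors": ["2.2.2.2"]},
    {"hostname": "r2", "loopback": "2.2.2.2", "bgp_neighbors": []},  # r1 omitted
]
print(find_half_open_sessions(configs))  # → [('r1', '2.2.2.2')]
```

Running a check like this before every change, rather than discovering the mistake in production, is the core promise of proactive validation.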

At Future:NET 2017, our CEO Ratul Mahajan gave the keynote presentation about how we can help network engineers and operators make their networks highly agile, reliable, and secure by adapting proven approaches employed by hardware and software engineers. In his keynote, Ratul introduced a new network engineering workflow inspired by those proven capabilities.

Intentionet © 2019