The Advent of Flexible Overlays

By Erdem posted 06-21-2019 15:24

Importance of Overlays

Network overlay technologies have played a key role in modern data centers by improving the efficiency, agility, scalability and manageability of application deployments. Cloud providers are rapidly innovating on cloud connect models to deploy new applications that require specific network topologies, without modifying the underlying physical networks. As overlay methodologies actively evolve, it’s imperative that data center edge routers adopt a new paradigm: overlays that are flexible and can adapt quickly to changing demands.

Background of JUNOS Overlays

So far, overlays or tunnels in JUNOS have been fixed virtual pipes that existed between two end-points in the network. These tunnels can be broadly categorized into two kinds – static interface-based tunnels and dynamic protocol-driven tunnels.

A static interface-based tunnel is configured on an interface that is anchored to a specific packet-forwarding-engine (PFE) on a line-card. Since all tunnel traffic has to pass through the anchor PFE, it is possible to accurately account for, filter or police the traffic at a single point, but the tunnel throughput is limited to the bandwidth of the anchor PFE.

With the explosion of hyper-scale data-centers, geographically distributed workloads and SDN-controller-driven public-cloud deployments, tunnel scale and performance requirements grew sharply, which led us to dynamic tunnels.

A dynamic tunnel is established without static configuration, by using route-resolution logic, and is not anchored to any forwarding element. Any PFE can originate and/or terminate a dynamic tunnel, so the aggregate bandwidth of the tunnel is not capped by any single PFE. Due to the absence of any configuration element, and the distributed nature of tunnel traffic across many PFEs, it is a challenge to attach and implement features like policing on this traffic. In most cloud deployments this is acceptable: tunnel traffic traverses the provider’s own backbone, so strict traffic control may not be required there, and traffic control is managed at the customer handoff points instead.

While both static and dynamic tunnels have been successfully used to deploy overlay networks, they can still be limiting for present-day deployments in terms of flexibility, programmability, scale and performance.

Flexible Overlays

Flexible overlays are an API-based approach to creating overlays, designed to address the rapidly evolving demands on data-center edge routers. Not only do the types of overlays used to backhaul traffic to compute servers evolve, but the parameters used to signal the overlay also change, and they vary from one customer to another.

These are some of the salient features of flexible overlays:

API-centric, allowing controller applications to program overlays in a flexible fashion. Using the flexible API, a route-prefix can be associated with an overlay and programmed in any routing-instance
Flexible, as the user has full control over all tunnel parameters and can change the tunnels on-the-fly, without disrupting traffic
Highly scalable, to several million tunnels
Not capped in bandwidth by any single forwarding unit, as all forwarding units can be used to forward tunnel traffic
Ephemeral, with no Junos CLI configuration or persistent state across Junos modules
Instantaneous, requiring no setup time
Asymmetric, as upstream and downstream tunnel state can be decoupled, with the possibility of aggregating downstream tunnel state using prefix aggregation
A combination of the best of both static and dynamic tunnels, as they include a feature context to which accounting profiles for tunnel traffic can be attached

Flexible Overlays API Model

The API data-model for defining a flexible overlay is generic and extensible. A route object represents a route-prefix in any route-table, and it can specify a flexible tunnel-profile. The tunnel-profile is a generic object that supports any type of overlay; it has a tunnel-attributes object and a feature-context object. The tunnel-attributes object specifies overlay parameters thus allowing full control over the tunnel headers. The feature-context can specify a flexible interface to allow configuration of additional features on the overlay. The tunnel-profile can be a standalone object to support decapsulation-profiles independent of a route-prefix.
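
To make the relationships between these objects concrete, here is a minimal Python sketch of the data-model as described above. It is not the actual JET API definition: every class name, field name and value below is hypothetical and chosen only for illustration. The one outside fact it relies on is that 4789 is the IANA-assigned VXLAN UDP port.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TunnelAttributes:
    """Overlay header parameters (hypothetical field names)."""
    source_ip: str
    destination_ip: str
    vni: int
    udp_destination_port: int = 4789  # IANA-assigned VXLAN port


@dataclass(frozen=True)
class TunnelProfile:
    """Generic overlay description: a type, its attributes, and an optional feature context."""
    name: str
    tunnel_type: str                       # e.g. "vxlan-ipv4" (illustrative label)
    attributes: TunnelAttributes
    feature_context: Optional[str] = None  # name of a flexible tunnel interface, if any


@dataclass(frozen=True)
class Route:
    """A route-prefix in some route-table, optionally pointing at a tunnel-profile."""
    prefix: str
    table: str
    tunnel_profile: Optional[TunnelProfile] = None


attrs = TunnelAttributes("192.0.2.1", "198.51.100.7", vni=5001)

# Encapsulation: a route in a tenant table references a tunnel-profile.
encap = TunnelProfile("tenant-a-encap", "vxlan-ipv4", attrs, feature_context="fti0.0")
route = Route("10.1.0.0/24", "tenant-a.inet.0", tunnel_profile=encap)

# Decapsulation: the same kind of profile exists on its own, with no route attached.
decap_only = TunnelProfile("tenant-a-decap", "vxlan-ipv4", attrs)
```

The point of the sketch is the shape: the route holds a reference to a profile, and the profile is also usable standalone, which is what allows decapsulation-profiles to be programmed independently of any route-prefix.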

Using this API data-model, the JET API is defined as shown below. The grey objects are existing RIB service APIs, which can be used to specify a route-prefix, nexthop and route-attributes. The objects in green are the new tunnel service objects which specify a tunnel-profile name, type, feature-context (Interface) and tunnel-attributes. The green objects can exist on their own, without a route reference, to program tunnels independently for decapsulation or to allow sharing by several routes.

Flexible Bindings and Parameters

Flexible overlay APIs allow tunnel binding at a granular level, with a 1:1 mapping between a route and a tunnel-profile. Within a VRF, each route can specify a different tunnel-type and/or different tunnel-parameters, which is far more flexible than having all prefixes in a VPN use one tunnel to span the overlay network.
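
The per-route binding can be pictured as a table keyed by routing-instance and prefix. The sketch below is only a model of that idea in Python, under the assumption that a binding is replaced as a single unit; the class, method names and profile fields are hypothetical and do not reflect the real API or the router's internal tables.

```python
from typing import Dict, Tuple

BindingKey = Tuple[str, str]  # (routing-instance, route-prefix)


class OverlayBindings:
    """Toy model of 1:1 route -> tunnel-profile bindings (hypothetical)."""

    def __init__(self) -> None:
        self._bindings: Dict[BindingKey, dict] = {}

    def bind(self, instance: str, prefix: str, profile: dict) -> None:
        # A single assignment installs or replaces the binding for this
        # route only; bindings for other routes are left untouched.
        self._bindings[(instance, prefix)] = dict(profile)

    def profile_for(self, instance: str, prefix: str) -> dict:
        return self._bindings[(instance, prefix)]


bindings = OverlayBindings()

# Two routes in the same VRF, each with its own tunnel type and parameters.
bindings.bind("vrf-red", "10.1.0.0/24", {"type": "vxlan-ipv4", "vni": 5001})
bindings.bind("vrf-red", "10.2.0.0/24", {"type": "vxlan-ipv6", "vni": 7002})

# Later, one route is moved to different parameters without touching the other.
bindings.bind("vrf-red", "10.1.0.0/24", {"type": "vxlan-ipv4", "vni": 5999})
```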

In traditional tunnels, tunnel parameters are fixed by the network OS, lack fine-grained control and are bound to Junos CLI or protocol mechanics. With flexible overlays, the end-user has full control over all the tunnel header parameters. This allows reserved or non-standardized bits or flags to be used in a customized fashion, as suitable for that overlay. For example, in the case of VxLAN overlays, the UDP header source-port is typically picked by the encapsulating router based on a computation over the payload data to achieve entropy. Using the flexible API, the user can further constrain the UDP source-port to fall within a user-specified range. The flexible VxLAN API also allows the user to pick any UDP destination-port instead of the well-known port for VxLAN, thus giving more flexibility to work outside the confines of standards. Going beyond the standards requires the management device or controller to make sure the end-points agree on which ports to use, and this is the biggest value of APIs.
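
The source-port tuning described above can be sketched as a hash of the flow folded into a range. This is an illustration only: the hash the router actually uses is not stated here, so CRC32 stands in for it, and the function name, flow-key format and port range are assumptions.

```python
import zlib


def pick_source_port(flow_key: bytes, low: int, high: int) -> int:
    """Map a flow to a UDP source port inside [low, high].

    flow_key is any byte string derived from the payload headers.
    CRC32 is a stand-in for whatever hash the forwarding plane uses.
    """
    if not (0 < low <= high <= 65535):
        raise ValueError("port range must satisfy 0 < low <= high <= 65535")
    span = high - low + 1
    # The same flow always hashes to the same port, so packets of one flow
    # stay on one path, while different flows spread across the range.
    return low + zlib.crc32(flow_key) % span


flow = b"10.0.0.1|10.0.0.2|6|443|51512"
port = pick_source_port(flow, 49152, 49999)
```

Restricting the range reduces the number of distinct source-port values available for load-balancing in the underlay, so a very narrow range trades entropy for predictability.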

Instantaneous and On-the-Fly

Traditional tunnels are fixed virtual pipes that require setup time. Once a tunnel is configured, control protocols are involved to bring the tunnel up, sometimes using keepalive mechanisms, before traffic is forwarded on the tunnel. Once a tunnel is set up, any change to the tunnel is disruptive and causes significant packet loss until the new tunnel is up and routes are updated. In other words, a tunnel cannot be used while it is being set up and cannot be changed while it is in use.

Flexible overlays are created instantaneously and require no setup time; they are ephemeral as they are created using an API, and not via Junos CLI. They are designed to be lean and have minimal footprint or persistent state across Junos modules. The route -> tunnel mapping is instantaneously installed in the data-path to be used by traffic right away.

Flexible overlays can be updated on-the-fly without disrupting traffic; this includes changing the tunnel parameters as well as the overlay type itself. This is quite powerful when the user wants to deploy new overlays, as the API allows selective migration of services on a per route basis.

This is in contrast to the events that churn the system when traditional tunnels are created or updated, as shown below.

Asymmetric

Flexible APIs allow the user to decouple encapsulation and decapsulation states for a tunnel, thus allowing asymmetric overlay deployments. Furthermore, the decapsulation tunnel profiles can be aggregated using prefix-aggregation on the tunnel source-IP address, thus allowing simplified controller state management when the servers hosting customer VMs are within a small set of prefix ranges.
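
As a rough illustration of decapsulation-profile aggregation, the sketch below matches an incoming tunnel source-IP against a small set of prefixes. It assumes that the most specific matching prefix wins, which is not stated above; the class, method and profile names are hypothetical, and the linear scan is for clarity rather than a model of the forwarding plane.

```python
import ipaddress
from typing import List, Optional, Tuple


class DecapTable:
    """Toy lookup of decapsulation profiles by tunnel source-IP prefix."""

    def __init__(self) -> None:
        self._entries: List[Tuple[object, str]] = []  # (network, profile name)

    def add(self, prefix: str, profile_name: str) -> None:
        self._entries.append((ipaddress.ip_network(prefix), profile_name))

    def lookup(self, source_ip: str) -> Optional[str]:
        addr = ipaddress.ip_address(source_ip)
        matches = [
            (net, name)
            for net, name in self._entries
            if net.version == addr.version and addr in net
        ]
        if not matches:
            return None
        # Assumption: the longest (most specific) prefix takes precedence.
        return max(matches, key=lambda m: m[0].prefixlen)[1]


table = DecapTable()
table.add("203.0.113.0/24", "rack-aggregate")  # one entry covers every server in the /24
table.add("203.0.113.64/26", "pod-7")          # a more specific exception

profile = table.lookup("203.0.113.70")
```

With aggregation, the controller maintains one entry per server prefix rather than one per server, which is where the state savings come from.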

Feature Context

Flexible overlays combine the best of traditional static interface-based tunnels and dynamic tunnels by allowing the user to specify a feature context for each overlay. This feature context is in the form of a flexible-tunnel-interface or FTI, which is a first-class Junos interface that can be configured using the CLI, to enable features like firewall filters, sampling, mirroring etc. This capability allows the user to execute various features on traffic sent or received on flexible overlays.

Scale and Performance

High scale is the primary design goal for flexible overlays, and this is accomplished by consuming minimal resources and adopting a lean footprint model. While the exact scale depends on the type of overlay, flexible overlays allow millions of tunnels to be deployed on a single edge router.

The bandwidth of a flexible overlay is not capped by any single forwarding unit, as all forwarding units can be used to forward traffic in the upstream or downstream direction.

Future

As a first offering, flexible overlays support VxLAN-IPv4 and VxLAN-IPv6 overlays, to carry IPv4 or IPv6 underlay traffic. Using the API data-model, newer encapsulations can be incorporated seamlessly, with the potential of supporting MPLS, GRE, UDP, NVGRE, Geneve, SRv6 and SRv6+.
