FRRouting on Juniper’s Advanced Forwarding Interface
Feb 6, 2018
Disaggregation and white boxes have gained prominence in public cloud and telco cloud environments. While the term “hardware-software disaggregation” has become a cliché in the industry, true implementations have been sporadic because well-defined, open interfaces have been lacking. So what exactly is “hardware-software disaggregation”? If it is just about a NOS with a complex routing stack running on commodity hardware, it is nothing new. All the well-known vendors have been shipping platforms with their NOS integrated on third-party ODM hardware with little to no customization. These are still considered “closed” systems because the interaction between the control plane and the hardware ASICs is proprietary. A truly disaggregated platform will not only have the NOS integrated into commodity hardware, but will do so in an environment that is “open” and extensible. Open interfaces give the consumers of the platform (carriers, cloud providers, etc.) the flexibility and choice they desire from disaggregation.
“Hardware-Software disaggregation” has two key elements. On one hand, there is a need for a reliable and scalable Network Operating System (NOS) that offers feature-rich control and management plane functionality and provides open interfaces for interaction. On the other hand, commodity white/grey-box hardware comes with different flavors of ASICs. For any NOS to program the underlying data plane, there is a need for well-defined and open forwarding plane APIs. The picture below provides a high-level view of the key elements of a typical router.
White/Grey-Box: The “Hardware” part of disaggregation
The physical hardware comprises the commodity components that make up the router. Our focus in this document is the forwarding ASIC that performs the data plane functionality. This can be a merchant ASIC (Broadcom, Cavium, Mellanox, etc.) or a custom ASIC that supports open interfaces.
FRRouting: The “Software” part of disaggregation
FRRouting is an open source routing stack that runs on Linux or Unix based platforms. It supports routing daemons ranging from IGP (OSPF, IS-IS, RIP) to MPLS, BGP, PIM, etc. FRRouting has its origins in the Quagga project. It is now part of the Linux Foundation, working on the routing stack needs of ISPs and cloud providers.
Forwarding Software: The “Glue” that links the control plane to ASIC
Forwarding software sits between the ASIC and the control plane NOS. This is the abstraction layer that hides the underlying ASIC SDKs from the control plane software. Advanced Forwarding Interface (AFI) is the open API that Juniper forwarding software provides to control, configure and manage the forwarding elements inside the ASICs. This effort is a result of Juniper Networks’ decades of experience in building and deploying carrier-grade forwarding planes.
Juniper Networks and FreeRangeRouting (FRRouting) partnered to demonstrate a disaggregated solution using open source software and open forwarding interfaces. The forwarding ASIC used for this effort is “vTrio”, a simulation of the “Trio” ASIC that powers the industry-leading MX routers. We will go through the details of the solution in the subsequent sections.
Putting it together
The solution can be used in several ways, and the use cases can vary. One example is to have a custom routing solution developed using the open source FRRouting stack. The customized routing stack can complement the existing NOS by extending it with the desired functionality. Different forwarding sandboxes can be used for the custom routing stack and the existing NOS.
Another example is for a vendor to take the FRRouting stack and build a commercial NOS with other elements added to it. For example, elements such as QoS, ACLs, Segment Routing enhancements, etc. can be added on top of the FRRouting stack. The resulting bundled NOS can act as the AFI client and operate on top of custom or merchant silicon.
A brief look into AFI
Recall that AFI provides forwarding APIs that are used by the control plane to program forwarding state. An AFI system comprises an AFI client and an AFI server. The client is written using AFI libraries supplied by Juniper and talks to the AFI server over gRPC transport. The AFI server is responsible for instantiating the forwarding state and making the corresponding entries in the ASIC. The figure below gives an overview of an AFI system.
AFI programming model involves three main elements:
1. Forwarding Sandbox
A sandbox is essentially a walled area within the forwarding path. The sandbox is allocated and provided by the underlying system (configured through the JUNOS CLI). AFI clients can program the forwarding path within this sandbox.
2. Nodes
Basic forwarding path elements in AFI are called nodes. Different nodes may vary in their complexity and connection options. Examples of nodes are tables, trees, lookups and conditionals.
3. Node Entries
Node entries are the individual match elements inside a container node such as a lookup node.
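To make the relationship between the three elements concrete, here is a minimal toy model in Python. It is only a sketch of the concepts: the class and method names (`Sandbox`, `Node`, `add_node`, `add_entry`) are hypothetical and are not the API of the AFI library, which is supplied by Juniper and talks to the AFI server over gRPC.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """A forwarding path element, identified by a token (hypothetical model)."""
    token: int
    kind: str                          # e.g. "port", "routing-table", "punt"
    next_token: Optional[int] = None   # node that packets are handed to next
    entries: Dict[str, int] = field(default_factory=dict)  # match key -> target token


@dataclass
class Sandbox:
    """A walled-off region of the forwarding path that a client may program."""
    name: str
    nodes: Dict[int, Node] = field(default_factory=dict)
    _next_token: int = 1

    def add_node(self, kind: str, next_token: Optional[int] = None) -> Node:
        node = Node(self._next_token, kind, next_token)
        self.nodes[node.token] = node
        self._next_token += 1
        return node

    def add_entry(self, node: Node, key: str, target: Node) -> None:
        # An entry is one individual match inside a container node.
        node.entries[key] = target.token


sandbox = Sandbox("demo")
punt = sandbox.add_node("punt")
table = sandbox.add_node("routing-table")
port0 = sandbox.add_node("port", next_token=table.token)
sandbox.add_entry(table, "10.0.255.0/32", punt)
```

Here the sandbox owns the nodes, the port node is chained to the routing table node through its next token, and the routing table node holds one entry that points at the punt node.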
Advanced Forwarding Interface (AFI) APIs can be used to set up the nodes and entries required for a forwarding topology. These APIs are provided as a library. Juniper publishes more details on AFI, along with AFI client examples and documentation.
FRRouting Architecture
We've seen in the previous sections how AFI makes it possible to implement a third party control plane on Juniper routers. Now we're going to see how to populate the routing table of an AFI sandbox with FRRouting-learned routes (e.g. OSPF/BGP routes).
First, it's important to understand FRRouting's architecture and its southbound interface. FRRouting uses a modular design in which each routing protocol runs in a separate daemon. This way each protocol can be updated or restarted individually, and failures are not propagated from one protocol to another. On top of that, there's the zebra daemon, which serves as an intermediate layer between the routing daemons and the kernel. The figure above shows the architecture of FRRouting. The routing daemons communicate with zebra, which in turn communicates with the kernel through a series of different interfaces, depending on the underlying operating system.
The routing daemons send their best routes to zebra, which keeps them in the Routing Information Base (RIB), along with static and connected routes. Zebra then computes the best route for each prefix; together these form the Forwarding Information Base (FIB), the set of routes that zebra installs in the kernel. Zebra also contains a plugin called the Forwarding Plane Manager (FPM) interface, which can be used to forward the FIB routes to an external component. This is particularly useful when the router has a fast-path forwarding engine outside the operating system kernel (e.g. hardware or specialized software).
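The RIB-to-FIB step can be illustrated with a short sketch. This is a simplification under stated assumptions: it picks a single best route per prefix by lowest administrative distance and then lowest metric, whereas zebra's real selection logic has more rules (and supports multipath). The `Route` type and `compute_fib` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Route:
    prefix: str
    protocol: str
    distance: int   # administrative distance, lower wins
    metric: int     # tie-breaker within the same distance
    nexthop: str


def compute_fib(rib: List[Route]) -> Dict[str, Route]:
    """Pick one best route per prefix: lowest distance, then lowest metric."""
    fib: Dict[str, Route] = {}
    for route in rib:
        best = fib.get(route.prefix)
        if best is None or (route.distance, route.metric) < (best.distance, best.metric):
            fib[route.prefix] = route
    return fib


rib = [
    Route("10.0.255.0/32", "ospf", 110, 10000, "192.0.2.1"),
    Route("10.0.255.0/32", "static", 1, 0, "192.0.2.9"),
    Route("10.0.255.1/32", "ospf", 110, 10000, "192.0.2.5"),
]
fib = compute_fib(rib)
```

In this example the static route wins for 10.0.255.0/32 because of its lower distance, while 10.0.255.1/32 has only an OSPF candidate. The next-hop addresses are placeholders from the documentation range.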
Integration with AFI
Based on the fundamentals described in the previous sections, we wrote a simple AFI client that does the following things:
1. Use zebra's FPM interface to learn about its FIB routes and install them in the Junos forwarding sandbox.
2. Create tap interfaces to punt control packets to FRR so the routing protocols can work normally. In addition, create a host route (/32 or /128) in the AFI routing table for each connected address to ensure that packets destined to the router itself are punted appropriately.
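The host routes for connected addresses can be derived mechanically from the interface addresses. The sketch below uses Python's standard `ipaddress` module; the `host_route` helper and the sample addresses are hypothetical and only illustrate the idea, not the code of our AFI client.

```python
import ipaddress


def host_route(interface_address: str) -> str:
    """Turn an interface address such as 192.0.2.1/24 into its host route."""
    iface = ipaddress.ip_interface(interface_address)
    # max_prefixlen is 32 for IPv4 and 128 for IPv6.
    return f"{iface.ip}/{iface.max_prefixlen}"


connected = ["192.0.2.1/24", "2001:db8::1/64"]
# Every host route points at the punt action so local traffic reaches FRR.
punt_routes = {host_route(addr): "punt" for addr in connected}
```

With the sample addresses, `punt_routes` contains `192.0.2.1/32` and `2001:db8::1/128`, both mapped to the punt action.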
In the sandbox, received packets whose destination address doesn’t match any route in the IP routing table are punted to the AFI client, which in turn writes the packets to the appropriate Linux tap interfaces, at which point they can be read by the FRR daemons. It's worth noting that non-IP packets are punted to the AFI client as well, which means that protocols that run at L2, like IS-IS, can work normally on top of an AFI sandbox. Routing protocols that make use of multicast IP (e.g. OSPF) also work normally as long as we don’t make the mistake of adding multicast routes to the sandbox's routing table.
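The punt behaviour described above amounts to a longest-prefix match whose default result is "punt". The following is a minimal software sketch of that decision, not the way the ASIC implements it; the `lookup` function, the action strings and the table contents are assumptions made for illustration.

```python
import ipaddress
from typing import Dict

PUNT = "punt"


def lookup(table: Dict[str, str], destination: str) -> str:
    """Longest-prefix match; a destination without a matching route is punted."""
    addr = ipaddress.ip_address(destination)
    best_len, action = -1, PUNT
    for prefix, target in table.items():
        net = ipaddress.ip_network(prefix)
        if net.version == addr.version and addr in net and net.prefixlen > best_len:
            best_len, action = net.prefixlen, target
    return action


table = {
    "10.0.255.0/32": "encap-21",   # OSPF-learned loopback, forwarded in the fast-path
    "192.0.2.1/32": PUNT,          # host route for a connected address
}
```

Because the table holds no multicast routes, a lookup for the OSPF AllSPFRouters address 224.0.0.5 falls through to the punt action, which is why OSPF keeps working on top of the sandbox.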
All routes learned by the routing daemons are sent to zebra, which relays them to our AFI client via the FPM interface. After resolving the next hops into MAC addresses, the received routes are installed in the sandbox routing table and are ready for use in the fast-path.
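On the receiving side, an FPM consumer first has to split the TCP byte stream into individual messages. The sketch below assumes the 4-byte FPM header layout from FRR's `fpm.h` as we understand it: a one-byte version, a one-byte message type, and a two-byte total length in network byte order that includes the header. The function name is hypothetical, and decoding the Netlink payload itself is out of scope here.

```python
import struct
from typing import Iterator, Tuple

# Assumed header layout: version (u8), message type (u8), total length (u16, big-endian).
FPM_HEADER = struct.Struct("!BBH")
FPM_PROTO_VERSION = 1
FPM_MSG_TYPE_NETLINK = 1


def split_fpm_messages(buffer: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (msg_type, payload) for each complete FPM message in buffer."""
    offset = 0
    while offset + FPM_HEADER.size <= len(buffer):
        version, msg_type, msg_len = FPM_HEADER.unpack_from(buffer, offset)
        if version != FPM_PROTO_VERSION or msg_len < FPM_HEADER.size:
            raise ValueError("malformed FPM header")
        if offset + msg_len > len(buffer):
            break  # incomplete message; the caller should wait for more bytes
        yield msg_type, buffer[offset + FPM_HEADER.size:offset + msg_len]
        offset += msg_len


# Build one sample frame with a dummy 3-byte payload.
frame = FPM_HEADER.pack(FPM_PROTO_VERSION, FPM_MSG_TYPE_NETLINK, FPM_HEADER.size + 3) + b"abc"
messages = list(split_fpm_messages(frame + frame[:2]))
```

The trailing two bytes in the example represent a partially received second message, which the splitter leaves for the next read instead of treating as an error.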
Below is a small demonstration of the AFI/FRR integration in action. We set up a network topology consisting of a vMX/FRR router connected to four other routers in a single OSPF backbone area. The goal of the test is very simple: use FRR to learn about the remote loopback addresses and install them in the AFI sandbox routing table.
After configuring all devices, we use the FRR CLI in the central router to confirm that OSPF has converged on the network. If we look at the Interface column of the command output, we can notice that all OSPF adjacencies are created on top of the tap interfaces as expected:
vmx-frr-vtysh# show ip ospf neighbor
Neighbor ID Pri State Dead Time Address Interface
10.0.255.0 1 Full/Backup 38.699s 18.104.22.168 tap0:22.214.171.124
10.0.255.1 1 Full/Backup 38.752s 126.96.36.199 tap1:188.8.131.52
10.0.255.2 1 Full/Backup 38.809s 184.108.40.206 tap2:220.127.116.11
10.0.255.3 1 Full/Backup 38.871s 18.104.22.168 tap3:22.214.171.124
Using the 'show ip fib ospf' command we can check the OSPF routes (the remote loopbacks) we learned and installed in the FIB:
vmx-frr-vtysh# show ip fib ospf
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel,
> - selected route, * - FIB route
O>* 10.0.255.0/32 [110/10000] via 126.96.36.199, tap0, 00:07:17
O>* 10.0.255.1/32 [110/10000] via 188.8.131.52, tap1, 00:07:17
O>* 10.0.255.2/32 [110/10000] via 184.108.40.206, tap2, 00:07:17
O>* 10.0.255.3/32 [110/10000] via 220.127.116.11, tap3, 00:07:17
Now, on the AFI client CLI, we can use the 'show routes' and 'show neighbors' commands to check all routes learned via the FPM interface and all neighbors learned via Netlink respectively:
Since the nexthop address of the OSPF learned routes is resolvable, the AFI client can install these routes in the sandbox routing table. Below we can see the output of the 'show tokens' command. This command displays all tokens installed in the AFI sandbox.
vmx-afi-client# show tokens
Token  Type           Description                                   Next Token
1      port           port 0, input                                 19
2      port           port 1, input                                 19
3      port           port 2, input                                 19
4      port           port 3, input                                 19
5      port           port 4, input                                 -
6      port           port 5, input                                 -
7      port           port 6, input                                 -
8      port           port 7, input                                 -
10     port           port 0, output                                -
11     port           port 1, output                                -
12     port           port 2, output                                -
13     port           port 3, output                                -
14     port           port 4, output                                -
15     port           port 5, output                                -
16     port           port 6, output                                -
17     port           port 7, output                                -
18     punt           punt to AFI client                            -
19     routing-table  routing table - default VRF                   -
20     discard        discard (drop) packets                        -
21     encap          src 32:26:0a:2e:aa:f0, dst 32:26:0a:2e:cc:f0  10
22     encap          src 32:26:0a:2e:aa:f2, dst 32:26:0a:2e:cc:f2  12
23     encap          src 32:26:0a:2e:aa:f1, dst 32:26:0a:2e:cc:f1  11
24     encap          src 32:26:0a:2e:aa:f3, dst 32:26:0a:2e:cc:f3  13
The port tokens and the punt token are created automatically as soon as the AFI client is initialized. Using a configuration file, we instruct the AFI client to enable only the first four ports by setting their "next token" to the Routing Table node. Only one routing table is configured, but more could be created on demand in order to support VRFs. The discard token is used to enable the installation of blackhole routes in the fast-path. A single discard token is shared by all blackhole routes in the sandbox. In a similar fashion, an encapsulation token is created for each neighbor learned via Netlink, and routes pointing to the same nexthop address can share the same encapsulation token. Counter nodes could also be implemented on a per-interface and/or per-route basis to provide statistics information.
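The sharing rules described above (one discard token for all blackhole routes, one encapsulation token per neighbor) can be sketched as a small allocator. This is an illustrative model only: the `TokenTable` class and its methods are hypothetical names, the starting token number is arbitrary, and the next-hop addresses are placeholders.

```python
from typing import Dict, Optional


class TokenTable:
    """Hands out tokens: one encap token per neighbor, one shared discard token."""

    def __init__(self, first_token: int = 1) -> None:
        self._next = first_token
        self._encap: Dict[str, int] = {}
        self._discard: Optional[int] = None

    def _allocate(self) -> int:
        token = self._next
        self._next += 1
        return token

    def encap_for(self, nexthop: str) -> int:
        # Routes pointing to the same next hop reuse the same encap token.
        if nexthop not in self._encap:
            self._encap[nexthop] = self._allocate()
        return self._encap[nexthop]

    def discard(self) -> int:
        # All blackhole routes share a single discard token.
        if self._discard is None:
            self._discard = self._allocate()
        return self._discard


tokens = TokenTable(first_token=20)
blackhole_a = tokens.discard()
route_a = tokens.encap_for("192.0.2.1")
route_b = tokens.encap_for("192.0.2.1")   # same neighbor, same token
route_c = tokens.encap_for("192.0.2.2")   # new neighbor, new token
blackhole_b = tokens.discard()
```

Sharing tokens this way keeps the number of nodes in the sandbox proportional to the number of neighbors rather than the number of routes.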
Using a traffic generator we can validate that packets destined to the OSPF learned routes are routed in the fast-path (vMX vAsic). Running tcpdump on the tap interfaces confirms that only control packets (e.g. OSPF packets) are punted to the Linux network stack.
In summary, running the FRRouting control plane on a Juniper router was as simple as writing a small AFI client that does the integration using existing components like FRRouting’s FPM interface and Linux tap interfaces.
We have seen how two disparate systems can come together and solve a complex problem if the interfaces connecting them are open and flexible. FRRouting, an open source routing stack, can be customized with a quick turnaround to augment the existing NOS functionality. At the same time, open forwarding interfaces allow the integration to be smooth and provide the means for future extensions. Disaggregation is not just about cost, but about choice. A platform built with merchant silicon ASICs doesn’t necessarily qualify as disaggregated and open. Along the same lines, a platform that has a custom-built component doesn’t have to be “closed.” The pivotal point of this exercise is to prove that open interfaces eliminate the need for custom and closed development of platforms. The time has come to choose best-in-class components without the fear of running into closed systems and vendor lock-in.