Converged Storage in Overlay Networks
Mar 10, 2014

One of the great benefits of overlay technologies such as VMware NSX and Juniper Contrail is the ability to orchestrate and provision compute, storage, and network with a single click of the mouse. Another great benefit is that all server traffic is encapsulated, so the network only needs to provide simple Layer 3 access. However, with this fundamental change from a Layer 2 to a Layer 3 network, what is the impact on converged storage?

 
Perhaps a little clarification is in order, because there are a few different storage protocols: NFS, CIFS, iSCSI, and FCoE. Of these four options, only one requires Layer 2 access, and that's FCoE. If you want to support FCoE on top of a simple Layer 3 network, you will have to leverage an Ethernet fabric technology such as QFabric or Virtual Chassis Fabric. NFS, CIFS, and iSCSI, on the other hand, can be transported over Layer 3 with no problem. But how do you guarantee lossless Ethernet for Layer 3 traffic?
 
You're still in luck. The QFX5100 supports lossless Ethernet queues without enabling Virtual Chassis Fabric. The only downside is that the queues and classification have to be configured manually on each switch. However, the configuration is simple enough that it can easily be copied and pasted across the entire network or provisioned with Network Director.
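To give a sense of how little configuration is involved, below is a minimal sketch of the lossless piece on a QFX5100: a no-loss forwarding class plus a priority-based flow control (PFC) profile applied to the server- and storage-facing ports. The class name, queue number, 802.1p code point, and interface range are hypothetical, and the exact knobs can vary by Junos release, so treat it as a starting point rather than a drop-in configuration.

class-of-service {
    forwarding-classes {
        /* hypothetical lossless class for iSCSI traffic, mapped to queue 3 */
        class iscsi queue-num 3 no-loss;
        class best-effort queue-num 0;
    }
    congestion-notification-profile {
        /* enable PFC for the 802.1p priority assumed to carry iSCSI (priority 3) */
        iscsi-pfc {
            input {
                ieee-802.1 {
                    code-point 011 {
                        pfc;
                    }
                }
            }
        }
    }
    interfaces {
        /* apply the PFC profile to the server- and storage-facing 10GbE ports */
        xe-0/0/* {
            congestion-notification-profile iscsi-pfc;
        }
    }
}

This is the part that would be copied and pasted to every switch, or pushed out with Network Director.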
 
PFC over Layer 3
 
In the illustration above, we have a simple 3-stage Clos architecture with four spines and 16 leaves. The assumption is that each switch is running eBGP with its own ASN; this creates an end-to-end Layer 3 network from leaf to leaf.
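For reference, the leaf side of that eBGP underlay might look something like the following sketch. The ASNs, neighbor addresses, and policy name are invented for illustration; the idea is simply one external BGP session per spine, with multipath so that traffic is load-balanced across all four uplinks.

routing-options {
    autonomous-system 65101;                 /* hypothetical leaf ASN */
}
policy-options {
    policy-statement UNDERLAY-EXPORT {
        term loopback {
            from interface lo0.0;
            then accept;
        }
    }
}
protocols {
    bgp {
        group UNDERLAY {
            type external;
            export UNDERLAY-EXPORT;
            multipath {
                multiple-as;                 /* ECMP across spines in different ASNs */
            }
            neighbor 10.0.1.0 {
                peer-as 65001;               /* spine 1 */
            }
            neighbor 10.0.2.0 {
                peer-as 65002;               /* spine 2 */
            }
            neighbor 10.0.3.0 {
                peer-as 65003;               /* spine 3 */
            }
            neighbor 10.0.4.0 {
                peer-as 65004;               /* spine 4 */
            }
        }
    }
}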
 
Now let's walk through how lossless Ethernet would work across this Layer 3 design. The end-to-end lossless path is denoted by the double line. Assume that iSCSI traffic is sent from the server and received on the left-most leaf switch. The leaf switch uses a multi-field (MF) classifier to identify the iSCSI traffic and places it into the lossless forwarding class. The data is queued appropriately and transmitted to the left-most spine switch; as it leaves the leaf, the class-of-service marking is rewritten from IEEE 802.1p to DSCP so that it survives the Layer 3 hop. The traffic is received and classified on the left-most spine switch, but this time a behavior aggregate (BA) classifier is used. There's no need for the spine switches to install firewall filters to look for iSCSI traffic; we can trust the DSCP bits set by the leaves and place the traffic into the appropriate forwarding class. The spine switch then forwards the traffic to the right-most leaf switch, where the same thing happens: the traffic is classified through a behavior aggregate and queued into the appropriate forwarding class. Before the data is transmitted to the NAS device, the right-most leaf switch rewrites the marking one more time, converting DSCP back to IEEE 802.1p, and sends it on its way.
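To make the two classification styles concrete, here is a rough sketch of what they could look like in Junos. The filter name, forwarding class, TCP port, DSCP value (af31), and interface names are all assumptions made for the example. The first part is the multi-field classifier a leaf might apply to its server-facing port; the second part is the DSCP behavior aggregate classifier and rewrite rule that the fabric-facing ports on the leaves and spines would share.

/* Leaf, server-facing port: multi-field classifier matching iSCSI (TCP/3260) */
firewall {
    family inet {
        filter MF-ISCSI {
            term iscsi {
                from {
                    protocol tcp;
                    destination-port 3260;
                }
                then {
                    forwarding-class iscsi;
                    loss-priority low;
                    accept;
                }
            }
            term everything-else {
                then accept;
            }
        }
    }
}
interfaces {
    xe-0/0/10 {
        unit 0 {
            family inet {
                filter {
                    input MF-ISCSI;
                }
            }
        }
    }
}

/* Leaves and spines, fabric-facing ports: DSCP behavior aggregate and rewrite rule */
class-of-service {
    classifiers {
        dscp BA-ISCSI {
            forwarding-class iscsi {
                loss-priority low code-points af31;
            }
        }
    }
    rewrite-rules {
        dscp RW-ISCSI {
            forwarding-class iscsi {
                loss-priority low code-point af31;
            }
        }
    }
    interfaces {
        et-0/0/* {
            unit 0 {
                classifiers {
                    dscp BA-ISCSI;
                }
                rewrite-rules {
                    dscp RW-ISCSI;
                }
            }
        }
    }
}

On the right-most leaf the same idea applies in reverse: an ieee-802.1 rewrite rule on the NAS-facing port converts the DSCP marking back into an 802.1p code point before the traffic is handed off.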
 
It's like having your cake and eating it, too. You get all of the benefits of overlay technologies, including orchestration and provisioning, and on top of that you can converge both data and storage onto the same network infrastructure.
 
Go use the QFX5100 to build an underlay for VMware NSX or Juniper Contrail, and converge both data and storage onto the same networking infrastructure.