My “Metropolitan Area L2 Access Network Design” Explained
Aug 10, 2015
This is a guest blog post by Andrea Odorizzi, who received the most votes for his network design in our June Network Design Challenge. In his blog article he describes his network design in detail and the reasoning behind it.
My company is an ISP owned by local government, and the aim of the network design is to bring all the public administration offices of a small town onto a single self-owned network infrastructure instead of several separate ones, minimizing TCO by exploiting economies of scale. The design, already implemented in three cities of the Emilia-Romagna region here in Italy, relies on at least 2 Points of Presence (POPs) that are already connected to a larger MPLS network; their uplinks become the geographical connections of the MAN.
The physical topology, an SDH-like ring, is the result of a call for bids and introduces some critical reliability issues: the reachability of a generic Customer Premises Equipment (CPE) from the POPs depends on the behaviour of other CPEs. Some customers (healthcare services) have challenging SLAs and cannot rely on devices located in offices that are closed after 17:00.
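The dependency described above can be illustrated with a minimal sketch. It assumes a simplified model that is not part of the original design documents: the CPEs sit in a chain on the fiber path between two POPs, and a CPE that is powered off stops forwarding traffic for its neighbours. The function name and the data layout are hypothetical.

```python
from typing import List


def reachable_from_pops(cpe_up: List[bool]) -> List[bool]:
    """Return, for each CPE, whether at least one POP can still reach it.

    cpe_up[i] is True when the i-th CPE on the path between POP A
    (before index 0) and POP B (after the last index) is forwarding.
    This is a simplified model, assumed only for illustration.
    """
    n = len(cpe_up)

    # via_a[i]: every CPE between POP A and CPE i is up.
    via_a = [True] * n
    for i in range(1, n):
        via_a[i] = via_a[i - 1] and cpe_up[i - 1]

    # via_b[i]: every CPE between CPE i and POP B is up.
    via_b = [True] * n
    for i in range(n - 2, -1, -1):
        via_b[i] = via_b[i + 1] and cpe_up[i + 1]

    return [cpe_up[i] and (via_a[i] or via_b[i]) for i in range(n)]


# One switched-off CPE: the ring still reaches every other site.
print(reachable_from_pops([True, False, True]))

# Two switched-off CPEs: the sites between them are cut off from both POPs.
print(reachable_from_pops([True, False, True, True, False, True]))
```

In this model a single closed office is harmless, but two closed offices isolate every site between them, which is why sites with strict SLAs cannot depend on their neighbours.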
Obviously the network elements had to be inexpensive, with an extremely simple and regular configuration in order to simplify troubleshooting. Nevertheless, the CPEs had to be connected to the MAN core with GE links, with at least 3 levels of reliability: best effort, redundant uplinks and, finally, redundant CPEs with redundant uplinks.
Customers (local government offices, healthcare services, schools...) essentially need MPLS VPN services between local offices, or between local and remote ones. Leaving the configuration of the MPLS VPN CE-PE links unchanged was a constraint: the POPs were pure layer 2 equipment (PE-CE links with static routing only), so the MAN design had to take into account that the core MPLS network devices expect switch behavior.
Well, the MAN's equipment had to be as inexpensive as a switch, wire-speed like a switch, and had to behave like a switch... Using switching-only equipment seems to be the egg of Columbus. On top of the normal security policies (storm control, etc.), extra care is needed when selecting the loop avoidance protocols: good protocols are the ones that customers cannot affect, such as Virtual Chassis, ERPS, LAG and RTG.
Reliability and scalability issues are related to a heavy sharing ratio of the optical fiber, i.e., each link has to be shared between different kinds of customers. Someone told me that a problem that seems too big has to be faced from another point of view... Inexpensive passive OADMs can be used to multiplex at least 8 CWDM links onto a single fiber: a smart CWDM wavelength plan, CWDM coloured SFPs and passive CWDM components help to achieve good customer decoupling and classification. Moreover, increasing the number of CWDM links increases the deliverable bandwidth.
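A wavelength plan of this kind can be sketched in a few lines. The sketch below starts from the ITU-T G.694.2 CWDM grid (18 nominal centre wavelengths spaced 20 nm apart, from 1271 nm to 1611 nm; vendors often label them 1270 to 1610). Everything else is an assumption made for illustration: the choice of the upper eight channels for the passive multiplexer, the customer class names and the function name do not come from the original design.

```python
from typing import Dict, List

# Nominal CWDM centre wavelengths (nm) of the ITU-T G.694.2 grid.
CWDM_GRID_NM: List[int] = list(range(1271, 1612, 20))

# Assumption: an 8-channel passive mux that uses the upper eight channels.
MUX_CHANNELS_NM: List[int] = CWDM_GRID_NM[-8:]


def plan_wavelengths(customer_classes: List[str],
                     channels_nm: List[int]) -> Dict[str, int]:
    """Give each customer class its own wavelength on a shared fiber."""
    if len(set(customer_classes)) != len(customer_classes):
        raise ValueError("customer classes must be unique")
    if len(customer_classes) > len(channels_nm):
        raise ValueError("not enough CWDM channels for all customer classes")
    return dict(zip(customer_classes, channels_nm))


# Hypothetical customer classes, one colour each on the same fiber.
plan = plan_wavelengths(
    ["healthcare", "local-government", "schools"],
    MUX_CHANNELS_NM,
)
for customer_class, wavelength in plan.items():
    print(f"{customer_class}: {wavelength} nm")
```

Keeping each customer class on its own colour means that traffic of one class never shares a GE link with another class, even though all of them share the same physical fiber.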
This diagram explains how CWDM links and loop avoidance protocols could be exploited. A brief note on the EX-4200 distributed Virtual Chassis configuration: it allows 10GE VC links, a distributed LAG toward the MX-series MPLS core and, last but not least, CWDM colour reuse in each network section bounded by a pair of VC members.
The 10GE core links should avoid congestion for at least 3-5 years (around 50 initial customers, most of them with 5-10M bandwidth demands), but the placement of the ERPS RPL owners provides a first, trivial traffic engineering (TE) solution for traffic flows originated by bandwidth-greedy customers with relaxed SLAs (such as schools).
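The effect of RPL placement on link load can be shown with a small model. In ERPS the Ring Protection Link is blocked while the ring is healthy, so each node reaches the core through the side of the ring that does not cross the RPL. The sketch below assumes a simplified ring with a single hub (the real design has a pair of Virtual Chassis members per section), and the node names and demand figures are hypothetical.

```python
from typing import Dict, List

LINK_CAPACITY_MBPS = 10_000  # 10GE core link


def link_loads(ring: List[str],
               demand_mbps: Dict[str, float],
               rpl_link: int) -> List[float]:
    """Per-link load on a ring whose traffic all goes to ring[0] (the hub).

    Link i connects ring[i] and ring[(i + 1) % len(ring)].
    rpl_link is the index of the link blocked by ERPS in the idle state.
    """
    n = len(ring)
    if not 0 <= rpl_link < n:
        raise ValueError("rpl_link must be a valid link index")

    loads = [0.0] * n
    for pos in range(1, n):
        demand = demand_mbps.get(ring[pos], 0.0)
        if rpl_link >= pos:
            # RPL lies on the path through the higher indices:
            # use links 0 .. pos-1 back to the hub.
            used = range(0, pos)
        else:
            # RPL lies between the hub and this node on the other side:
            # use links pos .. n-1 around to the hub.
            used = range(pos, n)
        for link in used:
            loads[link] += demand
    return loads


ring = ["POP", "school", "clinic", "office"]
demand = {"school": 100.0, "clinic": 10.0, "office": 10.0}

# RPL next to the school: its traffic stays on its own link to the hub.
print(link_loads(ring, demand, rpl_link=1))  # [100.0, 0.0, 10.0, 20.0]

# RPL next to the hub: every flow shares the same side of the ring.
print(link_loads(ring, demand, rpl_link=3))  # [120.0, 20.0, 10.0, 0.0]

# Worst-case utilisation of a 10GE link in the second layout.
print(max(link_loads(ring, demand, rpl_link=3)) / LINK_CAPACITY_MBPS)
```

Moving the blocked link next to a bandwidth-greedy site keeps its flows off the links used by customers with stricter SLAs, with no configuration beyond choosing the RPL owner.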
Finally, a workaround that could be used to overcome an ERPS limitation (a VLAN trunked in the data channel of one ring cannot be trunked in the data channel of another ring): the external tag could be swapped by looping 2 GE access ports in each EX-4200 VC member.