Industry Solutions and Trends
Technology is more than just networking and Juniper experts share their views on all the trends affecting IT
The Airbender: 24Tbps@12kW and 8Tbps@5kW Core Routers
05.05.15

                          Fig0.png

While attending OFC two weeks ago, I found myself in quite a few meetings fielding questions such as “Are physical routers even needed anymore?” and “Is hardware dead, and will everything be SDN?” Meanwhile, in the Juniper booth stood the world’s most powerful core routers, the PTX5000 supporting 24Tbps of capacity and the PTX3000 supporting 8Tbps, effortlessly whirling air over this mind-boggling performance.

 

I wondered whether the people asking realized the gravity of those questions. As the head of product management at Juniper for core routers, Packet Optical, and the WAN SDN controller, I find the prediction of hardware’s impending demise a topic of great interest. In a world that continues to march forward with technological advancements, we can easily take for granted the engineering marvels all around us. Juniper’s engineering team pushed the boundaries of what was thought possible by taking a holistic architectural approach to building the most powerful, efficient, greenest, and most deployable core routers.

           

I would like to share some insights into how we built the 24Tbps-capable PTX5000 core router, which is 5 times more efficient than comparable solutions in the market, and the 8Tbps-capable PTX3000, which is 91% smaller than its nearest competitor.

 

Juniper core router power efficiency: a storied history

 

Before I dive into the details, let us take a quick look at what the PTX5000 and PTX3000 core routers are.

 

PTX5000= {IP Core Router, 36RU (2/3 rack), 3Tbps per Slot, 24Tbps per System, typical power under 12kW, maximum power under 18kW, NEBS compliance}

 

PTX3000= {IP Core Router, 22RU (half rack), 270mm in depth, 1Tbps per Slot, 8Tbps per System, typical power under 5kW, maximum power under 7kW, NEBS compliance}

 

The PTX series of routers is the result of several years’ worth of meticulous engineering effort. It certainly didn’t happen overnight; the process has actually taken close to 20 years of continuous innovation. Figure 1 shows how far ahead Juniper is compared to its closest competitor in terms of power efficiency and bandwidth efficiency in the core routing market.

 

                                      Fig1.0.png

                                      Fig1.1.png

 

                                        Figure 1. PTX bandwidth and power efficiency over closest competitor

 

A new core router design philosophy

 

It all started four years ago. Juniper was designing its next-generation core router, an effort that culminated in the PTX series of routers you see today. Quite a few fundamental issues had to be considered. For example, how and where was the network evolving? What were the design optimization points around silicon, memory, packaging, and power? What would customers value in a core router besides speeds and feeds? How would these products fit into the overall service provider architecture? All these questions led us to rethink the basic design philosophies of networking.

 

The network core is all about scale, speed, carrier-grade reliability, resiliency, and longevity, because core products have long depreciation periods and lifecycles. When people think about capacity, they generally think about ports per rack, ports per system, watts per gigabit (W/G), and so on. Basics like deployability, on the other hand, are rarely considered.

 

What is deployability? It’s how well a system fits into existing facilities, not just how much capacity it offers. Juniper engineered the PTX5000 and PTX3000 to be easily deployable in any facility within the existing power and cooling envelopes. While power is a huge topic today, this wasn’t always the case. Just a few years ago only the largest customers brought up power efficiency as an issue; now it’s commonplace in RFx documents from customers of all sizes. We had the foresight to realize that future core routers would need to stay within existing facility power limits, so this became a fundamental design goal for the PTX5000 and PTX3000.

 

The PTX5000 targets the largest nodes in the broad core market, with NEBS compliance as a design rule. As we know, 20kW is the highest power achievable with rack-oriented cooling before acoustics become unacceptable. In fact, many facilities can only provide high-power density up to 18kW per rack today. With more advanced cooling options under different operating rules (for example, without the NEBS requirement), it is possible to cool more per rack before the next breakpoint is hit, such as 27kW, 35kW, or even beyond. This has already happened in the data center. However, many operators have not planned for it, and it may require a facility upgrade at a cost. We wanted to offer an option that operators could adopt without thinking twice. With this in mind, we established 18kW as the maximum power for the 24Tbps PTX5000.

 

An 18kW design target for maximum power consumption translates to 0.5W/G of typical power for a 24Tbps NEBS-compliant system. The broad market still mandates some hard operational rules, for example NEBS compliance and provisioning based on maximum power draw. As a rule of thumb, an additional budget of 40~50% is added on top of the typical power consumption to calculate the worst-case power draw, i.e. under a cooling failure in the facility. An 18kW maximum power target is therefore equivalent to a typical power consumption of about 12kW, and 12kW across 24Tbps is 0.5W/G. Remember that most of the systems in the market are well above 1W/G as of today.
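The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration of the rule of thumb described above: the function names are hypothetical, and the 12kW and 24Tbps values are the design targets quoted in this post rather than measurements.

```python
def watts_per_gigabit(power_kw: float, capacity_tbps: float) -> float:
    """Return power efficiency in watts per Gbps (W/G)."""
    return (power_kw * 1000.0) / (capacity_tbps * 1000.0)


def worst_case_power_kw(typical_kw: float, margin: float) -> float:
    """Apply a provisioning margin (e.g. 0.5 for 50%) on top of typical power."""
    return typical_kw * (1.0 + margin)


CAPACITY_TBPS = 24.0   # PTX5000 system capacity (from the post)
TYPICAL_KW = 12.0      # typical power target (from the post)

efficiency = watts_per_gigabit(TYPICAL_KW, CAPACITY_TBPS)
max_at_40 = worst_case_power_kw(TYPICAL_KW, 0.40)
max_at_50 = worst_case_power_kw(TYPICAL_KW, 0.50)

print(f"typical efficiency: {efficiency:.2f} W/G")
print(f"worst case with 40% margin: {max_at_40:.1f} kW")
print(f"worst case with 50% margin: {max_at_50:.1f} kW")
```

With the upper end of the 40~50% margin, the worst case lands on the 18kW maximum power target.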

 

In addition to capacity and power efficiency, the PTX3000 also added “depth” as a design factor, since depth is critical for fitting into dimensionally stringent ETSI racks. One of our goals was to drive a converged Supercore for Packet Optical convergence, which requires a router no more than 300mm deep that can fit in an ETSI rack in metro facilities. This makes it possible to introduce IP/MPLS and optical integration directly into transport facilities, lead the TDM transition to packets, and drive Packet Optical convergence, i.e. cross the boundary between Packet and Optical. Additionally, Juniper builds the PTX Series to be front accessible, including line cards, power supplies, fan trays, etc., making it suitable for operating practices in Telco, MSO Cable, and other facilities, where a floor with high-perforation tiles will only allow up to 1,177CFM (cubic feet per minute) of air, roughly the air volume needed to cool 8kW. In the end, we established a goal for the PTX3000 to achieve 8Tbps within less than 7kW of maximum power to accommodate future growth.
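The relationship between airflow and the heat it can carry away can be sanity-checked with the common sensible-heat approximation for air, Q [BTU/hr] ≈ 1.08 × CFM × ΔT [°F]. The sketch below is not Juniper's thermal model: the 1.08 factor assumes standard sea-level air density, the function names are hypothetical, and the post does not state a temperature rise, so the code only derives the rise implied by the 1,177CFM and 8kW figures.

```python
BTU_PER_HOUR_PER_WATT = 3.412  # unit conversion: 1 W = 3.412 BTU/hr
SENSIBLE_HEAT_FACTOR = 1.08    # BTU/hr per (CFM * degF); assumes standard air density


def implied_temp_rise_f(heat_watts: float, airflow_cfm: float) -> float:
    """Air temperature rise (degF) needed to carry heat_watts at airflow_cfm."""
    heat_btu_per_hour = heat_watts * BTU_PER_HOUR_PER_WATT
    return heat_btu_per_hour / (SENSIBLE_HEAT_FACTOR * airflow_cfm)


def required_cfm(heat_watts: float, temp_rise_f: float) -> float:
    """Airflow (CFM) needed to carry heat_watts at a given temperature rise."""
    heat_btu_per_hour = heat_watts * BTU_PER_HOUR_PER_WATT
    return heat_btu_per_hour / (SENSIBLE_HEAT_FACTOR * temp_rise_f)


# Figures quoted in the post: 1,177 CFM of air for roughly 8 kW of heat.
rise_f = implied_temp_rise_f(heat_watts=8000.0, airflow_cfm=1177.0)
rise_c = rise_f * 5.0 / 9.0  # a temperature difference converts without the 32 offset

print(f"implied air temperature rise: {rise_f:.1f} degF ({rise_c:.1f} degC)")
```

Under these assumptions, the two figures from the post correspond to an air temperature rise of a little over 21°F (about 12°C) across the equipment.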

 

Besides the deployability discussed above, there are certainly more areas to consider. Back in 2011, the PTX was the first core router to introduce the lean core concept to the market, driving network simplification, superior performance, and efficiency. This was a revolutionary approach at a time when the popular design philosophy was still focused on feature-rich core routers. Since then, Juniper has built up and dominated lean core router deployments. Many operators across different market segments have adopted the concept in an effort to simplify their networks.

 

The innovations developed on the PTX series for the lean core, such as on-chip lookup and an enhanced VoQ architecture, achieved some KPIs that were hard to fathom for a core router: for example, extremely low forwarding latency, much better than that of some OTN switches; extremely low power consumption with NEBS compliance; and much faster convergence for network resiliency.

 

From an Internet architecture point of view, the lean core is deployed at the inner core of the core network, but many operators still have an outer core layer for peering, IP backbone, and other applications that require full Internet tables. So we asked ourselves whether the benchmarks achieved on the lean core product would also be achievable for IP/MPLS backbone and peering routers, which need the full Internet table. That eventually was established as another design goal.

 

A race against physics limits

 

While these goals sound straightforward, they represent multi-dimensional nonlinear design challenges touching network architecture, system architecture, silicon design, memory selection, system packaging, power management, and thermal control. 

 

From the very beginning, Juniper fully recognized this would be an uphill battle. Those design goals would not be achievable if the engineering approach continued to evolve in a linear way. It was, in fact, a race against physics limits. Out-of-the-box thinking and a rethinking of engineering design rules enabled a host of technical innovations that pushed past what was once thought impossible.

 

A previous blog explained how we achieved efficiency at the silicon and memory level, so I’ll keep this discussion focused on the system level. For silicon and memory, of course, the work goes far beyond driving efficiency through new technologies; for example, we had to consider the evolution of the network architecture, and then optimize around various table sizes, RTT buffers, etc. That topic is out of scope here, and I’d like to cover it in a separate blog another day.

 

Achieving something far ahead of the market requires a lot of innovation. Combining packet lookup and fabric logic into a single 500Gbps full-duplex chip, ExpressPlus, saves valuable board space. It enables engineering to fit more chips (six ExpressPlus chips) within the same footprint as the previous-generation line card. Figure 2 shows a high-level abstraction of the line card and fabric architecture.

 

                                                                     Fig2.png

 

                                                 Figure 2. High-level abstraction of PTX series system design
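As a quick check of how the chip-level numbers add up to the system capacity, the sketch below multiplies them out. The constant names are illustrative, and the slot count is not stated in this post; it is derived here from the per-line-card result and the 24Tbps-per-system figure.

```python
CHIP_GBPS = 500            # one ExpressPlus chip, full duplex (from the post)
CHIPS_PER_LINE_CARD = 6    # six ExpressPlus chips per line card (from the post)
SYSTEM_GBPS = 24_000       # PTX5000 system capacity, 24Tbps (from the post)

line_card_gbps = CHIP_GBPS * CHIPS_PER_LINE_CARD

# Derived, not quoted: how many such line cards make up the full system.
implied_slots, remainder = divmod(SYSTEM_GBPS, line_card_gbps)

print(f"per line card: {line_card_gbps / 1000:.0f} Tbps")
print(f"implied slots for 24 Tbps: {implied_slots} (remainder {remainder} Gbps)")
```

Six 500Gbps chips give the 3Tbps per slot listed for the PTX5000, and the system capacity divides evenly into line cards of that size.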

 

The line card is designed in a modular way, featuring an orthogonal connection to the fabric; this removes the need for high-speed traces on the PTX5000 midplane. Modular I/O supports versatile I/O types, with pay-as-you-grow flexibility that allows rapid adoption of new interface types and accelerates the qualification cycle for new I/O.

 

The pay-as-you-grow approach is very important for many core applications: it allows users to grow capacity gradually and to take advantage of new technologies for capacity growth in the same footprint. Without it, many customers would be forced to pay a large upfront price for such a large-capacity system while the attach rate is still low. There are other cost-saving design criteria as well, for instance common sparing for products within the same product family. In short, we decided to focus on modularity, rather than simply trying to claim victory by building the industry’s largest line card in a monolithic configuration.

 

The line card can accommodate two mixed I/O modules (PICs) on a common Flexible PIC Concentrator (FPC). In addition to the currently shipping dense 10GE, 40GE, and 100GE PICs, we are also introducing a 3Tbps FPC with a few new PICs to the family.

 

                                                             Fig3.png

                                                             Figure 3. PTX Series PIC and FPC products

 

The modular design introduces a level of complexity. For example, the handle on each PIC takes up an extra two inches on each side of the PIC, and additional logic and space are needed to connect the PICs to the FPC. Each of these design features consumes valuable space, creating challenges for board layout, airflow management, thermal control, etc. Nothing comes for free. Designers had to figure out how to optimize the whole system in a systematic way and accommodate these requirements without sacrificing the associated benefits. Many of the innovations on the chassis are under the hood. To name a few: we reduced the high-speed link length through advanced placement and routing; reduced the need for retimers to gain higher speed and performance; and employed advanced airflow management and dynamic power binning to lower component operating temperature and heat dissipation. We also separated cooling zones so that not every fan speed has to be adjusted unnecessarily, and applied extensive telemetry, monitoring over 1000 sensors, to control airflow and temperature.

 

When we set 18kW as a design goal for the 24Tbps PTX5000, it was clear that we couldn’t stop there. New technologies and new operational rules will always continue to emerge, so the system had to have sufficient margin to adopt future-generation technology and support further growth. That’s the basic rule for longevity, and we designed the system with that in mind. For example, we built a redundant, modular power supply system based on power supply modules (PSMs), and not all of the PSMs are needed to support 24Tbps. Figure 4 shows how the PDU and PSMs are structured. Each redundant PDU can accommodate eight PSMs; only four PSMs are needed for a PTX5000 fully populated with 2Tbps line cards, and six are needed when it is fully loaded with 3Tbps line cards. The remaining two PSM positions are reserved for future growth.

 

                                                                  Fig4.png

                                                                   Figure 4. PDU and PSM architecture
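The headroom in this power design can be tallied with a small sketch. The PSM counts come from the paragraph above and are read here as per-PDU figures; the function name is hypothetical, and treating every PSM as identical is an assumption made only to express the spare positions as a percentage.

```python
PSM_POSITIONS_PER_PDU = 8  # each redundant PDU accommodates eight PSMs (from the post)

# PSMs needed for a fully populated PTX5000 (from the post, read as per PDU)
PSMS_NEEDED = {
    "2Tbps line cards": 4,
    "3Tbps line cards": 6,
}


def spare_positions(needed: int, total: int = PSM_POSITIONS_PER_PDU) -> int:
    """Number of PSM positions left free for future growth."""
    if not 0 <= needed <= total:
        raise ValueError("needed PSMs must fit within the PDU")
    return total - needed


for config, needed in PSMS_NEEDED.items():
    spare = spare_positions(needed)
    # Assumes identical PSMs, so spare positions scale power capacity linearly.
    growth_pct = 100.0 * spare / needed
    print(f"{config}: {needed} PSMs in use, {spare} spare (+{growth_pct:.0f}% headroom)")
```

Even with 3Tbps line cards in every slot, two of the eight positions stay free, which is the margin set aside for future growth.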

 

The PTX3000, packed into a chassis the size of a suitcase, aims for the same efficiencies as the PTX5000, especially where space, power, and performance are concerned. At 22 rack units high and 270mm deep, with everything front accessible, the PTX3000 was built with five PSMs that can accommodate incremental capacity growth, following the same longevity goal as the PTX5000. Note that the depth is 270mm, less than the 300mm target: engineering overachieved the goal.

 

                                                               Fig5.png

                                                               Figure 5. PTX3000 PSM Architecture

 

Packing 8Tbps into a suitcase-sized chassis with NEBS compliance was considered mission impossible. Because of its smaller dimensions, its efficiency would be lower than that of a bigger chassis, but we wanted to keep the two in the same ballpark as much as possible. If you pay attention, some of the innovation is visible from the outside. Figure 6 shows a curved air filter. The curve increases the surface area, so the filter can pull in more air than a regular design.

 

                                                                        Fig6.png

                                                                       Figure 6. Curved air filter

 

In the end we achieved levels of efficiency even better than the original goals. The 24Tbps PTX5000 hits the target of 18kW maximum power consumption and 12kW typical power, for 0.5W/G efficiency, while the PTX3000 achieves less than 5kW typical power consumption with less than 7kW of maximum power draw. In addition, we overachieved on the other goals, including delivering a full IP core router with power efficiency better than previous generations of lean core hardware. As the following chart shows, while the latest PTX5000 has 6x the capacity of the first-generation product, power consumption has grown less than 2x.

 

 

                                          Fig7.png

                                            Figure 7. Capacity and Power consumption between generations
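The efficiency gain implied by these two ratios can be worked out directly. The sketch below uses only the 6x and 2x figures quoted above; since power grew by less than 2x, the result is a lower bound on the improvement, and the variable names are illustrative.

```python
CAPACITY_GROWTH = 6.0      # latest PTX5000 vs. first generation (from the post)
POWER_GROWTH_BOUND = 2.0   # power grew by less than this factor (from the post)

# W/G scales as power / capacity, so the new W/G relative to the old one is
# bounded above by POWER_GROWTH_BOUND / CAPACITY_GROWTH.
relative_wpg_bound = POWER_GROWTH_BOUND / CAPACITY_GROWTH
min_efficiency_gain = CAPACITY_GROWTH / POWER_GROWTH_BOUND

print(f"new W/G is below {relative_wpg_bound:.2f}x the first-generation W/G")
print(f"efficiency improved by more than {min_efficiency_gain:.0f}x")
```

In other words, each gigabit on the latest generation costs less than a third of the power it did on the first generation.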

 

Outlook

 

The race against physics limits is not over. After all, traffic growth will not stop. We will always need a hardcore mentality to keep pushing the limits forward. Every such step will lift us to the next level. With that in mind, I hope people will be excited to see what comes next.

 

This is not the last Airbender, yet!

 

                                                      Fig8.png

 

 

05.01.15
Tobias

Juniper have a great solution for the SuperCore with the PTX series, but we are not seeing Juniper going after the 100G Metro market, sometimes called DCI (Data-center interconnect).

 

This market is growing at a very fast pace and Juniper will have to innovate again to address it.

 

In my opinion the PTX 5k and 3k are not suitable for Metro 100G because of their size; even the PTX 3k is still too big.

 

Any information on that?

05.05.15
Juniper Employee

Hi Tobias, 

 

Thanks for your comments.

 

Actually, we already have PTX in production deployments carrying live traffic for DCI applications. We drive DCI via our 100G Coherent transponder integration on PTX, which comes in flavors for both Metro and Longhaul. You may check out the product details at the following URL, plus another blog regarding Packet Optical convergence.

 

http://www.juniper.net/techpubs/en_US/release-independent/junos/topics/reference/general/pic-ptx-ser...

 

http://forums.juniper.net/t5/Industry-Solutions-and-Trends/Packet-Optical-Convergence-Bring-Down-the...

 

Regarding your comments about a smaller-size solution, if you prefer, we can touch base with you and exchange thoughts on this.

 

Thanks for the feedback.

 

Jun