Routing

Dual uplink to upstream ISP & forwarding table convergence

‎08-13-2014 03:37 PM

I have a single MX router with two uplinks to my single upstream provider, where I connect to routers in two locations. In normal cases, I want all of my traffic to flow symmetrically on the primary uplink to RouterA, and only use the link to RouterB when the primary uplink is down.  I want to receive a full BGP table from both routers, and I will send out my class B to be advertised.  I am using BFD to trigger on a link failure, with a 999 millisecond interval and a multiplier of 10 (about 10 seconds).

 

Everything works well except that the failover takes longer than expected, about 70 to 80 seconds:

 

[Screenshot attached: Screen Shot 2014-08-13 at 6.07.57 PM.png]

If my primary link fails, BFD kicks in and takes down the primary BGP session as expected.  Because I use AS-path prepending, I can see that the path via RouterB over my backup link propagates into my ISP's routing tables very quickly; I can verify this on my provider's BGP looking glass.  My own routing table, where I use BGP local preference to favor the routes from RouterA, also appears to react fairly quickly.  The issue is that my MX240 takes a long time to populate the forwarding table with the updated routes.  So even though my ISP updates its routing table within about 12 seconds (including the BFD timer), it can take about 70 to 80 seconds before my MX router has reprogrammed the PFEs with the new routes.
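
One way to observe the gap between the routing table and the forwarding table during a failover test is sketched below. This is only an illustration using standard Junos operational-mode commands; the prefix 198.51.100.0/24 is a placeholder, not a route from my network, and should be replaced with a prefix learned from the ISP.

```
show bfd session
show bgp summary
show route 198.51.100.0/24 detail
show route forwarding-table destination 198.51.100.0/24
show krt queue
```

The first two commands show when BFD and BGP notice the failure, the next two compare the active route in the routing table with what is actually installed for forwarding, and the last shows whether route changes are still queued for installation toward the forwarding table.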
 
Is there any mechanism to speed up population of the forwarding table when dealing with a full Internet routing table?  I would like to get the total convergence time down to about 10 to 20 seconds.
 
I have thought about BGP load balancing with multipath, but I really want to keep my active/passive architecture. Ideally, if there were some way to pre-populate the forwarding table with a backup entry alongside the active forwarding path for each route, that would probably solve the problem.  Is there a solution along these lines?

 

 

I am including a snippet of my local MX config below.  Can you make any suggestions?

 

Thanks.

 

Clarke Morledge

 

 

[edit policy-options policy-statement bgp-isp-router-b-out]
term local-16 {
    from {
        route-filter 192.168.0.0/16 exact;
    }
    then {
        as-path-prepend "65002 65002 65002 65002 65002 65002 65002 65002 65002";
        accept;
    }
}

[edit policy-options policy-statement bgp-isp-router-a-out]
term local-16 {
    from {
        route-filter 192.168.0.0/16 exact;
    }
    then {
        as-path-prepend "65002 65002 65002 65002 65002 65002 65002";
        accept;
    }
}
[edit policy-options policy-statement bgp-isp-router-b-in]
term default {
    then {
        local-preference 285;
        accept;
    }
}
[edit policy-options policy-statement bgp-isp-router-a-in]
term default {
    then {
        local-preference 290;
        accept;
    }
}
[edit protocols bgp]
group isp-router-a {
    type external;
    import bgp-isp-router-a-in;
    export bgp-isp-router-a-out;
    peer-as 65001;
    bfd-liveness-detection {
        minimum-interval 999;
        multiplier 10;
    }
    neighbor 172.16.0.2;
}
group isp-router-b {
    type external;
    import bgp-isp-router-b-in;
    export bgp-isp-router-b-out;
    peer-as 65001;
    bfd-liveness-detection {
        minimum-interval 999;
        multiplier 10;
    }
    neighbor 172.16.1.2;
}

 

 

 


Re: Dual uplink to upstream ISP & forwarding table convergence

‎08-13-2014 09:18 PM

If the two links are to the same provider, and you want an active/passive architecture, I don't see why you want a full BGP table.  I don't see the usefulness of a full table (or full tables) in this setup.  If you just received a default route from both peers and preferred one of them, you'd have only one route to converge on failover, and I imagine that would converge faster than a full table.  I'd recommend reassessing the need for a full BGP table.


Re: Dual uplink to upstream ISP & forwarding table convergence

‎08-14-2014 04:22 AM

I understand the advantages of handling only a default route, but I do have other reasons for receiving a full BGP table. I simplified/obfuscated our topology in my post: we also take in BGP feeds from other providers for different networks and different purposes.  So I really cannot stop taking in the full Internet table without a major architectural redesign, which is what I was trying to avoid in the first place.

 

I should have clarified that from the outset of my post.

 

Clarke


Re: Dual uplink to upstream ISP & forwarding table convergence

‎08-14-2014 09:52 AM

OK.  But unless you have other ISP peers that provide full tables which you use to distribute outbound/Internet traffic, I'm skeptical of the utility of the full tables from the two peers in question, which I presume will be identical.  But I don't know your network, of course.

 

The only possible solution I see is using ECMP with the BGP peers.  It will break the active/passive model, but in theory both routes will be in the forwarding table, so there would be less control plane to forwarding plane chatter needed if one peer fails.  In practice, though, I'm really not sure whether or how much it would help the convergence time; I haven't done anything like it.  Here are a few links on the subject:

 

http://www.juniper.net/techpubs/en_US/junos13.2/topics/example/routing-selecting-multiple-equal-cost...

 

http://www.juniper.net/techpubs/en_US/junos14.1/topics/topic-map/bgp-multipath.html
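
For what it's worth, a minimal sketch of what the ECMP approach might look like on top of the posted config is below. It is untested and rests on some assumptions: the policy name lb-per-flow is made up for illustration, and BGP multipath generally requires the candidate paths to tie in path selection, so the two import policies would need to set the same local preference instead of 290 and 285.

```
protocols {
    bgp {
        group isp-router-a {
            /* allow more than one equal BGP path to be active */
            multipath;
        }
        group isp-router-b {
            multipath;
        }
    }
}
policy-options {
    /* hypothetical policy name */
    policy-statement lb-per-flow {
        then {
            load-balance per-packet;
        }
    }
}
routing-options {
    forwarding-table {
        /* install all equal-cost next hops in the PFE */
        export lb-per-flow;
    }
}
```

Despite the keyword, load-balance per-packet results in per-flow hashing on MX hardware, so packets within a single flow should stay on one link.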


Re: Dual uplink to upstream ISP & forwarding table convergence

‎08-21-2014 12:15 PM

I did try the load balancing and ECMP approach you referenced.  It does break my active/passive design, and while there is some improvement in convergence time, it is not nearly as much as I would like.

 

The story is this: if a route from my upstream provider shows up on both BGP feeds, one from each ISP router, the router installs an entry for both links in the forwarding table.  If one of the links dies while I am receiving a full Internet routing table (currently about half a million IPv4 routes), it takes about 10 seconds for traffic to begin flowing properly once BGP updates the routing tables. It is difficult to say how much of that time is spent updating the routing table vs. deleting the stale forwarding entries.  This is not stellar performance, but it is acceptable.

 

However, if there is any variation in routes coming from my provider, it can still take up to about 90 seconds to fully move traffic over from the "dead" link to the live link. 

 

It is important to note that in my environment, I do not have a dedicated point-to-point Ethernet link between my router and either of my ISP's routers.  So if there is a fiber cut or similar failure between my router and my ISP, I must rely on BFD to notify the router of the outage.

 

I am really puzzled as to why JUNOS is not failing over better.

 

Clarke Morledge

College of William and Mary


Re: Dual uplink to upstream ISP & forwarding table convergence

‎08-24-2014 11:21 AM

As far as I understand, you need the whole Internet table for routing, but that leaves you with a scalability issue on your forwarding plane.

 

 

I would propose the following:

  1. Create a generated default route whose only contributing routes are those you receive from your ISP. In your scenario, this generated default route will point to either router A or router B.
  2. Other prefixes you receive from other ISPs will remain active (if the BGP path selection algorithm considers them "better" than those from ISPX).
  3. In the forwarding-table export policy (routing-options forwarding-table export), reject all the contributing routes you received from ISPX but accept the generated default route. With far fewer routes to install in the PFE, this step should be quicker.
  4. Because of point 2, those particular routes will still be exported to the PFE and used for forwarding.
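
The steps above could be sketched roughly as follows. This is an untested outline, not a verified configuration. The community from-ispx, its value 65002:100, and the policy names isp-contributors and fib-export are hypothetical names introduced here for illustration; the import policy names are taken from the original post. It assumes the ISPX routes are tagged with the community on import so that later policies can identify them.

```
policy-options {
    /* hypothetical community used to mark routes learned from ISPX */
    community from-ispx members 65002:100;
    policy-statement bgp-isp-router-a-in {
        term default {
            then {
                community add from-ispx;
            }
        }
    }
    policy-statement bgp-isp-router-b-in {
        term default {
            then {
                community add from-ispx;
            }
        }
    }
    /* step 1: only ISPX routes may contribute to the default */
    policy-statement isp-contributors {
        term from-isp {
            from {
                protocol bgp;
                community from-ispx;
            }
            then accept;
        }
        term everything-else {
            then reject;
        }
    }
    /* step 3: keep the default in the PFE, keep ISPX BGP routes out */
    policy-statement fib-export {
        term keep-default {
            from {
                protocol aggregate;
                route-filter 0.0.0.0/0 exact;
            }
            then accept;
        }
        term drop-ispx-routes {
            from {
                protocol bgp;
                community from-ispx;
            }
            then reject;
        }
    }
}
routing-options {
    generate {
        route 0.0.0.0/0 policy isp-contributors;
    }
    forwarding-table {
        export fib-export;
    }
}
```

The keep-default term comes first so the generated route is accepted before the reject term is evaluated. Routes from other providers match neither term and fall through to the default action, so they should still be installed in the PFE as described in point 4.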