SRX Services Gateway

RPM not working quite like I expect

‎10-19-2016 06:14 PM

I have a small cluster of SRX220s (version 12.3X48-D30.7). They share a single reth0 interface that carries a couple of tagged subinterfaces. The reth terminates into a pair of EX3300s (Node0 -> VC member 0, Node1 -> VC member 1). Works great.

 

We have two ISP links. ISP-A is an SDWAN box with multiple aggregated DSL links, and ISP-B is a cellular backup. Normally the RPM configuration used to fail over between them is pretty trivial, but this one is really trying my patience and I have to be missing something. Because of the nature of the SDWAN box, it can be effectively offline without the link itself going down, so we ping a remote IP that we own. The problem is that once RPM detects that the primary SDWAN connection is offline and fails over to the cellular carrier, the ping test that should go out the SDWAN interface starts using the cellular next-hop instead. You can see it in the logs:

 

Oct 19 18:11:55 18:11:55.788108:CID-1:RT:Doing DESTINATION addr route-lookup

Oct 19 18:11:55 18:11:55.788108:CID-1:RT:flow_ipv4_rt_lkup success 198.97.x.y, iifl 0x0, oifl 0x79

Oct 19 18:11:55 18:11:55.788108:CID-1:RT:Checking in-ifp from .local..0 to reth0.3 for src: 192.132.61.122 in vr_id:0

Oct 19 18:11:55 18:11:55.788108:CID-1:RT:  routed (x_dst_ip 198.97.x.y) from junos-host (.local..0 in 0) to reth0.4, Next-hop: 192.168.0.1

Oct 19 18:11:55 18:11:55.788108:CID-1:RT:flow_first_policy_search: policy search from zone junos-host-> zone Public (0x0,0x6697,0x6697)

 

I've configured this:

 

rpm {
    probe Probe-Savers-Services {
        test Ping-SSC-Router-VIP {
            target address 198.97.x.y;
            probe-count 3;
            probe-interval 5;
            test-interval 5;
            source-address 192.132.61.122;
            routing-instance sdwan;
            thresholds {
                successive-loss 10;
            }
            destination-interface reth0.3;
            next-hop 192.132.61.121;
        }
    }
}
ip-monitoring {
    policy Service-Tracking {
        match {
            rpm-probe Probe-Savers-Services;
        }
        then {
            preferred-route {
                route 0.0.0.0/0 {
                    next-hop 192.168.0.1;
                    preferred-metric 4;
                }
            }
        }
    }
}

Here is the routing-options:

 

graceful-restart;
interface-routes {
    rib-group inet rpm-group;
}
static {
    route 0.0.0.0/0 next-hop 192.132.61.121;
}
aggregate {
    route 10.19.51.0/24 policy agg-routes;
}
rib-groups {
    rpm-group {
        import-rib [ inet.0 sdwan.inet.0 ];
    }
}
autonomous-system 65001;

and the routing-instances I am trying to make work:

 

cradle-point {
    instance-type forwarding;
    routing-options {
        static {
            route 0.0.0.0/0 next-hop 192.168.0.1;
        }
    }
}
sdwan {
    instance-type forwarding;
    routing-options {
        static {
            route 0.0.0.0/0 next-hop 192.132.61.121;
        }
    }
}

and the interface:

unit 3 {
    description "SD-WAN Uplink Interface";
    vlan-id 3;
    family inet {
        address 192.132.61.122/30;
    }
}
unit 4 {
    description "Cellular Carrier";
    vlan-id 4;
    family inet {
        address 192.168.0.125/24;
    }
}

and here is the routing table when the RPM is in the "failed" state:

 

inet.0: 21 destinations, 24 routes (21 active, 0 holddown, 0 hidden)
Restart Complete
+ = Active Route, - = Last Active, * = Both

0.0.0.0/0          *[Static/4] 00:43:39, metric2 0
                    > to 192.168.0.1 via reth0.4
                    [Static/5] 1d 03:45:51
                    > to 192.132.61.121 via reth0.3

 

 What am I doing wrong?


Re: RPM not working quite like I expect

‎10-23-2016 03:25 AM

Hi, 

 

"once it detects the primary SDWAN connection is offline it fails over to the Cellular carrier." This can be seen from the change in the default route's next-hop.

 

If the SDWAN comes back online, does the next-hop change back? If it does, I don't think there is a problem here.
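
One way to check this is with operational commands along the following lines. This is only a sketch: the owner and test names are taken from the configuration posted above, and no output is shown because it depends on the state of the device.

```
show services rpm probe-results owner Probe-Savers-Services test Ping-SSC-Router-VIP
show services ip-monitoring status
show route 0.0.0.0/0 exact
```

If the probe results show successes again but the Static/4 route to 192.168.0.1 stays active, the probe itself is probably following the failed-over default route.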

 

"However the ping test out the SDWAN interface continues to use the cellular next-hop." I think this follows from the current routing table entry for the given target (destination).
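
If that is the cause, a common workaround is to pin the probe target to the SDWAN gateway with a host route, so the probe does not follow the default route after failover. The fragment below is a minimal sketch, not a tested fix. It assumes the probe lookup happens in inet.0 (as the vr_id:0 in the trace suggests) and that 198.97.x.y is used only as a probe target, because all traffic to that address would then always leave via reth0.3.

```
routing-options {
    static {
        /* Assumed addition: keep the RPM target reachable only via the SDWAN gateway */
        route 198.97.x.y/32 next-hop 192.132.61.121;
    }
}
```

With a more specific route in place, the preferred default route installed by ip-monitoring should no longer affect where the probe is sent.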

 

 

Regards,
Pradeep 2xJNCIE(SEC/ENT)