SRX Services Gateway
Contributor
dark1587
Posts: 79
Registered: ‎08-01-2008
Accepted Solution

RETH Interface Monitoring on LACP Bundle?

Hey Everyone,

I'm running into some oddities with an SRX1400 cluster running 11.4R1. I'm labbing out a setup for a customer that uses LACP for link redundancy, and I noticed some odd behavior when I check the chassis cluster status. I put ge-2/0/1, ge-2/0/3, ge-6/0/1, and ge-6/0/3 into reth1 on this cluster and set up LACP as per Juniper's documentation:

 

ge-2/0/1 {
    description "Trunk to EX3200 Core - reth1";
    gigether-options {
        redundant-parent reth1;
    }
}
ge-2/0/3 {
    description "Trunk to EX3200 Core - reth1";
    gigether-options {
        redundant-parent reth1;
    }
}
ge-6/0/1 {
    description "Trunk to EX3200 Core - reth1";
    gigether-options {
        redundant-parent reth1;
    }
}
ge-6/0/3 {
    description "Trunk to EX3200 Core - reth1";
    gigether-options {
        redundant-parent reth1;
    }
}

reth1 {
    description "Trunk to EX3200 Core - memebers ge-2/0/1 and ge-6/0/1 ";
    vlan-tagging;
    redundant-ether-options {
        redundancy-group 1;
        lacp {
            passive;
        }
    }
    unit 10 {
        vlan-id 10;
        family inet {
            filter {
                input ZscalerRedirect;
            }
            address a.a.a.a/a;
        }                               
    }
    unit 20 {
        vlan-id 20;
        family inet {
            address b.b.b.b/b;
        }
    }
    unit 30 {
        vlan-id 30;
        family inet {
            address c.c.c.c/c;
        }
    }
    unit 40 {
        vlan-id 40;
        family inet {
            address d.d.d.d/d;
        }
    }
}

For interface monitoring I set up redundancy group 1 with the following snippet. The behavior I want is that both interfaces on node0 have to fail before the group fails over to node1. So I set the priority to 254 on node0 and 2 on node1, and the idea is to subtract 240 for each lost link on node0 and 1 for each lost link on node1.

 

redundancy-group 1 {
    node 0 priority 254;
    node 1 priority 2;
    preempt;
    interface-monitor {
        ge-2/0/1 weight 240;
        ge-2/0/3 weight 240;
        ge-6/0/1 weight 1;
        ge-6/0/3 weight 1;
    }
}

Now here's the weird part. When I simulate a single link failure on node0 and run "show chassis cluster status", the priority for node0 never changes. Node0 stays at priority 254, even though the jsrpd log shows the subtraction. If I fail both links, the priority for node0 drops to 0 and the cluster fails over (as expected, I might add).

 

Even worse, when I fail both links on node1, the priority never drops to 0 at all. See below:

 

chaynes@cc5813-srx1400-pri> show interfaces ge-6* terse    
Interface               Admin Link Proto    Local                 Remote
ge-6/0/1                up    down
ge-6/0/1.10             up    down aenet    --> reth1.10
ge-6/0/1.20             up    down aenet    --> reth1.20
ge-6/0/1.30             up    down aenet    --> reth1.30
ge-6/0/1.40             up    down aenet    --> reth1.40
ge-6/0/1.32767          up    down aenet    --> reth1.32767

ge-6/0/3                up    down
ge-6/0/3.10             up    down aenet    --> reth1.10
ge-6/0/3.20             up    down aenet    --> reth1.20
ge-6/0/3.30             up    down aenet    --> reth1.30
ge-6/0/3.40             up    down aenet    --> reth1.40
ge-6/0/3.32767          up    down aenet    --> reth1.32767 
{primary:node0}
chaynes@cc5813-srx1400-pri> show chassis cluster status                     
Cluster ID: 1 
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 0 , Failover count: 1
    node0                   129         primary        no       no  
    node1                   128         secondary      no       no  

Redundancy group: 1 , Failover count: 11
    node0                   254         primary        yes      no  
    node1                   2           secondary      yes      no  

Has anyone seen this before, or is it worth sending up to JTAC to see what they say? Or am I missing something in my logic?


---
JNCIE-SEC #69, JNCIP-ENT, JNCSP-SEC, JNCIS-SA, JNCIS-AC, JNCIA-IDP, JNCIA-WX
Distinguished Expert
Screenie
Posts: 1,076
Registered: ‎01-10-2008

Re: RETH Interface Monitoring on LACP Bundle?

Hmm, did you try monitoring the reth interface with a weight of 255?

best regards,

Screenie.
Juniper Ambassador,
JNCIA IDP AC WX JNCIS FW SSL JNCIP SEC ENT SP JNCI

Contributor
dark1587
Posts: 79
Registered: ‎08-01-2008

Re: RETH Interface Monitoring on LACP Bundle?

With regard to node1, yes I did, and the cluster status for node1 still stays at 2. However, jsrpd does show the value being subtracted when the interface weights are set to 1:

 

Feb 17 17:19:28 jsrpd_ifd_msg_handler: Interface ge-6/0/1 is going down
Feb 17 17:19:28 ge-6/0/1 interface monitored by RG-1 changed state from Up to Down
Feb 17 17:19:28 intf failed, computed-weight -1
Feb 17 17:19:28 LED color changed from : Green to Amber, reason Monitored objects are down
Feb 17 17:19:28 Current threshold for rg-1 is 254. Failures: interface-monitoring
Feb 17 17:19:28 jsrpd_ifd_msg_handler: Interface ge-6/0/3 is going down
Feb 17 17:19:28 ge-6/0/3 interface monitored by RG-1 changed state from Up to Down
Feb 17 17:19:28 intf failed, computed-weight -2
Feb 17 17:19:28 Current threshold for rg-1 is 253. Failures: interface-monitoring
Feb 17 17:19:28 jsrpd_ifd_msg_handler: Interface ge-6/0/1 is going down
Feb 17 17:19:28 jsrpd_ifd_msg_handler: Interface ge-6/0/3 is going down
Feb 17 17:19:31 reth1 process
Feb 17 17:19:31 jsrpd_ifd_msg_handler: Interface reth1 is up
Feb 17 17:19:31 Unable to get RG-id and RG-state for reth ifd
Feb 17 17:19:31 Unable to update the reth1 ifd state (UNKNOWN_STATE) for RG-0

 

Contributor
hmehmood
Posts: 33
Registered: ‎08-26-2011

Re: RETH Interface Monitoring on LACP Bundle?

Hi Dark,

 

I may not have understood your question properly, but I think you are asking why the node1 priority did not drop to 0 when you failed both links on node1.

 

To achieve this, you will have to change the monitoring weights of ge-6/0/1 and ge-6/0/3 so that, when both interfaces are down, their combined weight reaches at least 255.

 

As for the priority shown in "show chassis cluster status": that value sets the priority of the nodes for each RG. For a failover to happen, the combined weight of the failed monitored objects has to reach 255. Please change the weight on the node1 interfaces to 128 or above and check again. I will also test this in my lab.
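
As a rough check on the "128 or above" figure, the Python sketch below works out the smallest equal weight at which losing every link on a node reaches the 255 threshold. The function names are hypothetical and the model is an assumption drawn from this thread (failed weights simply add up against 255), not a description of jsrpd internals.

```python
import math

INITIAL_THRESHOLD = 255  # redundancy-group threshold discussed in this thread


def min_weight_per_link(link_count):
    """Smallest equal weight so that losing all links reaches the threshold."""
    return math.ceil(INITIAL_THRESHOLD / link_count)


def fails_only_when_all_down(weight, link_count):
    """True if all links down triggers failover but one fewer does not."""
    all_down = weight * link_count >= INITIAL_THRESHOLD
    one_left = weight * (link_count - 1) >= INITIAL_THRESHOLD
    return all_down and not one_left


print(min_weight_per_link(2))            # two monitored links per node
print(fails_only_when_all_down(128, 2))  # suggested weight for node1
print(fails_only_when_all_down(1, 2))    # original node1 weight
```

With two monitored links per node, any weight from 128 up to 254 gives the "fail over only when both links are down" behavior under this model; the original weight of 1 never gets there.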

 

I hope this helps :)

 

Thanks,

Hassan

Contributor
dark1587
Posts: 79
Registered: ‎08-01-2008

Re: RETH Interface Monitoring on LACP Bundle?

Hello Hassan,

That appeared to work. Thanks for the information. Now the question I have is: why does the SRX behave that way? Is it because the node priority is not the same as the "initial threshold" described below?

 

A redundancy group is a collection of objects that fail over as a group. Each redundancy group monitors a set of objects (physical interfaces), and each monitored object is assigned a weight. Each redundancy group has an initial threshold of 255. When a monitored object fails, the weight of the object is subtracted from the threshold value of the redundancy group. When the threshold value reaches zero, the redundancy group fails over to the other node. As a result, all the objects associated with the redundancy group fail over as well. Graceful restart of the routing protocols enables the SRX Series device to minimize traffic disruption during a failover.
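
To make the arithmetic in that excerpt concrete, here is a minimal Python sketch of the threshold model, using the weights from the redundancy-group 1 configuration earlier in the thread. It is only an illustration, not how jsrpd is implemented: the function names are made up, and it assumes each node's link failures are counted against that node's own threshold of 255.

```python
INITIAL_THRESHOLD = 255  # "initial threshold" from the documentation excerpt

# Weights from the interface-monitor stanza posted earlier in the thread.
WEIGHTS = {
    "ge-2/0/1": 240,  # node0
    "ge-2/0/3": 240,  # node0
    "ge-6/0/1": 1,    # node1
    "ge-6/0/3": 1,    # node1
}


def remaining_threshold(failed_interfaces, weights=WEIGHTS):
    """Subtract the weight of every failed monitored interface from 255."""
    lost = sum(weights[name] for name in failed_interfaces)
    return INITIAL_THRESHOLD - lost


def should_fail_over(failed_interfaces, weights=WEIGHTS):
    """The group fails over once the threshold has dropped to zero or below."""
    return remaining_threshold(failed_interfaces, weights) <= 0


# node1: both links down, weight 1 each
print(remaining_threshold(["ge-6/0/1"]))              # 254
print(remaining_threshold(["ge-6/0/1", "ge-6/0/3"]))  # 253
# node0: one link down vs. both links down, weight 240 each
print(should_fail_over(["ge-2/0/1"]))                 # False
print(should_fail_over(["ge-2/0/1", "ge-2/0/3"]))     # True
```

The 254 and 253 values line up with the "Current threshold for rg-1" lines in the jsrpd log posted earlier, which is why the subtraction is measured against 255 in this sketch and not against the configured node priority.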

Contributor
hmehmood
Posts: 33
Registered: ‎08-26-2011

Re: RETH Interface Monitoring on LACP Bundle?

Good to hear that it worked for you. Let me try to explain that description.

 

When you configure a cluster, there are two things that should not be confused with each other:

1) Node priority for the RG

2) Object monitoring

 

What you did was set the priority of node1 to 2 and the monitoring weight of each interface to 1, assuming that each down interface would subtract 1 and that the RG would fail over to the other node once both were down.

 

The priority of 2 that you set for the RG is the node's priority, and it has nothing to do with monitored objects. You can set node0 to 3 or above to make it primary for the RG, but that will not have any effect on the monitored-object threshold.

 

For node priority you can use any values for the two nodes, e.g. 3 for node0 and 2 for node1, but when configuring monitored objects you should always keep the threshold value of 255 in mind.
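
One way to picture the two separate numbers is the small Python sketch below. It assumes, based only on what was observed in this thread, that the priority shown by "show chassis cluster status" stays at the configured value until the failed weights reach 255, and then shows 0. The function name is hypothetical.

```python
def displayed_priority(configured_priority, failed_weight_total,
                       initial_threshold=255):
    """Priority as it would appear in the status output under this model.

    The configured priority is not reduced by monitoring weights; it is
    replaced by 0 only once the failed weights reach the threshold.
    """
    if failed_weight_total >= initial_threshold:
        return 0
    return configured_priority


print(displayed_priority(254, 240))  # node0, one link down
print(displayed_priority(254, 480))  # node0, both links down
print(displayed_priority(2, 2))      # node1, both links down, weight 1 each
print(displayed_priority(2, 256))    # node1, both links down, weight 128 each
```

Under this model the node1 priority of 2 never changes with weights of 1, which matches the status output in the first post, and it goes to 0 once each node1 link carries a weight of 128.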

 

I hope this clears up the difference between node priority for an RG and object monitoring.

 

Regards,

Hassan

Copyright© 1999-2013 Juniper Networks, Inc. All rights reserved.