Hey everyone,
I'm seeing some odd behavior on an SRX1400 cluster running 11.4R1. I'm labbing a setup for a customer that uses LACP for link redundancy, and I noticed the oddities when checking the chassis cluster status. I put ge-2/0/1, ge-2/0/3, ge-6/0/1, and ge-6/0/3 into reth1 on this cluster and configured LACP per Juniper's documentation:
ge-2/0/1 {
    description "Trunk to EX3200 Core - reth1";
    gigether-options {
        redundant-parent reth1;
    }
}
ge-2/0/3 {
    description "Trunk to EX3200 Core - reth1";
    gigether-options {
        redundant-parent reth1;
    }
}
ge-6/0/1 {
    description "Trunk to EX3200 Core - reth1";
    gigether-options {
        redundant-parent reth1;
    }
}
ge-6/0/3 {
    description "Trunk to EX3200 Core - reth1";
    gigether-options {
        redundant-parent reth1;
    }
}
reth1 {
    description "Trunk to EX3200 Core - memebers ge-2/0/1 and ge-6/0/1 ";
    vlan-tagging;
    redundant-ether-options {
        redundancy-group 1;
        lacp {
            passive;
        }
    }
    unit 10 {
        vlan-id 10;
        family inet {
            filter {
                input ZscalerRedirect;
            }
            address a.a.a.a/a;
        }
    }
    unit 20 {
        vlan-id 20;
        family inet {
            address b.b.b.b/b;
        }
    }
    unit 30 {
        vlan-id 30;
        family inet {
            address c.c.c.c/c;
        }
    }
    unit 40 {
        vlan-id 40;
        family inet {
            address d.d.d.d/d;
        }
    }
}
For interface monitoring, I set up redundancy group 1 with the following snippet. The behavior I want is for both interfaces on node0 to fail before the group fails over to node1. So I set the priority to 254 on node0 and 2 on node1, and I expect each lost link to subtract 240 from the priority on node0 and 1 from the priority on node1.
redundancy-group 1 {
    node 0 priority 254;
    node 1 priority 2;
    preempt;
    interface-monitor {
        ge-2/0/1 weight 240;
        ge-2/0/3 weight 240;
        ge-6/0/1 weight 1;
        ge-6/0/3 weight 1;
    }
}
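To spell out the arithmetic I'm expecting, here is a quick Python sketch. It only models my own assumption (each failed link's weight is subtracted from the node's configured priority, floored at 0); it is not confirmed Junos/jsrpd behavior, and the function and variable names are made up for illustration.

```python
# ASSUMPTION: each failed monitored link subtracts its weight from the
# node's configured priority, floored at 0. This is the logic I had in
# mind, not necessarily what jsrpd actually computes.

def expected_priority(configured, weights, failed_links):
    """Return the priority I expect to see after the given links fail."""
    lost = sum(weights[link] for link in failed_links)
    return max(configured - lost, 0)

# Weights copied from the interface-monitor stanza above.
NODE0_WEIGHTS = {"ge-2/0/1": 240, "ge-2/0/3": 240}
NODE1_WEIGHTS = {"ge-6/0/1": 1, "ge-6/0/3": 1}

print(expected_priority(254, NODE0_WEIGHTS, ["ge-2/0/1"]))              # 14
print(expected_priority(254, NODE0_WEIGHTS, ["ge-2/0/1", "ge-2/0/3"]))  # 0
print(expected_priority(2, NODE1_WEIGHTS, ["ge-6/0/1"]))                # 1
print(expected_priority(2, NODE1_WEIGHTS, ["ge-6/0/1", "ge-6/0/3"]))    # 0
```

In other words, with one node0 link down I'd expect node0 to show 14 (still higher than node1's 2, so no failover), and with both links down on either node I'd expect that node to show 0.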
Now here's the weird part. When I simulate a single link failure on node0 and run "show chassis cluster status", the priority for node0 never changes. Node0 stays at priority 254, even though the jsrpd log shows the subtraction. If I fail both links, node0's priority drops to 0 and the cluster fails over (as expected, I might add).
Even worse, when I fail both links on node1, its priority never drops to 0 at all. See below:
chaynes@cc5813-srx1400-pri> show interfaces ge-6* terse
Interface               Admin Link Proto    Local                 Remote
ge-6/0/1                up    down
ge-6/0/1.10             up    down aenet    --> reth1.10
ge-6/0/1.20             up    down aenet    --> reth1.20
ge-6/0/1.30             up    down aenet    --> reth1.30
ge-6/0/1.40             up    down aenet    --> reth1.40
ge-6/0/1.32767          up    down aenet    --> reth1.32767
ge-6/0/3                up    down
ge-6/0/3.10             up    down aenet    --> reth1.10
ge-6/0/3.20             up    down aenet    --> reth1.20
ge-6/0/3.30             up    down aenet    --> reth1.30
ge-6/0/3.40             up    down aenet    --> reth1.40
ge-6/0/3.32767          up    down aenet    --> reth1.32767
{primary:node0}
chaynes@cc5813-srx1400-pri> show chassis cluster status
Cluster ID: 1
Node        Priority    Status       Preempt    Manual failover
Redundancy group: 0 , Failover count: 1
    node0   129         primary      no         no
    node1   128         secondary    no         no
Redundancy group: 1 , Failover count: 11
    node0   254         primary      yes        no
    node1   2           secondary    yes        no
Has anyone seen this before, or is it worth sending up to JTAC to see what they say? Or am I missing something in my logic?