SRX 240H Clustering - Two Issues

1. SRX 240H Clustering - Two Issues

evt
Posted 10-31-2011 05:32

I've currently got a pair of 240Hs in a cluster, running 11.2R3.3, and I've got a couple of questions/issues. I'm using a downstream switch with a LAG to connect to one port on each of the clustered 240s. The connection works fine and traffic passes through as it should.

The first issue is when I disconnect the primary port: the traffic switches over to the backup, but ping times from the CLI go from a mere 2 ms to a steady, repeating countdown from about 20 ms to about 10 ms.

64 bytes from 172.16.10.254: icmp_seq=193 ttl=64 time=20.240 ms
64 bytes from 172.16.10.254: icmp_seq=194 ttl=64 time=19.264 ms
64 bytes from 172.16.10.254: icmp_seq=195 ttl=64 time=18.260 ms
64 bytes from 172.16.10.254: icmp_seq=196 ttl=64 time=17.249 ms
64 bytes from 172.16.10.254: icmp_seq=197 ttl=64 time=16.250 ms
64 bytes from 172.16.10.254: icmp_seq=198 ttl=64 time=15.311 ms
64 bytes from 172.16.10.254: icmp_seq=199 ttl=64 time=14.243 ms
64 bytes from 172.16.10.254: icmp_seq=200 ttl=64 time=13.236 ms
64 bytes from 172.16.10.254: icmp_seq=201 ttl=64 time=12.218 ms
64 bytes from 172.16.10.254: icmp_seq=202 ttl=64 time=11.199 ms
64 bytes from 172.16.10.254: icmp_seq=203 ttl=64 time=10.161 ms
64 bytes from 172.16.10.254: icmp_seq=204 ttl=64 time=19.260 ms
64 bytes from 172.16.10.254: icmp_seq=205 ttl=64 time=18.250 ms
64 bytes from 172.16.10.254: icmp_seq=206 ttl=64 time=17.244 ms
64 bytes from 172.16.10.254: icmp_seq=207 ttl=64 time=16.246 ms
64 bytes from 172.16.10.254: icmp_seq=208 ttl=64 time=15.244 ms
64 bytes from 172.16.10.254: icmp_seq=209 ttl=64 time=14.211 ms
64 bytes from 172.16.10.254: icmp_seq=210 ttl=64 time=13.220 ms
64 bytes from 172.16.10.254: icmp_seq=211 ttl=64 time=12.201 ms
64 bytes from 172.16.10.254: icmp_seq=212 ttl=64 time=11.210 ms
64 bytes from 172.16.10.254: icmp_seq=213 ttl=64 time=10.144 ms
64 bytes from 172.16.10.254: icmp_seq=214 ttl=64 time=9.191 ms
64 bytes from 172.16.10.254: icmp_seq=215 ttl=64 time=18.259 ms
64 bytes from 172.16.10.254: icmp_seq=216 ttl=64 time=17.248 ms
64 bytes from 172.16.10.254: icmp_seq=217 ttl=64 time=16.240 ms
64 bytes from 172.16.10.254: icmp_seq=218 ttl=64 time=15.248 ms
64 bytes from 172.16.10.254: icmp_seq=219 ttl=64 time=14.263 ms
64 bytes from 172.16.10.254: icmp_seq=220 ttl=64 time=13.243 ms
64 bytes from 172.16.10.254: icmp_seq=221 ttl=64 time=12.239 ms
64 bytes from 172.16.10.254: icmp_seq=222 ttl=64 time=11.217 ms
64 bytes from 172.16.10.254: icmp_seq=223 ttl=64 time=10.168 ms
64 bytes from 172.16.10.254: icmp_seq=224 ttl=64 time=9.217 ms

I'm not sure if this is normal behavior or not. The pattern is pretty obvious, but I have no idea why it's occurring.
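
To put numbers on that pattern, here is a minimal Python sketch that parses a slice of the ping output above (icmp_seq 193 to 215) and separates the gradual per-probe change from the upward jumps. The function names `parse_rtts` and `sawtooth_summary` are made up for this illustration, and the script only describes the pattern; it does not explain what causes it.

```python
import re

# Slice of the ping output posted above (icmp_seq 193-215).
PING_OUTPUT = """\
64 bytes from 172.16.10.254: icmp_seq=193 ttl=64 time=20.240 ms
64 bytes from 172.16.10.254: icmp_seq=194 ttl=64 time=19.264 ms
64 bytes from 172.16.10.254: icmp_seq=195 ttl=64 time=18.260 ms
64 bytes from 172.16.10.254: icmp_seq=196 ttl=64 time=17.249 ms
64 bytes from 172.16.10.254: icmp_seq=197 ttl=64 time=16.250 ms
64 bytes from 172.16.10.254: icmp_seq=198 ttl=64 time=15.311 ms
64 bytes from 172.16.10.254: icmp_seq=199 ttl=64 time=14.243 ms
64 bytes from 172.16.10.254: icmp_seq=200 ttl=64 time=13.236 ms
64 bytes from 172.16.10.254: icmp_seq=201 ttl=64 time=12.218 ms
64 bytes from 172.16.10.254: icmp_seq=202 ttl=64 time=11.199 ms
64 bytes from 172.16.10.254: icmp_seq=203 ttl=64 time=10.161 ms
64 bytes from 172.16.10.254: icmp_seq=204 ttl=64 time=19.260 ms
64 bytes from 172.16.10.254: icmp_seq=205 ttl=64 time=18.250 ms
64 bytes from 172.16.10.254: icmp_seq=206 ttl=64 time=17.244 ms
64 bytes from 172.16.10.254: icmp_seq=207 ttl=64 time=16.246 ms
64 bytes from 172.16.10.254: icmp_seq=208 ttl=64 time=15.244 ms
64 bytes from 172.16.10.254: icmp_seq=209 ttl=64 time=14.211 ms
64 bytes from 172.16.10.254: icmp_seq=210 ttl=64 time=13.220 ms
64 bytes from 172.16.10.254: icmp_seq=211 ttl=64 time=12.201 ms
64 bytes from 172.16.10.254: icmp_seq=212 ttl=64 time=11.210 ms
64 bytes from 172.16.10.254: icmp_seq=213 ttl=64 time=10.144 ms
64 bytes from 172.16.10.254: icmp_seq=214 ttl=64 time=9.191 ms
64 bytes from 172.16.10.254: icmp_seq=215 ttl=64 time=18.259 ms
"""

LINE_RE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def parse_rtts(text):
    """Return a list of (icmp_seq, rtt_ms) tuples found in ping output."""
    return [(int(seq), float(rtt)) for seq, rtt in LINE_RE.findall(text)]


def sawtooth_summary(samples):
    """Split successive RTT changes into gradual steps and upward jumps.

    Returns (mean_step_ms, jumps), where jumps is a list of
    (icmp_seq, delta_ms) for every probe whose RTT rose.
    """
    steps, jumps = [], []
    for (_, prev_rtt), (seq, rtt) in zip(samples, samples[1:]):
        delta = rtt - prev_rtt
        if delta > 0:
            jumps.append((seq, delta))
        else:
            steps.append(delta)
    mean_step = sum(steps) / len(steps) if steps else 0.0
    return mean_step, jumps


if __name__ == "__main__":
    mean_step, jumps = sawtooth_summary(parse_rtts(PING_OUTPUT))
    print(f"mean change per probe: {mean_step:.3f} ms")
    for seq, delta in jumps:
        print(f"jump at icmp_seq={seq}: +{delta:.3f} ms")
```

On this slice the RTT drops by roughly 1 ms per probe and jumps back up by about 9 ms at icmp_seq 204 and 215, i.e. every 11 probes.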

The second issue was failover back to the primary. When I plugged the primary back in, both ports were up, the LAG was 'up', but I could no longer ping the switch until I disabled or unplugged the backup port. I then read this:

http://www.juniper.net/techpubs/software/junos-security/junos-security10.2/junos-security-swconfig-security/topic-43686.html

Since I'm using a single switch, does this mean that I am forced to create two LAGs on that switch, each with a single interface in it, and connect up that way? Would the same apply to a server with dual Ethernet ports? I ask because that's going to turn my office into that scene from Scanners when I tell this to my server admins.
2. RE: SRX 240H Clustering - Two Issues
Best Answer

Erdem
Posted 10-31-2011 13:37

If there is only one link going to each node, you don't need a LAG on the switch at all, just two ports in the same VLAN.
The LAG is probably what causes these issues: if the switch sends traffic to the backup node, it disappears into a black hole.
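
A rough Python sketch of that black-hole effect, under stated assumptions: the switch picks a LAG member per flow from a hash, and only the port facing the primary node actually forwards traffic. The port names and the CRC32-based hash are hypothetical stand-ins; real switches hash on their own mix of header fields.

```python
import zlib

# Hypothetical switch port names, one per cluster node.
LAG_MEMBERS = ["port-to-node0", "port-to-node1"]
# Assumption: only the primary node (node0 here) forwards traffic.
ACTIVE_PORT = "port-to-node0"


def pick_member(src_ip, dst_ip, members):
    """Pick a LAG member link from a hash of the flow (illustrative hash only)."""
    key = f"{src_ip}-{dst_ip}".encode()
    return members[zlib.crc32(key) % len(members)]


def delivered(src_ip, dst_ip, members, active_port):
    """A flow gets through only if the switch hashed it onto the active port."""
    return pick_member(src_ip, dst_ip, members) == active_port


if __name__ == "__main__":
    # Made-up client addresses pinging the gateway from the thread.
    flows = [(f"172.16.10.{host}", "172.16.10.254") for host in range(1, 11)]
    for src, dst in flows:
        port = pick_member(src, dst, LAG_MEMBERS)
        status = "ok" if port == ACTIVE_PORT else "black-holed"
        print(f"{src} -> {dst}: {port} ({status})")
```

With both links up in one LAG, any flow hashed onto the backup node's port is lost. With only the active port in the list, every flow in this model is delivered, which matches the symptom of pings recovering once the backup port is unplugged.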

If you are connecting a server with two ports to that switch, the config you need really depends on the configuration of the server. Some servers can be configured for NIC failover, in which case there is no need for any special config on the switch. If the ports are configured for teaming on the server, you also need a LAG on the switch.
3. RE: SRX 240H Clustering - Two Issues

evt
Posted 11-01-2011 04:58

Thanks for the response.

Any ideas why the pings are acting so weird? There's definitely a distinct pattern there. I can't say I'm too concerned with it, as I know transit traffic is likely not affected (I'm pinging from the SRX RE), so it's more of a curiosity to me.
