Application Acceleration

3680 SLB SLB Groups stuck in Discover Mode & Half Nat being changed to Full Nat by LB

‎05-20-2010 08:56 AM

Hi All

 

I'm hoping I can get some help on this one. Recently a colleague of mine tried to upgrade a 3680 cluster to 5.3.9. This all seemed to go OK: the code was loaded to the alternate partition, the boot partition was set to point at the new code, and the unit was reloaded immediately afterwards so that the previously running config settings were applied to the new config. However, the SLB groups all showed as being in Discover mode.

 

We had problems with the cluster recently and had to power down one of the units, leaving the other running standalone. A couple of days before the code upgrade I temporarily restored the cluster and brought the secondary box back online so that I could reload the primary, which was exhibiting the 'Overly Idle, Logging Out' message and would not allow any management connections to it. Once we had reloaded it and restored the cluster, I again powered down the secondary unit (LB2), returning the environment to standalone.

 

When we looked at the audit log, we could see that LB2 was changing the NAT modes from Half to Full and back again on the box being upgraded, as can be seen below. I'm not sure whether this has anything to do with the Discover mode problem, as full NAT only changes the source of all traffic to the egress interface of the DX facing the listening servers; and even if it did, the groups were returned to half NAT, so this should have cancelled out any problems. The first box being upgraded (LB2) was the previously powered-down unit, in a standalone setup with no connections to the network until we powered down the previously active LB1. For some reason most of the SLBs on LB1 have now changed to full NAT. Any ideas on why this would happen?

 

We have a JTAC case open for the issue and have a really good guy working on it, but the customer is very demanding, so I thought I'd tap into this 'think tank' in parallel with his efforts to try to find an answer ASAP.

 

Has anyone ever seen this scenario, where the SLBs do not come out of Discover mode even though the servers behind them are listening on the correct ports?

 

Any assistance or ideas are very much appreciated, guys.

 

Mooey

 

 

3 REPLIES

Re: 3680 SLB SLB Groups stuck in Discover Mode & Half Nat being changed to Full Nat by LB

‎05-25-2010 09:51 AM

 

I do not see the log entry showing NAT changing from half NAT to full NAT in your post. Settings made using the CLI are committed by issuing a 'write' statement, so if that was not done the setting would revert after a reload.

The entry in the audit log where it states the NAT was changed should log the time, IP, user and method (CLI or WebUI):

 

 [2010-05-25 18:39:13 (-200)] 172.26.200.4 [admin] [CLI] set slb group FTP_test nat half

 

If the CLI was used the 'write' command should also be logged:

 

 [2010-05-25 18:43:15 (-200)] 172.26.200.4 [admin] [CLI] write
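To check an exported audit log for this pattern, something like the following Python sketch could help. It assumes every entry follows the exact layout of the two sample lines above; the function names and the `followed_by_write` field are made up for illustration and are not part of any DX tooling. It lists each NAT change and whether a later 'write' from the same user and IP appears in the log.

```python
import re

# Layout assumed from the two sample audit entries quoted above:
# [timestamp (offset)] ip [user] [method] command
AUDIT_LINE = re.compile(
    r"\[(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \((?P<offset>[+-]?\d+)\)\]\s+"
    r"(?P<ip>\S+)\s+\[(?P<user>[^\]]+)\]\s+\[(?P<method>[^\]]+)\]\s+"
    r"(?P<command>.+?)\s*$"
)
NAT_CMD = re.compile(r"set slb group (?P<group>\S+) nat (?P<mode>\S+)$")


def parse_audit_line(line):
    """Return the fields of one audit entry, or None if the line does not match."""
    match = AUDIT_LINE.search(line)
    return match.groupdict() if match else None


def nat_changes(lines):
    """List NAT mode changes and whether a later 'write' by the same user/IP exists."""
    entries = [e for e in map(parse_audit_line, lines) if e]
    changes = []
    for i, entry in enumerate(entries):
        nat = NAT_CMD.match(entry["command"])
        if not nat:
            continue
        followed_by_write = any(
            later["command"] == "write"
            and later["ip"] == entry["ip"]
            and later["user"] == entry["user"]
            for later in entries[i + 1:]
        )
        changes.append({**entry, **nat.groupdict(),
                        "followed_by_write": followed_by_write})
    return changes


if __name__ == "__main__":
    sample = [
        " [2010-05-25 18:39:13 (-200)] 172.26.200.4 [admin] [CLI] set slb group FTP_test nat half",
        " [2010-05-25 18:43:15 (-200)] 172.26.200.4 [admin] [CLI] write",
    ]
    for change in nat_changes(sample):
        print(change["time"], change["user"], change["method"],
              change["group"], change["mode"], change["followed_by_write"])
```

A change with no following 'write' would be a candidate for a setting that reverted on reload; a change logged from an unexpected IP or user would point at the peer unit or a config sync instead.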

 

Another possible way for the setting to change is if a config sync was done.

 

As for the SLB Discover mode, I am not aware of that state. Where do you see this reported? Is it in the system log? The target servers are either up, down or paused; do they show Discovery?

 

 

There is a Discovery mode for Unified Failover:

 

 

% show failover discovery
Discovery Interface: ether1
Discovery Port: 9401

 

Is the problem that you are using UFO failover and the peer unit is not being detected?


Re: 3680 SLB SLB Groups stuck in Discover Mode & Half Nat being changed to Full Nat by LB

‎05-25-2010 09:57 AM

Mooey:

 

What do you mean by the SLB groups being in "Discover Mode"? Can you give us a log entry or a screenshot showing this? Discovery mode is a failover function, not a VIP function. What are you using for failover: SLB failover or Unified Failover?

 

Regards,

-Michael


Re: 3680 SLB SLB Groups stuck in Discover Mode & Half Nat being changed to Full Nat by LB

‎05-31-2010 02:58 AM

I have managed to get the DX into a condition where the WebUI shows 'Discover' under the Service Status column for all the services.

 

This happens if I have failover enabled, link failover enabled for the failover interface, and that interface set to one with no connection, so the DX cannot contact its failover peer. If I check on the CLI I see that the DX is 'idle' and does not become active. This is because it does not know whether the other DX is active, and if it did go active it could conflict with the peer. This behaviour was introduced in 5.3.3 and is in the Release Notes as:

 


Failover

When the discovery interface link fails, the DX is now brought into “idle” mode where it shuts off from all traffic. This works only when link Failover is enabled for that interface. (Id: 11320)

Failure of the discovery interface link does not lead to multiple Masters in the setup anymore.

 

 

 

% show server status
Server: down (loaded config: up) (failover: Idle)

 

% show slb status
SLB: down (loaded config: up) (failover: Idle)

 

% show failover status
Failover (config file status): enabled
Unified Failover statistics (uptime = 491507 secs).
Self:9501 NodeID=168428290 Status=Idle (ether3 link is down)
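For anyone wanting to check several units for this state, here is a rough Python sketch that reads pasted CLI output and reports why a unit is Idle. It assumes the output looks exactly like the three samples above; the function names are invented for illustration, and other software versions may format these lines differently.

```python
import re

# Patterns assumed from the sample output above, not from any documented format.
SERVICE_LINE = re.compile(
    r"^(?P<service>Server|SLB): (?P<state>\w+) "
    r"\(loaded config: (?P<config>\w+)\) \(failover: (?P<failover>\w+)\)"
)
SELF_LINE = re.compile(
    r"^Self:\S+ NodeID=(?P<node>\d+) Status=(?P<status>\w+)"
    r"(?: \((?P<reason>[^)]*)\))?"
)


def held_down_by_failover(line):
    """True if a service is down although its config says up and failover is Idle."""
    match = SERVICE_LINE.match(line.strip())
    return bool(
        match
        and match["state"] == "down"
        and match["config"] == "up"
        and match["failover"] == "Idle"
    )


def idle_reason(failover_status_output):
    """Return the reason shown on the 'Self:' line when Status=Idle, else None."""
    for line in failover_status_output.splitlines():
        match = SELF_LINE.match(line.strip())
        if match and match["status"] == "Idle":
            return match["reason"] or "no reason reported"
    return None


if __name__ == "__main__":
    status = """Failover (config file status): enabled
Unified Failover statistics (uptime = 491507 secs).
Self:9501 NodeID=168428290 Status=Idle (ether3 link is down)"""
    print(held_down_by_failover("SLB: down (loaded config: up) (failover: Idle)"))
    print(idle_reason(status))
```

If the services are configured up but held down with failover Idle, and the reason names a downed link, the discovery interface is the first thing to check.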

 

 

 

To recover, I disabled failover with 'set failover disabled'; the services then displayed 'stopped' in the WebUI dashboard. I then ran 'set server down', 'write', 'set server up' and 'write', and the services started.