We have a WXC590 (site1) connected to a WXC ISM200 (site2) over a 256k/500ms satellite connection. The server at site1 establishes a SQL session to the server at site2 (on TCP 1433). After 2-3 minutes the site1 server unexpectedly sends a TCP reset to close the session. This pattern repeats continually.
In a packet capture taken at the site2 server, the last three packets are:
server-site1 -> server-site2 TCP-Keepalive
server-site2 -> server-site1 TCP-Keepalive ACK
server-site1 -> server-site2 RST
If we turn off TCP acceleration (which also forces us to turn off NSC), everything works as expected (no resets). Do the WXCs support TCP Acceleration with MSSQL? If so, is there a special trick to get it working?
We use WXOS 5.6.4. Also, I have checked that the WXC forwarding class on the routers is not dropping packets.
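The three-packet pattern described above (keepalive, keepalive ACK, then RST from the side that sent the keepalive) can be checked for mechanically once a capture has been exported to simple records. This is only a rough sketch: the `Packet` record, its field names, and the sample timestamps are hypothetical illustrations, not output from a WX device or from any particular capture tool.

```python
from typing import List, NamedTuple, Optional


class Packet(NamedTuple):
    # Hypothetical record layout; adapt it to whatever your capture export gives you.
    time: float  # seconds since the start of the capture
    src: str
    dst: str
    kind: str    # e.g. "DATA", "KEEPALIVE", "KEEPALIVE_ACK", "RST"


def rst_after_keepalive(packets: List[Packet]) -> Optional[Packet]:
    """Return the first RST that directly follows a keepalive / keepalive-ACK
    exchange and comes from the host that sent the keepalive, else None."""
    for i in range(2, len(packets)):
        probe, ack, rst = packets[i - 2], packets[i - 1], packets[i]
        if (
            probe.kind == "KEEPALIVE"
            and ack.kind == "KEEPALIVE_ACK"
            and rst.kind == "RST"
            and ack.src == probe.dst and ack.dst == probe.src  # ACK answers the probe
            and rst.src == probe.src and rst.dst == probe.dst  # RST from the prober
        ):
            return rst
    return None


# Illustrative sample only; the timestamps are made up.
sample = [
    Packet(0.0, "server-site1", "server-site2", "DATA"),
    Packet(150.0, "server-site1", "server-site2", "KEEPALIVE"),
    Packet(150.5, "server-site2", "server-site1", "KEEPALIVE_ACK"),
    Packet(150.6, "server-site1", "server-site2", "RST"),
]

hit = rst_after_keepalive(sample)
if hit is not None:
    print(f"RST from {hit.src} at t={hit.time}s right after a keepalive exchange")
```

Running the same check against every reset in a longer capture would show whether each one follows this keepalive pattern or whether some resets have a different lead-up.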
Yes - the WXC units support TCP acceleration of MSSQL traffic, and no, there aren't any special tricks to get it working. This is strange behavior. Did you open a JTAC case on this? I am assuming that you did all the obvious things, like looking in the logs for error messages.
You would need packet captures at four points to identify which device is generating the RST packet. From the client's perspective, it could be either of the WXes, the underlying network equipment, or the originating server.
If you identify that it is the WX, the next things to look at are the log files from that time and the flow diagnostics. For example, if you see tunnel drops at the time of the error, the following might be happening: with TCP acceleration enabled, a tunnel bounce will break the TCP session, because AFP breaks and the sequence numbers were local between the WX and each end of the TCP session, so the session has to be reestablished directly. I hope this was not too confusing; let me know if you need more detail. You can verify in the flow diagnostics exactly what happens with the flow.
This is just one possible cause. There could be a number of other reasons, but it gives you an idea of what to look at and where.
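The multi-point capture idea can be sketched as follows: list the capture points in path order for the direction the RST travels, and find the first one that recorded it. The RST must have been generated at or before that point. This is a minimal illustration; the capture-point names and the `seen` mapping are assumptions, and because sequence numbers are local to each side under TCP acceleration, deciding whether a given capture "saw" the RST would in practice have to rely on the address/port tuple and timing rather than on sequence numbers.

```python
from typing import Dict, List, Optional

# Capture points in path order for traffic flowing from site1 to site2.
# These names are illustrative labels, not device hostnames.
PATH = ["server-site1", "wxc-site1", "wxc-site2", "server-site2"]


def first_point_seeing_rst(
    seen: Dict[str, bool], path: List[str] = PATH
) -> Optional[str]:
    """Return the first capture point, in path order, that recorded the RST.

    `seen` maps a capture point name to True if the RST appears in that
    capture. Points missing from `seen` are treated as "not captured".
    """
    for point in path:
        if seen.get(point, False):
            return point
    return None


# Example: the RST is visible in every capture that was taken, including
# the one on the originating server itself.
observations = {"server-site1": True, "wxc-site1": True, "server-site2": True}
print(first_point_seeing_rst(observations))
```

If the first point that saw the RST is the server itself, the server generated it. If the server-side capture is clean and the RST first appears at a WX or later, the device at or just before that point is the one to examine.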
Thanks AMS-TAC for your suggestions. My packet captures were done at 3 points: wxc-site1, server-site1 and server-site2. The resets were originated by server-site1. I don't see the WX tunnel drop at any point. As the only affected traffic was SQL, I turned off acceleration/NSC just for SQL - and the SQL TCP resets stopped.
I was wondering what else was "special" about SQL. The JUNOS-ES /etc/config/jsr-series-routermode-factory.conf file hints that by default, the SQL ALG is enabled. I turned it off with 'set security alg sql disable' and (re)enabled SQL acceleration/NSC. So far the resets have not reappeared.
So it would appear that the SQL ALG interferes with SQL TCP acceleration. It would be useful to understand the interplay of ALGs with the WXC to diagnose this further. Do you know of any show commands or traceoptions that would help here?
I think you have enough data for a JTAC case now. I would start with a J-series case and check with them whether the ALG is doing something that breaks the WX's acceleration. Sorry, but I don't have many details to give you on J-series troubleshooting; maybe you could try their JNet forums or go ahead with the case.