Screen OS


SSG140 | Huge udp packet losses

  • 1.  SSG140 | Huge udp packet losses

    Posted 12-19-2016 12:51

    Good afternoon,

     

    We are trying to solve a strange issue with a pair of SSG140s (6.3.0r23.0). We see a lot of UDP packet loss, which can reach 37% when traffic is around 1 (one) Mbps. For example, a test at 512 Kbps shows almost no drops (1%), while a test at 1 Mbps shows 37%.

     

    One interesting thing: it only affects outbound traffic, not inbound ...

     

    We tested this only on SSG140s (2 different pairs), and the behaviour is always the same. Any help?



  • 2.  RE: SSG140 | Huge udp packet losses

    Posted 12-19-2016 17:18

    I suspect the issue is more with PPS than with bandwidth. I've seen this situation when running tests on other platforms using traffic generators. The UDP streams have a very small packet size, so reaching higher bandwidth numbers tends to generate very high packet-per-second streams, and this PPS load overwhelms the device at much lower than expected bandwidth levels.
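
    To put rough numbers on the PPS argument, here is a minimal sketch (not a measurement from this thread) of how many packets per second a given bit rate needs at a fixed packet size. The packet sizes are illustrative assumptions, and Ethernet framing overhead is ignored.

```python
def packets_per_second(bits_per_second: float, packet_bytes: int) -> float:
    """Packets per second needed to carry a bit rate at a fixed packet size.

    packet_bytes is the size of each packet on the wire; preamble and
    inter-frame gap are ignored, so treat the result as an estimate.
    """
    return bits_per_second / (packet_bytes * 8)


if __name__ == "__main__":
    # Illustrative packet sizes (assumed, not taken from the thread).
    for size in (64, 512, 1500):
        rate = packets_per_second(1_000_000, size)
        print(f"{size:>5}-byte packets at 1 Mbps -> {rate:8.1f} pps")
```

    Even at 64-byte packets, 1 Mbps works out to only about two thousand packets per second.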



  • 3.  RE: SSG140 | Huge udp packet losses

    Posted 12-19-2016 23:57

    Good morning,

     

    OK, but I don't think that 1 Mbps should create any issues for an SSG140 ... I expect a bit more from that device ... Are my expectations incorrect?

     

    Just did pps "metering"

     

    PPS counting is enabled on interface(number 0) ethernet0/0
    59: 378 303

     

    PPS counting is enabled on interface(number 6) ethernet0/2
    59: 467 536

     

    From the Juniper SSG 140 spec:

    Firewall packets per second (64 byte): 90,000 PPS. That's 90 kPPS vs. my rate ... I think mine is far too low to be the limit 😞

    Also, it shows that the trust interface receives more traffic than untrust, i.e. the firewall discards some packets ... but I wasn't able to find any drops: neither screen nor policy ... nothing ...
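
    For a sense of scale, the comparison with the datasheet figure can be written out as below. This is only a sketch: it takes the largest number from the pps output above, without assuming what each column means, and compares it with the 90,000 PPS rating quoted from the spec.

```python
RATED_PPS = 90_000  # datasheet figure quoted above (64-byte packets)


def utilization(observed_pps: float, rated_pps: float = RATED_PPS) -> float:
    """Fraction of the rated packet rate that is actually in use."""
    return observed_pps / rated_pps


# Values copied from the pps output on ethernet0/0 and ethernet0/2.
observed = [378, 303, 467, 536]
peak = max(observed)

print(f"peak {peak} pps is {utilization(peak):.2%} of the rated {RATED_PPS} pps")
```

    The observed rate is well under one percent of the rated figure.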



  • 4.  RE: SSG140 | Huge udp packet losses

     
    Posted 12-20-2016 00:24

    Hi ,

    Traffic of 400-500 PPS is very low for an SSG-140. The interface traffic-handling capacity is unlikely to be causing this issue.

    Can you please answer the questions below, which would help us understand the issue better:

    + Have you made any change after which the issue appeared?
    + How do you suspect that FW is inducing the latency?
    + Can you please give us a brief idea about topology?
    + Have you tried to bypass the FW and run the test to see if there is an improvement in the throughput?
    + Do you face the same issue with the TCP traffic as well?
    + What are the CPU levels, the duplex settings on the incoming and outgoing interfaces, and the policy configuration on the FW?
    + Are you using any traffic shaping or UTM features?

    Regards
    Rishi Surana



  • 5.  RE: SSG140 | Huge udp packet losses

    Posted 12-20-2016 01:10

    Good morning,

     

    + Have you made any change after which the issue appeared?

     

    No, it's a new installation


    + How do you suspect that FW is inducing the latency?

     

    I don't see latency, but I do see packet drops. Look at the pps from the test above: incoming is bigger than what went out via eth0/0 (untrust).

     

    A test like VM1 -> ESX1 -> SWITCH -> ESX2 -> ESX2 shows no drops.

    The same test, but when it has to be routed via the SSG interface, i.e.

     

    VM1 -> ESX1 -> SWITCH -> SSG140 -> SWITCH -> ESX2 -> ESX2, shows the same exact amount of 37% drops.


    + Can you please give us a brief idea about topology?

     

    VM Machine -> ESX -> 1000/full -> 3750-24(stack) -> 100/full -> SSG140 -> 100/full -> ISP


    + Have you tried to bypass the FW and run the test to see if there is an improvement in the throughput?

     

    We have no issue when a laptop is connected to the uplink port ("ISP port").


    + Do you face the same issue with the TCP traffic as well?

     

    Nope, only UDP affected


    + What are the CPU levels, the duplex settings on the incoming and outgoing interfaces, and the policy configuration on the FW?

     

    PHX-CL:phx-fw1(M)-> get performance cpu
    Average System Utilization: 1%
    Last 1 minute: 2%, Last 5 minutes: 2%, Last 15 minutes: 2%

     

    Each interface configured with static 100/full/1500


    + Are you using any traffic shaping or UTM features?

     

    Nothing, just a policy.

     

     

    P.S. Just be aware that the drop rate for 1 Mbps traffic is always 37%. I saw 35% and 36% once each, but 99.9% of the time it's 37%.



  • 6.  RE: SSG140 | Huge udp packet losses

    Posted 12-20-2016 02:46
    PPS counting is enabled on interface(number 0) ethernet0/0
    59: 378 303
     
    PPS counting is enabled on interface(number 6) ethernet0/2
    59: 467 536

    Yes, clearly you are not dealing with PPS issues.

     

    Since the packets are dropped on the untrust interface, check your "screen" counters and see if these increment to match the drop conditions.  Your traffic may be detected as malicious and dropped by these protective measures.  If so, you will need to adjust the untrust screen settings.

     

    get interface ethernet0/0 screen



  • 7.  RE: SSG140 | Huge udp packet losses

    Posted 12-20-2016 02:55

    I checked all possible counters, and I don't see any of them increasing ...

     

    Also, I did ff with debug flow drop ... all the kinds of investigation I'm aware of ... it doesn't help and it doesn't explain where the packets go.

     

    I'm confused ...



  • 8.  RE: SSG140 | Huge udp packet losses

    Posted 12-20-2016 03:01

    So did the debug flow drop stream show that the packets exist but are not dropped?

     

    What does debug flow basic show for the session handling of your traffic stream?

     

    Can you share the debug files?



  • 9.  RE: SSG140 | Huge udp packet losses

    Posted 12-20-2016 03:10

    Fresh debug:

     

    PHX-CL:phx-fw1(M)-> get ff
    Flow filter based on:
    id:0 src ip 10.100.1.133 dst ip 216.218.227.10
    id:1 src ip 216.218.227.10 dst ip 10.100.1.133
    PHX-CL:phx-fw1(M)-> get debug
    flow: basic

     

    The file is attached. I can't see anything strange here.

     

    P.S. Personal opinion: the issue is related to MTU / fragmentation, but I can't find what exactly is wrong 😞




  • 10.  RE: SSG140 | Huge udp packet losses

    Posted 12-20-2016 03:30

    Correct, the debugs clearly show the traffic is permitted and assigned to the same flow.

     

    One thing I notice is that there seems to be a lot of fragmentation on the flow.  Can you confirm that the MTU settings all match end to end (if you haven't already)?

     

    If that checks out, you could try lowering the MSS to 1350.  I have occasionally seen problems with some ISPs where lowering the MSS prevents packet loss on flows, particularly when fragmentation is a factor.



  • 11.  RE: SSG140 | Huge udp packet losses

    Posted 12-20-2016 05:17

    Tried the MSS option, no effect; I got my "favorite" 37% of drops 😞



  • 12.  RE: SSG140 | Huge udp packet losses

     
    Posted 12-20-2016 05:45

    Hi,

     

    I can clearly see that there is fragmentation of the UDP traffic, which most probably is causing the issue. Reducing the MSS value only avoids fragmentation for TCP traffic and is not applicable to UDP traffic.

     

    While the issue is occurring, please provide the output of "get session frag" and packet captures on the end-to-end devices, along with snoop on the FW with debug flow basic.

     

    As we are generating 1 Mbps of traffic that is either fragmented on the device or received already fragmented, it might be choking the fragmentation queues on the device, hence causing the packet drops.

     

    You can try to reduce the payload of the UDP datagram at the application end, or you can try to configure Path MTU to avoid fragmentation:

     

    https://kb.juniper.net/InfoCenter/index?page=content&id=kb7049&actp=search
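
    As a rough guide to how small the UDP payload has to be, here is a minimal sketch. It assumes IPv4 with a 20-byte header and no options, and the 1500-byte MTU configured on the interfaces in this thread; the function names are made up for illustration.

```python
import math

IP_HEADER = 20   # IPv4 header without options (assumption)
UDP_HEADER = 8


def max_unfragmented_udp_payload(mtu: int = 1500) -> int:
    """Largest UDP payload that fits in a single IPv4 packet of size mtu."""
    return mtu - IP_HEADER - UDP_HEADER


def fragment_count(udp_payload: int, mtu: int = 1500) -> int:
    """Number of IPv4 fragments needed to carry one UDP datagram."""
    ip_payload = udp_payload + UDP_HEADER
    # Every fragment except the last carries a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(ip_payload / per_fragment)


if __name__ == "__main__":
    print("max payload without fragmentation:", max_unfragmented_udp_payload())
    for payload in (1472, 1473, 8192):
        print(f"{payload}-byte payload -> {fragment_count(payload)} fragment(s)")
```

    Under these assumptions, any UDP payload above 1472 bytes is fragmented on a 1500-byte MTU path.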

     

    This is all we can do on the firewall to improve performance.

    Regards,

    Rishi

     

     



  • 13.  RE: SSG140 | Huge udp packet losses

    Posted 12-20-2016 06:40

    OK, guys, I'm totally confused now:

     

    Take a look at the packet capture I did and at the snoop from the Juniper. Could it be that the Juniper just discards packets when traffic hits some rate?

     

    I do remember we had an issue with the ISG 1000 / 2000, where it was able to drop packets without any notification once the TCP timer expired (i.e. the firewall thought the session had expired) while the endpoint was trying to "re-establish" the connection.

    Attachment(s)

    udp.snoop.txt (413 KB)
    pcap.pcap.txt (1.29 MB)


  • 14.  RE: SSG140 | Huge udp packet losses
    Best Answer

    Posted 12-21-2016 01:33

    OK, gentlemen, thank you all for the help; issue found. It looks like it's all due to output queue drops on the switch itself.
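
    One way a modest frame drop rate on a switch could show up as a much larger UDP loss is fragmentation: a datagram is lost if any one of its fragments is lost. The sketch below assumes independent per-fragment loss and a test tool that counts whole datagrams. The fragment count of 6 is an assumption derived from the len=8200 UDP datagram in the snoop trace in this thread and a 1500-byte MTU. It is an illustration, not a confirmed analysis of this case.

```python
def datagram_loss(per_fragment_loss: float, fragments: int) -> float:
    """Probability that a datagram is lost when any fragment loss kills it."""
    return 1 - (1 - per_fragment_loss) ** fragments


def per_fragment_loss(datagram_loss_rate: float, fragments: int) -> float:
    """Per-fragment loss rate that would produce the given datagram loss."""
    return 1 - (1 - datagram_loss_rate) ** (1 / fragments)


if __name__ == "__main__":
    fragments = 6       # assumed: 8200-byte IP payload over a 1500-byte MTU
    observed = 0.37     # datagram loss reported in this thread
    p = per_fragment_loss(observed, fragments)
    print(f"{observed:.0%} datagram loss with {fragments} fragments "
          f"corresponds to about {p:.1%} per-fragment loss")
```

    Under these assumptions, roughly 7% frame loss would be enough to account for 37% datagram loss.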



  • 15.  RE: SSG140 | Huge udp packet losses

     
    Posted 12-21-2016 03:07

    Hi,

     

    It's good news that the issue got resolved. Please refer to my latest analysis and explanation as well.

     

    I request you to mark the solution as accepted or give Kudos 🙂

     

    Regards,

    Rishi



  • 16.  RE: SSG140 | Huge udp packet losses

    Posted 12-21-2016 03:22

    Thanks for the update on the root cause, and glad you have it working now.



  • 17.  RE: SSG140 | Huge udp packet losses

     
    Posted 12-21-2016 02:56

    Hi ,

     

    I tried to trace the UDP flow stream to investigate the drops; below are my observations:

     

    + I found that in the stream below the first fragment has not been received on the device, due to which the succeeding fragments are queued and, after waiting for 3 sec (default timeout), dropped, which is the root cause of this issue. This would increment fragments aged out in "get session frag".


    63052.0: ethernet0/2(i) len=1518:005056a2c0d0->0010dbff2060/8100/0800, tag 3
    207.38.68.135 -> 216.218.227.10/17
    vhl=45, tos=00, id=18645, frag=20b9, ttl=64 tlen=1500
    frag offset=1480 more fragment=1

    63052.0: ethernet0/2(i) len=1518:005056a2c0d0->0010dbff2060/8100/0800, tag 3
    207.38.68.135 -> 216.218.227.10/17
    vhl=45, tos=00, id=18645, frag=2172, ttl=64 tlen=1500
    frag offset=2960 more fragment=1

    63052.0: ethernet0/2(i) len=1518:005056a2c0d0->0010dbff2060/8100/0800, tag 3
    207.38.68.135 -> 216.218.227.10/17
    vhl=45, tos=00, id=18645, frag=222b, ttl=64 tlen=1500
    frag offset=4440 more fragment=1

    63052.0: ethernet0/2(i) len=1518:005056a2c0d0->0010dbff2060/8100/0800, tag 3
    207.38.68.135 -> 216.218.227.10/17
    vhl=45, tos=00, id=18645, frag=2000, ttl=64 tlen=1500
    frag offset=0 more fragment=1
    udp:ports 52661->5201, len=8200

    + I suspect that the first fragment is getting dropped either on the upstream device or on the FW interface; this still needs to be confirmed.

    + As I suggested, the end machine is sending segments of 7400 bytes, which is causing fragmentation over the network. To avoid this, enable PMTU on the end machine or tune the size of the segment at the application layer.

    + In the meantime, we can try to understand which device is dropping the traffic (first fragment).
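
    The frag= values in the snoop excerpt above can be checked by hand. The sketch below decodes the 16-bit IPv4 flags/fragment-offset field: the top three bits are flags, and the low 13 bits are the offset in 8-byte units. It also lists the offsets one would expect for the whole datagram, assuming the len=8200 shown in the trace is the UDP length (and therefore the IP payload size) and a 1500-byte MTU. The helper names are made up for illustration.

```python
def decode_frag(frag_hex: str) -> dict:
    """Decode the IPv4 flags/fragment-offset field given as a hex string."""
    value = int(frag_hex, 16)
    return {
        "dont_fragment": bool(value & 0x4000),
        "more_fragments": bool(value & 0x2000),
        "offset_bytes": (value & 0x1FFF) * 8,  # offset is in 8-byte units
    }


def expected_offsets(ip_payload_len: int, mtu: int = 1500, ip_header: int = 20) -> list:
    """Byte offsets of the fragments needed to carry ip_payload_len bytes."""
    step = (mtu - ip_header) // 8 * 8
    return list(range(0, ip_payload_len, step))


if __name__ == "__main__":
    # frag= values copied from the snoop excerpt, in the order they appear.
    for raw in ("20b9", "2172", "222b", "2000"):
        info = decode_frag(raw)
        print(raw, "offset", info["offset_bytes"], "more", info["more_fragments"])
    print("expected offsets:", expected_offsets(8200))
```

    The decoded offsets agree with the "frag offset" lines printed by snoop, which is a quick sanity check when reading longer traces.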

     

    Can you please provide the logs listed below, captured while the issue is occurring, for more analysis:

    + debug flow basic simultaneously with snoop
    + get session frag
    + get counter stat <5-10 times>

     

     

    Regards,
    Rishi