Intermittent packet loss

  • 1.  Intermittent packet loss

    Posted 06-02-2010 05:13

    Hi everybody.

     

    First, a short introduction: I am Joakim, 29 years old from Sweden. I work as a systems architect at a medium-sized hosting company, and during the last few months, I've been in charge of replacing our Cisco routers with Juniper ones.

     

    We decided to go for J6350s, which have been set up as drop-in replacements for Cisco 3825s. In my opinion, the original design of the network is a bit of a mess, but I didn't really have the option of changing it.

     

    Anyway, on to my problem.

     

    As you can see in the attached network diagram, I have two J6350s, each peering via eBGP with one transit provider. iBGP sessions are set up between the J6350s, so that both routers can use routes from both eBGP peers. At present, only the transit peer to AS123 is up and running, which means the J6350 labeled a.b.42.2/23 in the diagram reaches the Internet via iBGP through a.b.42.3/23. The J6350s use VRRP to make the address labeled a.b.42.1 highly available to the firewalls, which use it as their default gateway. As you can also see, there is a lot of LACP/EtherChannel/trunking going on and, to top it off, an RSTP tree.

     

    Since I put this setup into production on Monday night, I've been having very strange problems with lost packets. At first, a few hours after the whole thing was put in place, the J6350 labeled a.b.42.2/23 started dropping packets going through a.b.42.1/23. I reconfigured to force a.b.42.1/23 over to the other node, a.b.42.3/23, and that made the problem disappear. We kept running with a.b.42.3/23 as the VRRP master, but then decided to revert the configuration, making a.b.42.2/23 master again. This revert was done some 15 hours after the first manual failover, and afterwards everything seemed to work OK, even with 900+ Mbps of IMIX going through a.b.42.1/23, via both a.b.42.2/23 and a.b.42.3/23. About six hours later, the same problem appeared again. Again, a manual failover of the VRRP address a.b.42.1/23 to the backup node a.b.42.3/23 solved the problem, and it has been working as expected since then, 28 hours and counting.

     

    I think I can rule out the switches, since traffic flows just fine through all of them up to the point where the packet loss starts, and keeps flowing just as well after the VRRP failover. I also think I can rule out the firewall, since the packet loss also appeared on traffic that doesn't pass through the firewall.

     

    I have been doing a lot of tracerouting, and can see that the routing is asymmetric, because traffic leaves the firewalls through a.b.42.1/23 and is returned from a.b.42.3/23, which for the moment is the only one of the two J6350s that has Internet connectivity. This should be fine, though, since the Cisco routers were set up the same way as I have set up my J6350s, and the firewalls aren't configured to disallow it. I have also pondered the possibility of MAC table errors, but I haven't seen anything that indicates such a problem, and can't find any static ARP entries anywhere either. I don't see anything suspicious in the Junos log files, and frankly, I am running out of ideas on how to proceed with troubleshooting this problem.

     

    I would very much appreciate any input on this question. Just ask if you need any clarification of the setup or want to see configuration files.

     

    Thanks a lot in advance.

     

    Regards,

    Joakim, Sweden



  • 2.  RE: Intermittent packet loss

    Posted 06-02-2010 08:37

    So if I understand correctly, you never see packet loss when the VRRP address is active on the 42.3 router, but you see occasional packet loss when the VRRP address is active on the 42.2 router.

     

    How do you experience the packet loss?  Do you ping to some place on the Internet and see pings drop?  How bad does it get?  Is there any possibility that you are, at times, saturating your ISP connection?  Are you performing rate limiting/policing anywhere?  Are you seeing any dropped packets or errors on any interfaces?



  • 3.  RE: Intermittent packet loss

    Posted 06-03-2010 01:44

    @B2 wrote:

    So if I understand correctly, you never see packet loss when the VRRP address is active on the 42.3 router, but you see occasional packet loss when the VRRP address is active on the 42.2 router.

     

    How do you experience the packet loss?  Do you ping to some place on the Internet and see pings drop?  How bad does it get?  Is there any possibility that you are, at times, saturating your ISP connection?  Are you performing rate limiting/policing anywhere?  Are you seeing any dropped packets or errors on any interfaces?


    Yes, that is correct. I only see packet loss when the VRRP address is active on the 42.2 router. I don't know if "occasional" is the right word, but yes, it appears out of nowhere, at what seem to be totally random times.

     

    I see both lost pings and lost packets when tracing with mtr(8), and also on all other traffic going through the router. When the problem occurs, I can't get traffic through properly from my network to the Internet, and my customers can't connect to their servers on my network. I can, however, connect directly to the router on 42.2 from my network.

     

    I experience about 50-70% packet loss. There is no rate limiting or policing anywhere in the network, and no interface errors on either the routers or the switches. The chances of it being caused by a saturated Internet connection are very slim, as I can peak at up to 1 Gbps on it, and saw only about 160 Mbps of traffic when the problem last appeared.



  • 4.  RE: Intermittent packet loss

    Posted 06-03-2010 04:03

    If possible, a nice reality check on the hardware would be to physically swap the positions of the two devices and see if the issue moves with the hardware.

     

    Check step by step through the physical interface path and cabling when the flows are now working.  Confirm that the patch cables sweep good and that all the interfaces are manually set to the same settings, especially that none have accidentally gone half-duplex.



  • 5.  RE: Intermittent packet loss

    Posted 06-03-2010 05:30

     


    @spuluka wrote:

    If possible, a nice reality check on the hardware would be to physically swap the positions of the two devices and see if the issue moves with the hardware.

     

    Check step by step through the physical interface path and cabling when the flows are now working.  Confirm that the patch cables sweep good and that all the interfaces are manually set to the same settings, especially that none have accidentally gone half-duplex.


     

    Yes, that would be nice, but I can't do that without downtime. I can't really afford any more downtime for a while, if you know what I mean... 😉

     

    I will check all interfaces and set the media types hard instead of using auto negotiation. All cables should be fine, but I'll check and clean all the fiber connectors too while I'm at it.
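
    For reference, a rough sketch of what checking and hard-setting one port could look like. It assumes a copper Gigabit port named ge-0/0/1 (the interface name used elsewhere in this thread); whether the speed and link-mode statements are accepted depends on the port type and Junos release, and the far end must be set to match.

    ```
    # Operational mode: inspect negotiated speed/duplex and error counters
    show interfaces ge-0/0/1 extensive | match "Speed|Link-mode|errors"

    # Configuration mode: pin the settings instead of auto-negotiating
    # (assumption: a copper port that accepts these statements)
    set interfaces ge-0/0/1 speed 1g
    set interfaces ge-0/0/1 link-mode full-duplex
    set interfaces ge-0/0/1 gigether-options no-auto-negotiation
    ```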

     

    I found something in the dmesg output of the routers, but I'm not sure when it's from, since there is no timestamp, so it may well be from when I was messing around with the config. I can't find the corresponding error in any log file either. It says: arp_update_iff_vrrp: IFF ge-0/0/1 doesn't have a vrrp group configured



  • 6.  RE: Intermittent packet loss

    Posted 06-03-2010 08:23

    Spuluka does have a good idea with swapping the routers, though the related downtime is a bummer.  What is your routing situation like?  Your one router with an Internet connection gets a full table then shares it with the other router via an iBGP session?  As far as you can tell has the routing table remained stable? 

     

    The message you got from dmesg is interesting. Could you post the VRRP-relevant portions of the router configs?



  • 7.  RE: Intermittent packet loss

    Posted 06-03-2010 09:25

     


    @B2 wrote:

    Spuluka does have a good idea with swapping the routers, though the related downtime is a bummer.  What is your routing situation like?  Your one router with an Internet connection gets a full table then shares it with the other router via an iBGP session?  As far as you can tell has the routing table remained stable? 

     

    The message you got from dmesg is interesting. Could you post the VRRP-relevant portions of the router configs?


     

    Yes, the router with IP a.b.42.3 gets a full table and shares it with the other router via iBGP. In about two weeks' time, I'll get a second global transit connection installed on the other router, which in turn will share its table with the first one over iBGP as well. As far as I know, there haven't been any flaps of the iBGP session, and according to my colleague the routing table on both routers looked fine at the time of the failure. I'm not sure what he means by "fine", though, and I never had a chance to take a look at it while the problem was present.

     

    The VRRP config is real simple.

     

    interfaces {
        ge-0/0/1 {
            unit 0 {
                family inet {
                    address a.b.42.2/23 {
                        vrrp-group 1 {
                            virtual-address a.b.42.1;
                            priority 200;
                            accept-data;
                        }
                    }   
                }       
            }           
        }               
    }
    interfaces {
        ge-0/0/1 {
            unit 0 {
                family inet {
                    address a.b.42.3/23 {
                        vrrp-group 1 {
                            virtual-address a.b.42.1;
                            priority 199;
                            accept-data;
                        }
                    }   
                }       
            }           
        }               
    } 

    In my efforts to solve the problem, I activated system arp passive-learning after the problem appeared the first time, but obviously, that didn't help.



  • 8.  RE: Intermittent packet loss

    Posted 06-14-2010 04:49

    I managed to recreate the problem today, by accident. I was playing around with the firewall filters on the router that is currently the VRRP backup (a.b.42.2), and managed to deny everything except ssh to it. This led to the same problems as I have had before, with up to 50% packet loss via the VRRP address (a.b.42.1). According to the log, there were no VRRP events, but the backup router of course lost its routing table, since the iBGP session went down. I can't wrap my brain around why this happened. The way I figure, packets leaving and entering the network through the eBGP peer on a.b.42.3 via the VRRP address a.b.42.1 shouldn't involve the backup router, a.b.42.2, at all, should they? I should be able to yank the backup router right out of the rack without any worries, shouldn't I?

     

    I also found another device doing VRRP with the same VRID, hence the same MAC address, as my J6350's, but on another VLAN. I don't know if that could mess things up for the switches, but I changed the VRID to a non-conflicting one.
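
    As an illustration of the VRID change described above: the VRRP virtual MAC address for IPv4 is derived from the VRID (00:00:5e:00:01:XX, where XX is the VRID in hex), so two groups with the same VRID share a MAC even on different VLANs. The sketch below moves the a.b.42.2 router from group 1 to group 42; the number 42 is only a placeholder for any VRID not used elsewhere on the switches, and the same change would be needed on a.b.42.3 with its own priority.

    ```
    # Remove the old group, then recreate it under a non-conflicting VRID
    delete interfaces ge-0/0/1 unit 0 family inet address a.b.42.2/23 vrrp-group 1
    set interfaces ge-0/0/1 unit 0 family inet address a.b.42.2/23 vrrp-group 42 virtual-address a.b.42.1
    set interfaces ge-0/0/1 unit 0 family inet address a.b.42.2/23 vrrp-group 42 priority 200
    set interfaces ge-0/0/1 unit 0 family inet address a.b.42.2/23 vrrp-group 42 accept-data
    ```

    With VRID 42 the virtual MAC would become 00:00:5e:00:01:2a, which hosts using a.b.42.1 as their gateway have to relearn.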



  • 9.  RE: Intermittent packet loss

    Posted 06-14-2010 17:54

    I definitely think you should change the VRID on one of the clusters.  Even with them in different VLANs, the same switch may end up seeing the same MAC from different areas and get confused.

     

    As for your main issue, obviously some traffic that passes only through the backup was forced onto the primary when your filter was in place, and this is the source of the issue.  So what are the items that would qualify?

     

    Take a close look at the direct connections to the backup.

     

    When the packet loss occurs do you have a traffic monitor that can give you the bandwidth at the time?

     

    With the backup reset to normal what active sessions do you see there?



  • 10.  RE: Intermittent packet loss

    Posted 06-22-2010 07:25

    Hello,

     

    I just had exactly the same problem.

    I found this known issue in Juniper's release notes:

     

    "On J Series Routers, asymmetric routing, such as tracing a route to a destination behind J Series devices with Virtual Router Redundancy Protocol (VRRP), does not work. [PR/237589]"

     

    "On SRX100, SRX210, SRX240, SRX650, and J series devices, Flow mode does not support asymmetric routing for stateful sessions. As a result of this behavior trace-route might not work when VRRP is configured across SRX devices."

     

    I have version 10.0R3.1 on my routers, but I have seen this issue on many other versions.

    http://www.juniper.net/techpubs/en_US/junos10.0/information-products/topic-collections/release-notes/10/j-series-srx-series-toc.html#j-series-srx-series-toc

     

    Actually, I haven't solved my problem yet because I can't make changes for now. The only thing that ended the packet loss was deactivating the address on the interface (disabling VRRP).

     

    Please update if you find anything more.

     

     

    Thanks.

    Aharon Prat

    aharonprat@gmail.com



  • 11.  RE: Intermittent packet loss

    Posted 07-04-2010 02:10

    I have the solution.

     

    Change the router from flow mode to packet mode and you are OK.

     

    The router opens a session for every packet that passes through it. Because the routing is asymmetric, the router only opens sessions without ever closing them, until the session table fills up.

     

    Aharon PRat



  • 12.  RE: Intermittent packet loss

    Posted 07-23-2010 05:15

    Thanks, Aharon!

     

    That sounds like a very reasonable explanation. I will try and activate packet mode as soon as possible.

     

    EDIT: What would be the appropriate way to enable packet-based mode for IPv4? I would have thought to use security forwarding-options family inet packet-based, but there is no family inet option under there. Do I have to set up firewall filters to do it?

     

    /Joakim



  • 13.  RE: Intermittent packet loss
    Best Answer

    Posted 07-26-2010 00:33

    hey,

     

    You need to do the following:

     

     

    delete security policies

    set security screen ids-option untrust-screen icmp ping-death
    set security screen ids-option untrust-screen ip source-route-option
    set security screen ids-option untrust-screen ip tear-drop
    set security screen ids-option untrust-screen tcp syn-flood alarm-threshold 1024
    set security screen ids-option untrust-screen tcp syn-flood attack-threshold 200
    set security screen ids-option untrust-screen tcp syn-flood source-threshold 1024
    set security screen ids-option untrust-screen tcp syn-flood destination-threshold 2048
    set security screen ids-option untrust-screen tcp syn-flood queue-size 2000
    set security screen ids-option untrust-screen tcp syn-flood timeout 20
    set security screen ids-option untrust-screen tcp land
    set security zones security-zone trust tcp-rst
    set security zones security-zone trust host-inbound-traffic system-services any-service
    set security zones security-zone trust host-inbound-traffic protocols all
    set security zones security-zone trust interfaces all
    set security zones security-zone untrust screen untrust-screen
    set security alg dns disable
    set security alg ftp disable
    set security alg h323 disable
    set security alg mgcp disable
    set security alg msrpc disable
    set security alg sunrpc disable
    set security alg real disable
    set security alg rsh disable
    set security alg rtsp disable
    set security alg sccp disable
    set security alg sip disable
    set security alg sql disable
    set security alg talk disable
    set security alg tftp disable
    set security alg pptp disable
    set security forwarding-options family inet6 mode packet-based
    set security forwarding-options family mpls mode packet-based
    set security forwarding-options family iso mode packet-based
    set security flow allow-dns-reply
    set security flow tcp-session no-syn-check
    set security flow tcp-session no-syn-check-in-tunnel
    set security flow tcp-session no-sequence-check

     

    Then you should create a firewall filter:

     

    set firewall filter Packet-Mode term 1 then packet-mode

     

    now you have to apply it on the interfaces.
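
    A minimal sketch of applying the filter, assuming the VRRP-facing interface ge-0/0/1 unit 0 from earlier in the thread; every other interface that forwards traffic would need the same line. The explicit accept action and the input direction are assumptions based on how selective packet mode is usually configured, not something stated in this post.

    ```
    # Filter: mark all matching traffic for packet-mode (stateless) forwarding
    set firewall filter Packet-Mode term 1 then packet-mode
    set firewall filter Packet-Mode term 1 then accept

    # Apply the filter to traffic arriving on the interface
    set interfaces ge-0/0/1 unit 0 family inet filter input Packet-Mode
    ```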

     

    After the change, check that there are no sessions on the router:

     

    show security flow session summary 


    good luck

    Aharon Prat

    aharonprat@gmail.com

     



  • 14.  RE: Intermittent packet loss

    Posted 07-28-2010 05:19

    Cool, I will try that. Thanks.



  • 15.  RE: Intermittent packet loss

    Posted 09-10-2010 07:30

    Do I need to apply the filter to both input and output on the interfaces?

     

    I will roll out the change early Monday morning. I am keeping every finger I have crossed that it solves my problem...



  • 16.  RE: Intermittent packet loss

    Posted 05-18-2012 08:36

    AharonPrat


    I can see that the device is being used as a router, i.e. in packet mode.

     

    # set security forwarding-options family inet6 mode packet-based
    # set security forwarding-options family mpls mode packet-based
    # set security forwarding-options family iso mode packet-based
    [ a reboot after enabling this ]

    Now, once you have configured the above, the rest of the security config has no effect. So you must delete all the config under the security hierarchy except the above 3 commands.
    After using the above 3 commands, you do not need to create a firewall filter for packet mode; again, it has no effect.
    A firewall filter can be used for selective packet mode on J Series and branch SRX devices.

    If you want to use security features like NAT, IPsec, screens, etc., then do not use the above 3 commands.
    Configure the device with a single zone and a default policy, disabling all ALGs and using the security flow commands proposed by AharonPrat.
    If needed, you can go for selective packet mode as well.
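
    Put together, the router-only approach described in this post could look roughly like the sketch below. It assumes nothing under the security hierarchy needs to be kept (no NAT, IPsec or screens), so back up the configuration first.

    ```
    # Configuration mode: wipe the security hierarchy, keep only the mode statements
    delete security
    set security forwarding-options family inet6 mode packet-based
    set security forwarding-options family mpls mode packet-based
    set security forwarding-options family iso mode packet-based
    commit

    # Operational mode: the mode change takes effect after a reboot
    request system reboot
    ```

    After the reboot, show security flow session summary should no longer show sessions building up for transit traffic.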