Switching

Ask questions and share experiences about EX and QFX portfolios and all switching solutions across your data center, campus, and branch locations.

random high latency issues on ex2300 series switches

  • 1.  random high latency issues on ex2300 series switches

    Posted 12-20-2018 11:50

    Wondering if anyone else is seeing this.

     

    I randomly run into really high latency when pinging/monitoring the management IP address of some of my EX2300 series switches. Latency of 2000+ ms, or just plain timeouts, will happen, and I start getting alerts from my SNMP monitoring system. It doesn't seem to impact the devices plugged into the switches, just the management. Sometimes it gets so bad that I can't SSH into them. The logs never really show anything that points to the problem.

     

    Any ideas? 


    #latency
    #2300
    #EX


  • 2.  RE: random high latency issues on ex2300 series switches

    Posted 12-20-2018 12:07
    The EX2300 does not have a very powerful routing engine (RE) CPU, and I have seen that when the device is monitored aggressively, RE utilization gets high and ICMP replies can be delayed.

    I would check whether your device has high utilization during these events and, if so, which process is using the resources.

    https://kb.juniper.net/InfoCenter/index?page=content&id=KB26261

    (on mobile, please excuse any misspelling)


  • 3.  RE: random high latency issues on ex2300 series switches

    Posted 12-26-2018 04:00

    Thanks for the response, that is one thing I'm trying to keep an eye on.

    Do you know of a way to get a more continuous update on resource usage, like top?



  • 4.  RE: random high latency issues on ex2300 series switches

    Posted 12-26-2018 15:03

    The Junos command from KB26261 listed above essentially runs top from inside Junos. It will help you identify what is using the CPU during incidents.

     

    show system processes extensive

     

    https://kb.juniper.net/InfoCenter/index?page=content&id=KB26261
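
As a rough illustration of how to read that output during an incident, the sketch below filters a saved capture of top-style process rows down to the processes that are actually using CPU, busiest first. It is written in Python and assumes only that each process row contains a CPU field ending in `%` and ends with the command name. The sample rows and process names are made up for the example and are not real switch output.

```python
import re

# Matches a CPU percentage token such as "12.50%".
CPU_FIELD = re.compile(r"^(\d+(?:\.\d+)?)%$")


def busy_processes(output, min_cpu=0.01):
    """Return (command, cpu_percent) pairs at or above min_cpu, busiest first.

    Assumes top-style rows: whitespace-separated fields, at least one of
    which is a CPU share ending in '%', with the command name as the last
    field. If several '%' fields exist, the last one before the command wins.
    """
    found = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        cpu = None
        for field in fields[:-1]:
            match = CPU_FIELD.match(field)
            if match:
                cpu = float(match.group(1))
        if cpu is not None and cpu >= min_cpu:
            found.append((fields[-1], cpu))
    return sorted(found, key=lambda item: item[1], reverse=True)


# Made-up sample rows for illustration only (not captured from a switch).
SAMPLE = """\
  PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
 1001 root        1  40    0   100M    20M RUN      5:00 45.00% procA
 1002 root        1  20    0    50M    10M select   0:10  0.00% procB
 1003 root        2  20    0    80M    15M select   1:30  3.20% procC
"""

if __name__ == "__main__":
    for command, cpu in busy_processes(SAMPLE):
        print(f"{command:<10} {cpu:6.2f}%")
```

Dropping idle rows this way keeps the snapshot short enough to compare between a quiet period and an incident.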

     



  • 5.  RE: random high latency issues on ex2300 series switches

    Posted 12-27-2018 07:45

    Yeah, I was looking for something that isn't static, the way top continuously updates those fields.

    Maybe that isn't an option with this command. 
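
If a continuously refreshing view is not available from the CLI, one workaround is to re-run the command at a fixed interval from a monitoring host and timestamp each snapshot. A minimal Python sketch follows. `fetch` is a hypothetical caller-supplied function (for example, something that runs `show system processes extensive` over an existing SSH session), and the interval and count are arbitrary.

```python
import time
from datetime import datetime


def poll(fetch, count, interval_s=5.0, sleep=time.sleep):
    """Call fetch() `count` times, pausing interval_s between calls.

    Yields (timestamp, output) pairs so snapshots can be lined up with
    monitoring alerts afterwards. `fetch` is supplied by the caller.
    """
    for i in range(count):
        yield datetime.now(), fetch()
        if i < count - 1:
            sleep(interval_s)


if __name__ == "__main__":
    # Stand-in fetcher for demonstration; replace with a real command runner.
    def fake_fetch():
        return "snapshot text"

    for stamp, text in poll(fake_fetch, count=3, interval_s=0.1):
        print(stamp.isoformat(timespec="seconds"), text)
```

Keep the interval generous: polling the switch too often adds to the very RE load being investigated.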



  • 6.  RE: random high latency issues on ex2300 series switches

    Posted 12-29-2018 08:49

    You can drop to the shell and run top directly too.

     

    start shell

     



  • 7.  RE: random high latency issues on ex2300 series switches

     
    Posted 07-19-2019 05:01

    Good day, I am still seeing this same issue. Does anyone know why this is happening? Previously these switches were completely inaccessible; only after I picked up on some IPv6 ICMP traffic on the network and blocked it did the switches become accessible again. However, I still see this happening from time to time. I am also concerned that it might be affecting pass-through traffic as well.



  • 8.  RE: random high latency issues on ex2300 series switches

    Posted 12-05-2019 03:05

    I too have this problem. Only EX2300s are affected; even the old EX2200s soldier on untroubled. I opened an SR back in October and now have a third engineer (now at second level) looking at it. The problem is that nothing seems to happen: after a few initial messages with each tech, radio silence follows. Do you guys have any advice?



  • 9.  RE: random high latency issues on ex2300 series switches

     
    Posted 12-05-2019 05:01
    Hello all,

    In general, if we ping the switch IP itself, the ping hits the routing engine (RE) and takes CPU cycles for the switch to respond. However, let me throw out a few options; I’m confident one or another of these should help everyone 😊

    1) Please check “show system processes extensive | except 0.00” (for easier reading) while testing with ping, then troubleshoot based on whichever process is consuming CPU. For example, one simple config issue I’ve seen is “jdhcpd” spiking because the following config was left on the box:

    set interfaces irb unit 0 family inet dhcp vendor-id Juniper-ex2300-48t
    set interfaces vme unit 0 family inet dhcp vendor-id Juniper-ex2300-48t

    This may not be the problem in your case; it is just one example of a possible cause.

    2) For the EX2300, in some cases I’ve seen an incorrectly configured date cause this issue. So please configure the current date on the switch using the CLI command below.
    user@roor> set date YYYYMMDDHHMM.ss


    3) Enable "set system no-redirects":
    https://www.juniper.net/documentation/en_US/junos/topics/topic-map/icmp.html#id-understanding-the-protocol-redirect-mechanism-on-switches
    https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/no-redirects-edit-system.html

    4) Please try one of the JTAC-recommended software releases if you are not already running one, and test again, unless you have a good reason (for example, another bug fix) to use a different Junos version: https://kb.juniper.net/InfoCenter/index?page=content&id=KB21476

    5) What order of latency are you seeing? If it’s just tens of ms, that could be expected, because ICMP traffic addressed to the device itself is given low priority. Transit traffic latency should be just fine.
    https://kb.juniper.net/InfoCenter/index?page=content&id=KB27335
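
To answer the "order of latency" question consistently, it can help to summarise a batch of ping results rather than eyeball them. The Python sketch below is one way to do that. The 100 ms cut-off for "high" is an arbitrary assumption for the example, not a Juniper figure, and `None` stands for a probe that timed out.

```python
def summarize_rtts(samples_ms, high_ms=100.0):
    """Summarise ping round-trip times given in milliseconds.

    A sample of None means the probe timed out. high_ms is an arbitrary
    illustrative threshold, not a vendor-documented value.
    """
    replies = [s for s in samples_ms if s is not None]
    return {
        "sent": len(samples_ms),
        "timeouts": len(samples_ms) - len(replies),
        "high": sum(1 for s in replies if s >= high_ms),
        "max_ms": max(replies) if replies else None,
    }


if __name__ == "__main__":
    # Illustrative values only: two normal replies, one slow, one timeout.
    print(summarize_rtts([4.2, 35.0, 2150.0, None]))
```

A summary dominated by tens of milliseconds points at ordinary ICMP deprioritisation, while timeouts and multi-second replies suggest the RE is genuinely starved.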


    @HH, I’m sorry to hear that as JTAC usually ROCKS! Silence is usually a 2-sided affair, so if you feel there’s dead air on a case, ask them to get help and you will see more traction. Just my 2c.


    Hope this helps and hope you’ll like the post.


    Thanks and Regards,
    -r.



  • 10.  RE: random high latency issues on ex2300 series switches

    Posted 12-09-2019 10:38

     

    Thanks for the advice!

     

    I'm actually not concerned by longish ping delays; I'm more concerned by the periods when the switch is totally unresponsive to connections to the admin address. This happens after ping times increase (but not every time). During these periods, for example, an ssh connection to the admin IP stalls for some time. The longest such periods are on the order of one to two minutes (as reported by network monitoring). All but one of our EX2300 units run the recommended software (15.1X53-D591). Most units have 802.1x authentication and dynamic VLAN assignment configured (and this seems to be affected by both the delays and the traffic blocks). The delay/blocking behaviour is the same with or without the dot1x configuration. All switches run NTP to get the correct time, and I routinely remove the "irb unit 0 family inet dhcp vendor-id" statement from all Juniper switches.

     

    PS: Thus far I have not seen this behaviour in EX2300-MP units. They have an irritating feature of their own (fxpc process dumping core about once a week).



  • 11.  RE: random high latency issues on ex2300 series switches

    Posted 01-30-2020 08:09

    We have these issues too. We'd be happy to run the recommended code from the 18 train, but when we install it, Ethernet ports aren't recognized, don't come up, or don't stay up.

     

    JTAC has been unable to assist. 



  • 12.  RE: random high latency issues on ex2300 series switches

    Posted 01-30-2020 10:42

    Hello everyone,

     

    We resolved these issues on our EX2300s by getting rid of any unnecessary DHCP commands in the configs, as mentioned in one of the posts above. We also saw the issue from time to time because we had some DHCP trace logging running, and it would spike the CPU.

    From my findings, the key to solving this issue is to watch the resource usage on the switch. If CPU use is high, ICMP traffic will not get prioritized; I've seen DHCP start failing as well because CPU use was too high.

     

    We successfully upgraded most of our EX2300s from the 15.1 versions to 18.2R3 without issue.



  • 13.  RE: random high latency issues on ex2300 series switches

     
    Posted 12-05-2019 13:52

    If there's no impact to traffic, we can rule out a layer 2 loop or storm kind of issue.

     

    Since only ICMP (and maybe other protocols) towards the control plane is impacted, we need to check whether there is high CPU, whether the router/switch is well protected against untrusted traffic by a lo0 filter, and whether there are any policer, filter, or ARP drops. All of these need to be checked through the console while the problem is occurring.