Junos

Tail-dropped packets EX4200

‎02-12-2019 06:35 AM

Hello, I have 10 EX4200-48T switches, and I am seeing tail drops on the ge-0/0/ interfaces:

 

> show interfaces statistics | match Error | except Link-level | except "Output errors: 0" 
  Input errors: 0, Output errors: 6997
  Input errors: 0, Output errors: 428
  Input errors: 0, Output errors: 8424
  Input errors: 0, Output errors: 7107
  Input errors: 0, Output errors: 8544
  Input errors: 0, Output errors: 8731
  Input errors: 0, Output errors: 8299
  Input errors: 0, Output errors: 8225
  Input errors: 0, Output errors: 8252
  Input errors: 0, Output errors: 8402
  Input errors: 0, Output errors: 8330
  Input errors: 0, Output errors: 6992
  Input errors: 0, Output errors: 6981
  Input errors: 0, Output errors: 8136
  Input errors: 0, Output errors: 8461
  Input errors: 0, Output errors: 8093
  Input errors: 0, Output errors: 8075
  Input errors: 0, Output errors: 8265
  Input errors: 0, Output errors: 8298
  Input errors: 0, Output errors: 8380
  Input errors: 0, Output errors: 8588
  Input errors: 0, Output errors: 8429
  Input errors: 0, Output errors: 8512
  Input errors: 0, Output errors: 8582
  Input errors: 0, Output errors: 8629
Forwarding classes: 16 supported, 4 in use
Egress queues: 8 supported, 4 in use
Queue: 0, Forwarding classes: best-effort
  Queued:
  Transmitted:
    Packets              :                337719
    Bytes                :              74149542
    Tail-dropped packets :                  9669
    RL-dropped packets   :                     0
    RL-dropped bytes     :                     0

 

But some of these ports carry almost no traffic, and this happens on all of the switches:

  Input rate     : 0 bps (0 pps)
  Output rate    : 16944 bps (32 pps)
  Input errors: 0, Output errors: 9418

 

I tried changing the shared buffer size, but it did not help.

> show configuration class-of-service    
shared-buffer {
    percent 100;
}
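
To put the best-effort queue counters quoted earlier in this post in proportion, here is a minimal Python sketch that parses lines shaped like `Name : 123` and computes the share of packets that were tail-dropped. It assumes that `Packets` counts transmitted packets only, so that offered packets = transmitted + tail-dropped; that reading of the counters is an assumption, and the function names are purely illustrative.

```python
import re

# Sample copied from the queue counters quoted earlier in this post.
SAMPLE = """\
Queue: 0, Forwarding classes: best-effort
  Queued:
  Transmitted:
    Packets              :                337719
    Bytes                :              74149542
    Tail-dropped packets :                  9669
    RL-dropped packets   :                     0
    RL-dropped bytes     :                     0
"""

# Matches "Name : 123" on a single line; header lines such as "Queued:" do not match.
COUNTER_RE = re.compile(
    r"^[ \t]*([A-Za-z][A-Za-z\- ]*?)[ \t]*:[ \t]*(\d+)[ \t]*$", re.MULTILINE
)


def parse_counters(text):
    """Return {counter name: integer value} for lines shaped like 'Name : 123'."""
    return {name: int(value) for name, value in COUNTER_RE.findall(text)}


def tail_drop_fraction(counters):
    """Fraction of packets offered to the queue that were tail-dropped.

    ASSUMPTION: 'Packets' counts transmitted packets only,
    so offered = transmitted + tail-dropped.
    """
    sent = counters["Packets"]
    dropped = counters["Tail-dropped packets"]
    offered = sent + dropped
    return dropped / offered if offered else 0.0


counters = parse_counters(SAMPLE)
print(f"tail-drop fraction: {tail_drop_fraction(counters):.2%}")
```

Under that assumption the queue above lost a little under 3% of what it was offered, which is a lot for a port that averages a few dozen packets per second.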

Thank you.

17 REPLIES

Re: Tail-dropped packets EX4200

‎05-16-2019 11:46 AM

Hi,

 

Tail-dropped packets happen because of buffer exhaustion. Do you have a custom CoS configuration on the device?


Re: Tail-dropped packets EX4200

‎05-17-2019 02:52 AM

Hi,

It can also be caused by flow control. Please try disabling flow control on the link.


Re: Tail-dropped packets EX4200

‎05-19-2019 06:02 PM

You can try giving more bandwidth to the best-effort queue and see if that improves things. Also check whether other queues are taking up more bandwidth than necessary.

 

Another theory is that the switch may be hitting a bottleneck, for example traffic arriving on a 10 Gb interface and leaving through a 1 Gb interface.
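
To illustrate why such a speed mismatch can drop packets even when average utilisation is low, here is a rough Python sketch of how quickly an egress queue overflows during a line-rate burst. The buffer size used is a hypothetical placeholder, not an EX4200 specification, and the function name is illustrative.

```python
def buffer_fill_time(buffer_bytes, ingress_bps, egress_bps):
    """Seconds until a queue of buffer_bytes overflows during a sustained burst.

    Returns infinity when the egress link can keep up with the ingress rate.
    """
    excess_bps = ingress_bps - egress_bps
    if excess_bps <= 0:
        return float("inf")
    return buffer_bytes * 8 / excess_bps


# HYPOTHETICAL buffer share for one egress queue; not an EX4200 specification.
ASSUMED_QUEUE_BUFFER_BYTES = 100_000

t = buffer_fill_time(ASSUMED_QUEUE_BUFFER_BYTES, ingress_bps=10e9, egress_bps=1e9)
print(f"queue overflows after about {t * 1e6:.0f} microseconds of burst")
```

With these assumed numbers the queue fills in well under a millisecond, so the drops would never show up in per-second or per-minute rate counters.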

 


Re: Tail-dropped packets EX4200

‎06-06-2019 04:15 AM

Good Day,

 

Could you please provide the following output for one of the affected interfaces, so we can check the QoS settings:

show class-of-service interface ge-0/3/0 comprehensive

 

Thanks!


Re: Tail-dropped packets EX4200

‎09-17-2019 07:05 AM

Good day, I apologize for the late reply. Thank you all for your responses.

 

On xe-0/0/0 interface:

MAC control frames 0 0
MAC pause frames 0 0

 

And the same on all other interfaces.


Re: Tail-dropped packets EX4200

‎09-17-2019 07:07 AM

On a problem interface:

Input rate : 3847176 bps (387 pps)
Output rate : 584248 bps (371 pps)
Input errors: 0, Output errors: 2145011

 

I see the same problem on ports connected to idle servers, where the rate is about 1 pps.


Re: Tail-dropped packets EX4200

‎09-17-2019 07:09 AM
> show class-of-service interface ge-0/3/0 comprehensive 
error: command is not valid on the ex4200-48t
> show class-of-service interface ge-0/0/3    
Physical interface: ge-0/0/3, Index: 133
Maximum usable queues: 8, Queues in use: 4
  Scheduler map: <default>, Index: 2
  Congestion-notification: Disabled

  Logical interface: ge-0/0/3.0, Index: 74
Object                  Name                   Type                    Index
Classifier              ieee8021p-untrust      untrust                    16

Re: Tail-dropped packets EX4200

‎10-29-2019 03:03 AM

Good Day Roman90,

 

I strongly believe there is a configuration issue.

Since interfaces with very low outgoing traffic are suffering from drops, and this is seen on many devices, it could be that the queue simply has no dedicated bandwidth according to the configuration.

Could you please provide the class-of-service part of the configuration for review?

 

Thank you!


Re: Tail-dropped packets EX4200

‎11-01-2019 04:07 AM

Hej

It is hard to address without seeing the CoS config. Can you provide the full CoS config for the problem interface?

#show class-of-service interface

 

Do you see the problem only in the BE queue, or do other queues drop traffic as well?
>show interface <logical interface> queue

 

Also, are you monitoring your network? You might be getting bursts at certain times that cause the drops. That would explain why at other times there appears to be no traffic.
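
One way to check for bursts at certain times is to sample the drop counter at a fixed interval and look at the change between samples rather than the running total. The Python sketch below shows the idea; the sample values are made up for illustration, and how the counter is collected (CLI scrape, SNMP or otherwise) is left open.

```python
def drop_rates(samples):
    """Turn (timestamp_seconds, drop_counter) samples into per-interval drop rates.

    Returns a list of (interval_start, interval_end, drops_per_second).
    Intervals where the counter went backwards (cleared or rebooted) are skipped.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        delta = c1 - c0
        if elapsed <= 0 or delta < 0:
            continue
        rates.append((t0, t1, delta / elapsed))
    return rates


def bursty_intervals(rates, threshold):
    """Intervals whose drop rate exceeds threshold (drops per second)."""
    return [r for r in rates if r[2] > threshold]


# Made-up samples for illustration: (seconds since start, tail-drop counter).
samples = [(0, 9000), (10, 9000), (20, 9400), (30, 9400), (40, 9418)]
rates = drop_rates(samples)
print(bursty_intervals(rates, threshold=5.0))
```

If the drops cluster in a few short intervals, that points towards bursts; if they accumulate steadily, a configuration or hardware cause becomes more likely.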

Regards
Oscar


Re: Tail-dropped packets EX4200

‎02-11-2020 10:39 AM

Hi,

 

Packet congestion can be difficult to troubleshoot and resolve. Oversubscription and bursty traffic may be causing this.

 

One question: did you get better or worse results after setting 'shared-buffer' to 100% compared to the default of 50%?

 

Benjamin


Re: Tail-dropped packets EX4200

‎03-13-2020 03:50 PM

Hey Guys,

 

Just to say we're seeing the same thing too.

We were also advised to raise the shared buffer to 100%. It worked for a little while and has reduced the effect of the issue (however, it should be noted that the person who recommended it said the buffer is 95% by default, not 50%).

 

I'm guessing either the device is on its way out, or there is microbursting at play, which most devices don't handle well.


Re: Tail-dropped packets EX4200

‎03-14-2020 06:40 AM

If you can identify the ingress/egress port pairs where the congestion is occurring, you might be able to alleviate the issue by moving them to the same switch and chip in the virtual chassis, to minimize the path required and the potential for congestion.

 

Steve Puluka BSEET - Juniper Ambassador
IP Architect - DQE Communications Pittsburgh, PA (Metro Ethernet & ISP)
http://puluka.com/home

Re: Tail-dropped packets EX4200

‎03-15-2020 01:14 PM

This is something I want to try, but somehow we've ended up with a 4-member stack where 2 members are fibre only and the other 2 are copper only. All the customers are on the copper members and the 10gig uplinks are on the fibre members. I think the copper members don't have the uplink card to support 10G uplinks at the moment, and I'm not even sure one is available for those devices. Even if it were, that's not a hot-swappable part, so we'd have to turn everything off, take the card out of the fibre devices, put it in the copper devices, then power everything back on.

 

At that point, we might as well look into 4300s if the company can afford it. On the other hand, the dedicated VC ports are supposedly 32Gb/s interfaces (I'm just going off the output of "show virtual-chassis vc-port") and the 10G uplinks have ~2-3Gb/s of traffic going through them. So whilst the path is hugely inefficient, as everything has to travel through a VC port and then an uplink to get out, I don't see how we're even close to bottlenecking.

That's why it looks more like a glitch/bug/hardware issue or micro-bursting (impossible to see on graphing unless you have something called telemetry?), though it is understood that the 4200/4550's queuing system is outdated. Also, we have many other stacks in various implementations (some performing a lot of routing, some just providing pure L2 capabilities) which aren't having this issue. Most people who experience this seem to be on 15.X code, so I'm wondering if there's anything going on there...

 

Anyway, it is an interesting one. If we get the downtime, re-cabling into a more efficient setup will be interesting to try. I think it was mentioned that each 24-port section has an ASIC dedicated to it, so I was thinking of maybe distributing ports in this fashion:

ge-0/0/0 : ge-0/0/24

ge-0/0/1 : ge-0/0/25

ge-0/0/2 : ge-0/0/26

.......

ge-0/0/23 : ge-0/0/47
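
The interleaving listed above can be generated rather than typed out. The Python sketch below is only an illustration: it takes the "one ASIC per 24-port section" idea from this post as an unverified assumption, assumes ports on a 48-port member are numbered 0 to 47, and uses a made-up helper name.

```python
def interleaved_pairs(fpc=0, pic=0, ports_per_asic=24):
    """Pair port n on the first 24-port block with port n on the second block.

    ASSUMPTION (from the post, not verified): each block of 24 ports sits on
    its own ASIC, and ports are numbered 0..47.
    """
    prefix = f"ge-{fpc}/{pic}/"
    return [
        (f"{prefix}{n}", f"{prefix}{n + ports_per_asic}")
        for n in range(ports_per_asic)
    ]


for first, second in interleaved_pairs():
    print(f"{first} : {second}")
```

Passing a different `fpc` value produces the same layout for another member of the stack.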


Re: Tail-dropped packets EX4200

‎03-16-2020 02:57 AM

Since you might not be hitting traffic limits, do you have a class-of-service configuration in place where the queues might be generating the tail drops?

 

Steve Puluka BSEET - Juniper Ambassador
IP Architect - DQE Communications Pittsburgh, PA (Metro Ethernet & ISP)
http://puluka.com/home

Re: Tail-dropped packets EX4200

‎03-19-2020 05:52 PM

I think there is actually CoS on a single port.

JTAC didn't seem to mention it though.

Since we've seen CPU increase, we had a look at what was causing the main CPU load. Polling the stack seemed to make the mib2 process spike quite frequently, and for a while we thought that might be the cause. But we disabled SNMP completely: for the first 30 minutes there were no drops, then I checked back about 10 hours later and sadly some drops had appeared.

 

Re-enabling SNMP polling access to the device just increases the rate at which the drops occur. I will take a look at the CoS config on the single port tomorrow and see if anything stands out. I find it weird that a single port's config would cause basically every other active switch port to drop packets, but in the world of computers and networking a single digit or character can frequently cause catastrophically different results. It's also strange that the interface carrying the most traffic is the one with the fewest drops, and it is not the port that has been configured with CoS.

 

One weird thing to note is that there are zero drops on the 10gig uplink interfaces in either direction, and they're technically carrying the most traffic; they are, however, on the other two members, separate from the copper members that are currently experiencing issues. And whilst drops happen on almost every port, it's NOT EVERY port, but it does always appear to be the same 10 ports spread across two members.


Re: Tail-dropped packets EX4200

‎03-20-2020 10:42 AM
#Checking for config with CoS
show configuration | match "class|cos" |except login | display set
set class-of-service shared-buffer percent 100
set class-of-service interfaces ge-2/0/14 shaping-rate 200m

#Checking Interface with CoS - also matches the filter and policer names
show configuration | match "ge-2/0/14|-sanitised-|200M-firewall-filter" | display set
set interfaces ge-2/0/14 description "RESERVED QinQ - #HIDDEN#"
set interfaces ge-2/0/14 unit 0 family ethernet-switching port-mode access
set interfaces ge-2/0/14 unit 0 family ethernet-switching vlan members 957
set interfaces ge-2/0/14 unit 0 family ethernet-switching filter input -sanitised-
set class-of-service interfaces ge-2/0/14 shaping-rate 200m
set firewall family ethernet-switching filter -sanitised- term 1 then accept
set firewall family ethernet-switching filter -sanitised- term 1 then policer 200M-firewall-filter
set firewall family ethernet-switching filter -sanitised- term 2 then accept
set firewall policer 200M-firewall-filter if-exceeding bandwidth-limit 200m
set firewall policer 200M-firewall-filter if-exceeding burst-size-limit 125k
set firewall policer 200M-firewall-filter then discard

#Surprisingly, this interface has drops today - hasn't had it before (don't worry about the Half Duplex, it's a cosmetic error with the 15.1 code)

show interfaces ge-2/0/14 extensive | match "phy|speed|duplex|error|bps|flap"
Physical interface: ge-2/0/14, Enabled, Physical link is Up
  Link-level type: Ethernet, MTU: 1514, LAN-PHY mode, Speed: Auto, Duplex: Half-duplex, BPDU Error: None, MAC-REWRITE Error: None, Loopback: Disabled, Source filtering: Disabled,
  Last flapped   : 2018-09-11 06:17:45 UTC (79w3d 11:17 ago)
   Input  bytes  :           5007373448               441112 bps
   Output bytes  :            891298374                45256 bps
  Input errors:
    Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Policed discards: 0, L3 incompletes: 0, L2 channel errors: 0, L2 mismatch timeouts: 0, FIFO errors: 0, Resource errors: 0
  Output errors:
    Carrier transitions: 0, Errors: 0, Drops: 16, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    CRC/Align errors                         0                0
    FIFO errors                              0                0
        Link mode: Full-duplex, Flow control: None, Remote fault: OK, Link partner Speed: 1000 Mbps
                              %            bps     %           usec
     Input  bytes  :                    0                    0 bps
     Output bytes  :                    0                    0 bps


#In comparison, all the other interfaces that have drops
 show interfaces extensive | match drops | no-more | except "Drops: 0"
    Carrier transitions: 0, Errors: 0, Drops: 216210, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    Carrier transitions: 0, Errors: 0, Drops: 20, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    Carrier transitions: 0, Errors: 0, Drops: 9029, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    Carrier transitions: 0, Errors: 0, Drops: 297, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    Carrier transitions: 0, Errors: 0, Drops: 12361, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    Carrier transitions: 0, Errors: 0, Drops: 5609, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    Carrier transitions: 0, Errors: 0, Drops: 16, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    Carrier transitions: 0, Errors: 0, Drops: 17, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    Carrier transitions: 0, Errors: 0, Drops: 1093, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    Carrier transitions: 0, Errors: 0, Drops: 6989, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    Carrier transitions: 0, Errors: 0, Drops: 3480, Collisions: 0, Aged packets: 0, FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
    Carrier transitions: 0, Errors: 0, Drops: 31859, MTU errors: 0, Resource errors: 0

#Doubt this is important but - classifier for the CoS interface and the interface that has the most drops is the same (I think all interfaces are in this default Classifier)
 show class-of-service interface ge-2/0/14.0
  Logical interface: ge-2/0/14.0, Index: 150
Object                  Name                   Type                    Index
Classifier              ieee8021p-untrust      untrust                    16

 show class-of-service interface ae4.0
  Logical interface: ae4.0, Index: 75
Object                  Name                   Type                    Index
Classifier              ieee8021p-untrust      untrust                    16

show class-of-service classifier name ieee8021p-untrust
Classifier: ieee8021p-untrust, Code point type: ieee-802.1, Index: 16
  Code point         Forwarding class                    Loss priority
  000                best-effort                         low
  001                best-effort                         low
  010                best-effort                         low
  011                best-effort                         low
  100                best-effort                         low
  101                best-effort                         low
  110                best-effort                         low
  111                best-effort                         low

Looks like we're going to have to start reading the CoS books to get this to work the way we want. We may try disabling the CoS we have in place now, just to see if it affects it :)
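
As a side note on the policer shown earlier in this post (bandwidth-limit 200m, burst-size-limit 125k), here is a rough Python sketch of what those two numbers imply. It assumes that the k/m/g suffixes are decimal multipliers, that bandwidth-limit is in bits per second and burst-size-limit is in bytes; the helper names are illustrative. The policer acts on input traffic, so this arithmetic does not by itself explain the output drops.

```python
def parse_junos_quantity(text):
    """Convert a value such as '200m' or '125k' to a number.

    ASSUMPTION: k, m and g are decimal multipliers (1e3, 1e6, 1e9).
    """
    multipliers = {"k": 1e3, "m": 1e6, "g": 1e9}
    suffix = text[-1].lower()
    if suffix in multipliers:
        return float(text[:-1]) * multipliers[suffix]
    return float(text)


def burst_duration(bandwidth_bps, burst_bytes):
    """Seconds of traffic at the policed rate that the burst allowance covers."""
    return burst_bytes * 8 / bandwidth_bps


def time_to_exhaust(bandwidth_bps, burst_bytes, arrival_bps):
    """Seconds a full bucket lasts against traffic arriving at arrival_bps."""
    excess = arrival_bps - bandwidth_bps
    return float("inf") if excess <= 0 else burst_bytes * 8 / excess


rate = parse_junos_quantity("200m")    # bandwidth-limit, bits per second
burst = parse_junos_quantity("125k")   # burst-size-limit, bytes
print(f"burst allowance: {burst_duration(rate, burst) * 1e3:.2f} ms at the policed rate")
print(f"bucket empties in {time_to_exhaust(rate, burst, 1e9) * 1e3:.2f} ms at 1 Gbps")
```

Under these assumptions the allowance is a few milliseconds of traffic, so a sender bursting at the 1 Gbps link speed would start being discarded by the policer almost immediately.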


Re: Tail-dropped packets EX4200

‎03-25-2020 05:37 AM

Nothing worked in the end.

Disabling CoS just made it worse, and drops started happening on interfaces where they weren't previously happening.

We're going to try and upgrade to 4300s.

Code 12.1 works fine, but I doubt I'd be able to get a downgrade approved. The latest 15.x SR code mentions nothing about this issue, so it is probably not addressed, or not actually a defect (micro-bursts etc). Pinpointing the actual issue is going to be too difficult (it is extremely sporadic and drops happen across multiple ports).

It is annoying that the device doesn't look like it is being maxed out, but it's probably getting overwhelmed at sub-millisecond timescales; poor thing :)

Thanks for the suggestions.