
High routing engine CPU because of snmp / mib2d process

‎05-01-2020 02:04 AM

Hello,

 

I have an MX5 router with a significant number of interfaces (IFLs).

That may be why my NMS reports a constant CPU utilization of 100%.

Regardless of the SNMP filter I'm using (shown below; it leaves only the physical interfaces visible), there are times when an interface-description SNMP MIB walk takes forever.

 

The routing engine doesn't always show a high percentage in the NMS; the graph is coarse because the NMS only polls every 5 minutes.

 

'show chassis routing-engine' consistently shows User at 74% and Kernel at 22%.

 

What can I check or fix?

 

beelze@ams-nik-er2> show configuration snmp 
location "[52.355980, 4.950350] // Nikhef, Science Park 105, Amsterdam, the Netherlands";
filter-interfaces {
    interfaces {
        cbp0;
        demux0;
        gre0;
        tap;
        gre;
        ipip;
        pime;
        pimd;
        mtun;
        pip0;
        dsc;
        irb;
        pp0;
        lsi;
        ip-*;
        lt-*;
        mt-*;
        pe-*;
        pfe-*;
        pfh-*;
        ut-*;
        vt-*;
        ".*\.32767";
        ".*\.16384";
        pd-*;
        lc-*;
        ".*\.32768";
        ".*\.16386";
        ".*\.16385";
        jsrv.*;
        esi;
        fxp0;
        "!(ge-.*/[0-9]$|ge-.*/1[0-9]$|xe-.*/[0-9]$|xe-.*/1[0-9]$|ae[0-9]$|lo0$|^ge-1/0/6.287|^ge-1/0/6.288)";
    }
    all-internal-interfaces;
}
filter-duplicates;
community "ComSav3311!!" {
    authorization read-only;
    client-list-name MANAGEMENT;
}
beelze@ams-nik-er2> show system processes extensive | except 0.00    
last pid: 33774;  load averages:  4.08,  3.79,  3.53  up 762+06:50:43    11:01:54
165 processes: 8 running, 129 sleeping, 28 waiting

Mem: 1413M Active, 168M Inact, 227M Wired, 63M Cache, 112M Buf, 111M Free
Swap: 2821M Total, 2821M Free


  PID USERNAME         THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
 1739 root               1  76    0   125M   110M RUN    4556.7 45.17% mib2d
33620 root               1  76    0   132M 11052K RUN      0:15 11.74% mgd
33614 root               1  76    0  8684K  3580K RUN      0:07  5.00% sshd
 1817 root               1   4    0   467M   418M kqread 490.4H  2.49% rpd
 1924 root               1  43    0 94364K 52012K select 193.5H  1.95% dcd
 1422 root               1  70    0 16012K  8804K RUN    558.5H  1.90% eventd
 1836 root               3  63    0   118M 65276K sigwai 268.2H  1.86% jpppd
   11 root               1 -56 -159     0K    16K WAIT   252.5H  1.03% swi2: netisr 0
33618 oxidized-comsav    1  57    0 53944K 42800K select   0:03  0.98% cli
 1669 root              11  42    0 21196K 11452K ucond  582.2H  0.98% clksyncd
 1859 root               1  58    0 37220K 29888K select 1343.6  0.83% snmpd
15662 root               3  60    0 95796K 56572K sigwai  77.8H  0.15% pppoed
 1663 root               1  43    0 52200K 44724K select 228.2H  0.10% ppmd
 1832 root               1  43    0 82516K 32756K select 142.5H  0.05% jdhcpd
 1682 root               1  41    0 16076K  9696K select  23.8H  0.05% license-check
33082 root               1  42    0  8680K  3612K select   0:01  0.05% sshd
beelze@ams-nik-er2> show chassis routing-engine    
Routing Engine status:
    Temperature                 42 degrees C / 107 degrees F
    CPU temperature             51 degrees C / 123 degrees F
    DRAM                      2048 MB (2048 MB installed)
    Memory utilization          92 percent
    CPU utilization:
      User                      74 percent
      Background                 0 percent
      Kernel                    22 percent
      Interrupt                  4 percent
      Idle                       0 percent
    Model                          RE-MX5-T
    Serial ID                      S/N CABS8179
    Start time                     2018-03-31 04:11:41 CEST
    Uptime                         762 days, 6 hours, 51 minutes, 19 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       4.52       4.06       3.66
6 REPLIES

Re: High routing engine CPU because of snmp / mib2d process

‎05-01-2020 09:18 AM

Hi,

 

Greetings!

Can you share the output of 'show snmp stats-response-statistics'? Also, see whether you can limit polling to only the IFDs (physical interfaces), not all the IFLs.

The MX5 has a PPC (PowerPC) routing engine, so polling all of the scaled IFLs will definitely show high CPU during SNMP walks.
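One way to do that (a sketch only, reusing the regex-style filter-interfaces syntax from the original post; the exact pattern should be verified in a lab) is to filter every logical unit, since IFL names always contain a dot:

```
snmp {
    filter-interfaces {
        interfaces {
            /* hypothetical pattern: hides any interface name containing
               a dot, i.e. every logical unit (IFL) */
            ".*\..*";
        }
    }
}
```

With the IFLs filtered out, the NMS walk only touches the physical interfaces (IFDs).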

 

https://www.juniper.net/documentation/en_US/junos/topics/reference/command-summary/show-snmp-stats-r...

 

Thanks


Re: High routing engine CPU because of snmp / mib2d process

[ Edited ]
‎05-04-2020 12:48 AM

Here you go:

 

beelze@ams-nik-er2> show snmp stats-response-statistics    

Average response time statistics:
Stats                Stats                    Average
Type                 Responses                Response
                                              Time (ms)
ifd(non ae)          99547                    110.00
ifd(ae)              10956                    24.00
ifl(non ae)          10770                    37.56
ifl(ae)              46226                    142.69
firewall             677                      38562.82

Bucket statistics:
Bucket               Stats
Type(ms)             Responses
0 - 10               148282               
11 - 50              15085                
51 - 100             4074                 
101 - 200            605                  
201 - 500            86                   
501 - 1000           17                   
1001 - 2000          2                    
2001 - 5000          2                    
More than 5001       23                   

Bad responses:
Response        Request                Stats          Key
                Time                   Type
(ms)            (UTC)
10320.26        2020-05-01 14:34:03    firewall       25Mb
10319.96        2020-05-01 14:34:03    firewall       5Mb
10319.71        2020-05-01 14:34:03    firewall       ACCEPT-PPPOE-ONLY-IN
10319.45        2020-05-01 14:34:03    firewall       ACCEPT-PPPOE-ONLY-OUT
10311.83        2020-05-01 14:34:03    firewall       PROTECT-ROUTER-v4
10301.85        2020-05-01 14:34:03    firewall       PartnerCNE-Data_Down
10301.59        2020-05-01 14:34:03    firewall       PartnerCNE-Data_Up
10301.34        2020-05-01 14:34:03    firewall       PartnerCNE-Voice_Down
10301.11        2020-05-01 14:34:03    firewall       PartnerCNE-Voice_Up
10296.41        2020-05-01 14:34:03    firewall       SilverMobilityKatwijk
10291.72        2020-05-01 14:34:03    firewall       l3vpn-horizon
10291.47        2020-05-01 14:34:03    firewall       l3vpn-libernet
10291.22        2020-05-01 14:34:03    firewall       l3vpn-mica
10290.99        2020-05-01 14:34:03    firewall       police-2M-xe-1/3/0.11427-i
10290.75        2020-05-01 14:34:03    firewall       police-2M-xe-1/3/0.11427-o
10290.52        2020-05-01 14:34:03    firewall       police-48M-xe-1/3/0.10427-i
10290.28        2020-05-01 14:34:03    firewall       police-48M-xe-1/3/0.10427-o
10286.56        2020-05-01 14:34:03    firewall       urpf-filter4
10286.15        2020-05-01 14:34:03    firewall       urpf-filter6
10285.91        2020-05-01 14:34:03    firewall       __default_bpdu_filter__

 

Is there a way to restrict the NMS from polling firewall filters?

That seems to be the bottleneck here.


Re: High routing engine CPU because of snmp / mib2d process

‎05-05-2020 05:39 AM

Bump.

 

I tried to block the firewall MIB with the view configuration below.

Unfortunately, it doesn't really change the routing engine CPU %.

 

view test {
    oid .1 include;
    oid 1.3.6.1.4.1.2636.3.5 exclude;
}
community "abc123456789" {
    view test;
}
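One detail worth double-checking (an observation based on the outputs in this thread, not a confirmed fix): the test view above is attached to community "abc123456789", while the earlier `show configuration snmp` output shows the NMS community "ComSav3311!!". For the exclusion to take effect, the view has to be applied to the community the NMS actually polls with. A sketch (the view name is illustrative):

```
view no-firewall {
    oid .1 include;
    oid .1.3.6.1.4.1.2636.3.5 exclude;    /* firewall MIB subtree, as in the test view */
}
community "ComSav3311!!" {
    view no-firewall;
    authorization read-only;
    client-list-name MANAGEMENT;
}
```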

Re: High routing engine CPU because of snmp / mib2d process

‎06-04-2020 09:26 PM

Hello Beeelzebub,

 

Good day!
I happened to come across this post and just wanted to understand the following:

 

1. Are any bulk SNMP requests being sent? If yes, limit each poll to a smaller number of OIDs rather than bulk requests. Please check this out: https://kb.juniper.net/InfoCenter/index?page=content&id=KB30713&cat=EX3300&actp=LIST

 

2. Do you see any continuous snmp related log messages in the logs?

 

3. Are the processes continuously high? Run the command below to check whether mib2d keeps using 45.17% of the CPU:
show system processes extensive | refresh 10
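On the NMS side, point 1 can be approximated with the net-snmp command-line tools by capping max-repetitions per GETBULK (a sketch; the hostname and community are placeholders, and -Cr is net-snmp's max-repetitions option for snmpbulkwalk):

```
# Request at most 5 varbinds per GETBULK response instead of the tool default
snmpbulkwalk -v2c -c <community> -Cr5 <router> IF-MIB::ifDescr
```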

 

Can you possibly try deactivating the SNMP configuration on the device to see how the CPU behaves?


Regards,
Vishaal



Re: High routing engine CPU because of snmp / mib2d process

‎06-05-2020 12:27 AM

Hello Beeelzebub,

 

beelze@ams-nik-er2> show system processes extensive | except 0.00    
last pid: 33774;  load averages:  4.08,  3.79,  3.53  up 762+06:50:43    11:01:54
165 processes: 8 running, 129 sleeping, 28 waiting

Mem: 1413M Active, 168M Inact, 227M Wired, 63M Cache, 112M Buf, 111M Free
Swap: 2821M Total, 2821M Free


  PID USERNAME         THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
 1739 root               1  76    0   125M   110M RUN    4556.7 45.17% mib2d
33620 root               1  76    0   132M 11052K RUN      0:15 11.74% mgd

 

From the outputs above, I see that mib2d is utilizing 45.17% of the CPU, and memory is tight: 92 percent utilization with only 111M free (swap is still entirely unused).

 

In the routing engine output, I also see 0% idle CPU:

 

    CPU utilization:
      User                      74 percent
      Background                 0 percent
      Kernel                    22 percent
      Interrupt                  4 percent
      Idle                       0 percent

Check the below Troubleshooting Checklist - Routing Engine High CPU:

https://kb.juniper.net/InfoCenter/index?page=content&id=KB26261 

 

I see that SNMP is configured on the device, and its polling may be consuming CPU.

 

Can you disable SNMP monitoring and manually check the CPU usage for a while?

 

A bulk SNMP walk is bound to increase CPU on this router, and polling a lot of data in a short time can spike the "mgd"/CLI process too. It's better to probe only for critical alarms/events (interface down, chassis/system alarms, etc.), use a less aggressive polling interval (if you poll every 5 minutes, try 10, for example), and limit each poll to a smaller number of OIDs rather than bulk requests. Please check this out:

https://kb.juniper.net/InfoCenter/index?page=content&id=KB30713&cat=EX3300&actp=LIST
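If the Junos release on this MX5 supports it (an assumption worth verifying against the documentation for this release), lengthening the SNMP statistics cache may also help, since repeated polls inside the cache window are answered from cache instead of re-querying the line cards:

```
snmp {
    /* hypothetical value: serve repeated stats polls from cache
       for 30 seconds instead of the short default */
    stats-cache-lifetime 30;
}
```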

 

The best practice is to stay on the recommended Junos release to avoid known issues, but I believe a CPU utilization spike is to be expected during a bulk SNMP walk. If polling is limited to what's critical, we should be alright:

https://kb.juniper.net/InfoCenter/index?page=content&id=KB21476&actp=METADATA#ex_series

 

I hope this helps.

 

Best Regards,

Lingabasappa H


Re: High routing engine CPU because of snmp / mib2d process

‎06-05-2020 04:48 AM

Hi Beeelzebub,

 

Greetings!

If bulk requests are being sent and received between the NMS and the agent, a high CPU spike can be observed on the device.

You can see every request sent and received in the logs if traceoptions are enabled under SNMP.

You can check the docs below:

show log snmpd

https://kb.juniper.net/InfoCenter/index?page=content&id=KB30713&cat=EX3300&actp=LIST
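A minimal traceoptions sketch for that (the file name and size limits are illustrative; flag pdu logs each request/response PDU):

```
snmp {
    traceoptions {
        file snmpd size 1m files 3;
        flag pdu;
    }
}
```

The resulting trace can then be read with show log snmpd, as mentioned above.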

 

Meanwhile, from the output I can see that mib2d is utilizing around 45.17%:

beelze@ams-nik-er2> show system processes extensive | except 0.00    
last pid: 33774;  load averages:  4.08,  3.79,  3.53  up 762+06:50:43    11:01:54
165 processes: 8 running, 129 sleeping, 28 waiting

Mem: 1413M Active, 168M Inact, 227M Wired, 63M Cache, 112M Buf, 111M Free
Swap: 2821M Total, 2821M Free


  PID USERNAME         THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
 1739 root               1  76    0   125M   110M RUN    4556.7 45.17% mib2d

Check the troubleshooting steps below:

https://kb.juniper.net/InfoCenter/index?page=content&id=KB26261

Meanwhile, kindly disable SNMP monitoring and observe the CPU for a while.

 

 


 

Regards,

Deeksha P
