Hey community,
We have a non-mixed VC with 5x EX4300 series switches. For roughly 20 days the CPU usage has been higher than it should be. During this period we migrated from one NetApp to another, and we assumed the migration was causing the additional utilization.
Unfortunately, the migration is now complete and the high CPU usage is still there.
After some research I found the following KBs:
https://kb.juniper.net/InfoCenter/index?page=content&id=KB31605&cat=&actp=LIST
https://kb.juniper.net/InfoCenter/index?page=content&id=KB26261
All VC members are connected via DAC cables, and there are currently only 3 SFP connections (2x 10G, 1x 1G) on the EX4300-32F.
Here is some information about the REs:
root@vx-core-b6-01> show chassis routing-engine
Routing Engine status:
Slot 0:
Current state Master
Temperature 63 degrees C / 145 degrees F
CPU temperature 63 degrees C / 145 degrees F
DRAM 2048 MB
Memory utilization 70 percent
5 sec CPU utilization:
User 65 percent
Background 0 percent
Kernel 27 percent
Interrupt 8 percent
Idle 0 percent
Model EX4300-48T
Serial ID PE3717480060
Start time 2019-04-15 06:16:00 UTC
Uptime 73 days, 1 hour, 57 minutes, 26 seconds
Last reboot reason Router rebooted after a normal shutdown.
Load averages: 1 minute 5 minute 15 minute
1.55 1.52 1.25
Routing Engine status:
Slot 1:
Current state Backup
Temperature 56 degrees C / 132 degrees F
CPU temperature 56 degrees C / 132 degrees F
DRAM 2048 MB
Memory utilization 51 percent
5 sec CPU utilization:
User 15 percent
Background 0 percent
Kernel 13 percent
Interrupt 0 percent
Idle 72 percent
Model EX4300-48T
Serial ID PE3717480078
Start time 2019-04-15 06:16:00 UTC
Uptime 73 days, 1 hour, 57 minutes, 26 seconds
Last reboot reason Router rebooted after a normal shutdown.
Load averages: 1 minute 5 minute 15 minute
0.52 0.41 0.35
After checking the usage of the individual processes via "top", I got the following output:
last pid: 72354; load averages: 1.11, 0.97, 0.99 up 73+01:55:52 08:11:22
71 processes: 2 running, 68 sleeping, 1 zombie
CPU states: 34.6% user, 0.0% nice, 20.0% system, 4.0% interrupt, 41.4% idle
Mem: 1070M Active, 88M Inact, 144M Wired, 324M Cache, 112M Buf, 235M Free
Swap:
PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
1787 root 2 -52 -52 564M 209M select 549.0H 38.09% pfex_junos
72354 root 1 44 0 23784K 2280K RUN 0:00 3.00% top
1781 root 1 43 0 67352K 32676K select 36.9H 0.98% chassisd
1844 root 1 42 0 54488K 36532K select 20.6H 0.00% l2ald
1865 root 1 40 0 196M 99344K select 751:58 0.00% authd
1842 root 1 40 0 59072K 38708K select 552:06 0.00% mib2d
1338 root 1 41 0 28700K 14288K select 411:25 0.00% eventd
1876 root 2 4 0 14064K 3276K select 406:15 0.00% repd
1861 root 1 4 0 65700K 37552K kqread 363:04 0.00% l2cpd
1847 root 1 40 0 41232K 23148K select 349:52 0.00% pfed
1864 root 2 40 0 108M 28624K select 341:01 0.00% jdhcpd
1877 root 1 4 0 77560K 31632K kqread 294:27 0.00% mcsnoopd
1788 root 1 4 0 36024K 14444K kqread 276:43 0.00% chassism
1878 root 1 41 0 26456K 14864K select 246:50 0.00% license-check
1841 root 1 40 0 35760K 25056K select 177:30 0.00% snmpd
1873 root 1 4 0 75288K 31036K kqread 129:59 0.00% dot1xd-secure
1852 root 1 40 0 25448K 14652K select 113:30 0.00% ppmd
1786 root 1 4 -20 57788K 23800K kqread 93:34 0.00% vccpd
1870 root 1 40 0 21508K 8208K select 66:33 0.00% shm-rtsdbd
1867 root 1 40 0 19724K 6224K select 64:56 0.00% bdbrepd
1850 root 2 40 0 34244K 19512K select 39:06 0.00% cosd
1843 root 2 4 0 155M 58804K kqread 30:22 0.00% rpd
1836 root 1 40 0 2752K 1280K select 30:07 0.00% bslockd
1838 root 1 40 0 24552K 14424K select 27:23 0.00% alarmd
1785 root 1 40 0 12856K 5404K select 20:16 0.00% irsd
1860 root 1 40 0 28476K 15748K select 20:10 0.00% lacpd
1846 root 1 40 0 3476K 1500K select 15:32 0.00% tnp.sntpd
1869 root 1 40 0 27044K 10016K select 13:05 0.00% smid
1857 root 1 40 0 22096K 12032K select 11:19 0.00% bfdd
1889 root 1 40 0 64788K 24164K select 10:39 0.00% dcd
1882 root 1 40 0 23152K 12384K select 10:18 0.00% pkid
1863 root 1 40 0 22992K 12980K select 9:52 0.00% lfmd
1862 root 1 40 0 21740K 11848K select 5:33 0.00% oamd
1840 root 1 40 0 9792K 8844K select 5:07 0.00% ntpd
1851 root 1 40 0 35036K 16556K select 4:59 0.00% kmd
1859 root 1 40 0 61360K 18220K select 4:50 0.00% dfcd
1849 root 1 40 0 28256K 14760K select 4:16 0.00% rmopd
1848 root 1 40 0 21432K 11936K select 3:59 0.00% ilmid
1845 root 1 25 -15 21572K 11832K select 3:54 0.00% apsd
1839 root 1 40 0 21436K 11088K select 3:51 0.00% craftd
1853 root 1 40 0 13228K 5644K select 2:56 0.00% lmpd
1783 root 1 42 0 7168K 2596K select 2:06 0.00% inetd
1782 root 1 40 0 58424K 28672K select 1:44 0.00% mgd
2141 root 1 40 0 74804K 41744K select 1:03 0.00% cli
1855 root 1 40 0 42212K 25888K select 0:59 0.00% dfwd
1623 root 1 8 0 2324K 684K nanslp 0:22 0.00% cron
2152 root 1 40 0 59232K 13352K select 0:11 0.00% mgd
1789 root 1 8 0 2724K 1292K nanslp 0:10 0.00% getty
1790 root 1 8 0 2724K 1304K nanslp 0:10 0.00% getty
1835 root 1 40 0 2328K 952K select 0:08 0.00% watchdog
1858 root 1 40 0 16336K 6376K select 0:06 0.00% rdd
1911 root 1 40 0 2596K 1412K select 0:02 0.00% rlogind
1910 root 1 5 0 2400K 1276K ttyin 0:01 0.00% rlogin
71957 root 1 40 0 11692K 5540K select 0:00 0.00% sshd
1913 root 1 8 0 8908K 3336K wait 0:00 0.00% login
1868 root 1 4 0 14476K 5272K select 0:00 0.00% sendd
1866 root 1 4 0 47956K 5176K select 0:00 0.00% mplsoamd
71964 root 1 40 0 4888K 3180K pause 0:00 0.00% csh
1914 root 1 4 0 2400K 300K sbwait 0:00 0.00% rlogin
In addition to the top output, here is the output of "show system processes extensive":
root@vx-core-b6-01> show system processes extensive
last pid: 72363; load averages: 1.58, 1.15, 1.06 up 73+01:56:56 08:12:26
180 processes: 6 running, 152 sleeping, 22 waiting
Mem: 1101M Active, 87M Inact, 145M Wired, 324M Cache, 112M Buf, 203M Free
Swap:
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU COMMAND
1787 root 76 0 564M 209M RUN 549.0H 39.01% pfex_junos
1787 root -52 -52 564M 209M ll_blo 549.0H 29.44% pfex_junos
1844 root 72 0 54488K 36532K RUN 20.6H 5.18% l2ald
45 root 12 0 0K 16K sleep 44.5H 4.25% netdaemon
10 root 155 52 0K 16K RUN 895.6H 3.37% idle
72358 root 41 0 74716K 41332K select 0:01 2.25% cli
35 root -60 -163 0K 16K WAIT 26.9H 1.76% swi1: ipfwd
1781 root 48 0 67352K 32676K RUN 36.9H 1.07% chassisd
11 root -56 -159 0K 16K ll_blo 17.8H 0.93% swi2: netisr 0
2 root -16 0 0K 16K jfe_jo 973:19 0.00% jfe_job_0_0
1865 root 40 0 196M 99344K select 751:59 0.00% authd
2135 root 4 0 0K 16K select 714:08 0.00% ppt_07_80000005
1842 root 41 0 59072K 38708K select 552:06 0.00% mib2d
23 root -68 -171 0K 16K WAIT 516:06 0.00% irq41: besw0
12 root -36 -139 0K 16K RUN 476:38 0.00% swi7: clock
1338 root 41 0 28700K 14288K select 411:26 0.00% eventd
1876 root 40 0 14064K 3276K select 406:15 0.00% repd
1876 root 4 0 14064K 3276K select 406:15 0.00% repd
1861 root 4 0 65700K 37552K kqread 363:05 0.00% l2cpd
1847 root 40 0 41232K 23148K select 349:52 0.00% pfed
1864 root 40 0 108M 28624K select 341:01 0.00% jdhcpd
1864 root 40 0 108M 28624K select 341:01 0.00% jdhcpd
1877 root 4 0 77560K 31632K kqread 294:27 0.00% mcsnoopd
1788 root 4 0 36024K 14444K kqread 276:43 0.00% chassism
1878 root 42 0 26456K 14864K select 246:50 0.00% license-check
1841 root 40 0 35772K 25068K select 177:30 0.00% snmpd
14 root -16 0 0K 16K - 149:52 0.00% rand_harvestq
8 root 8 0 0K 16K - 133:23 0.00% thread taskq
1873 root 4 0 75288K 31036K kqread 129:59 0.00% dot1xd-secure
1852 root 40 0 25448K 14652K select 113:30 0.00% ppmd
1918 root 4 0 0K 16K select 106:36 0.00% ppt_11_80000013
36 root -16 0 0K 16K client 101:42 0.00% ifstate notify
1919 root 4 0 0K 16K select 96:56 0.00% ppt_11_80000014
1786 root 4 -20 57788K 23800K kqread 93:34 0.00% vccpd
1940 root 4 0 0K 16K select 91:50 0.00% ppt_11_80000010
1922 root 4 0 0K 16K select 79:59 0.00% ppt_11_80000011
1921 root 4 0 0K 16K select 74:58 0.00% ppt_11_80000012
1870 root 40 0 21508K 8208K select 66:33 0.00% shm-rtsdbd
1867 root 40 0 19724K 6224K select 64:56 0.00% bdbrepd
1850 root 40 0 34244K 19512K select 39:06 0.00% cosd
1850 root 40 0 34244K 19512K select 39:06 0.00% cosd
3 root -16 0 0K 16K jfe_jo 35:18 0.00% jfe_job_1_0
1843 root 4 0 155M 58804K kqread 30:22 0.00% rpd
1843 root 4 0 155M 58804K kqread 30:22 0.00% rpd
1836 root 40 0 2752K 1280K select 30:07 0.00% bslockd
1838 root 40 0 24552K 14424K select 27:23 0.00% alarmd
1785 root 40 0 12856K 5404K select 20:16 0.00% irsd
1860 root 40 0 28476K 15748K select 20:10 0.00% lacpd
1846 root 40 0 3476K 1500K select 15:32 0.00% tnp.sntpd
63 root 12 0 0K 16K - 14:29 0.00% schedcpu
41 root -4 0 0K 16K syncer 14:13 0.00% syncer
38 root 155 52 0K 16K pgzero 13:54 0.00% pagezero
1869 root 40 0 27044K 10016K select 13:05 0.00% smid
42 root 40 0 0K 16K vnlrum 11:59 0.00% vnlru_mem
1857 root 40 0 22096K 12032K select 11:19 0.00% bfdd
1889 root 40 0 64788K 24164K select 10:39 0.00% dcd
1882 root 40 0 23152K 12384K select 10:18 0.00% pkid
1863 root 40 0 22992K 12980K select 9:52 0.00% lfmd
57 root -16 0 0K 16K psleep 8:16 0.00% vmkmemdaemon
4 root -8 0 0K 16K - 5:53 0.00% g_event
5 root -8 0 0K 16K - 5:48 0.00% g_up
1862 root 40 0 21740K 11848K select 5:33 0.00% oamd
1840 root 40 0 9792K 8844K select 5:07 0.00% ntpd
1851 root 40 0 35036K 16556K select 4:59 0.00% kmd
6 root -8 0 0K 16K - 4:54 0.00% g_down
1859 root 40 0 61360K 18220K select 4:50 0.00% dfcd
1849 root 40 0 28256K 14760K select 4:16 0.00% rmopd
1848 root 40 0 21432K 11936K select 3:59 0.00% ilmid
1845 root 25 -15 21572K 11832K select 3:54 0.00% apsd
1839 root 40 0 21436K 11088K select 3:51 0.00% craftd
43 root -16 0 0K 16K sdflus 3:41 0.00% softdepflush
27 root -84 -187 0K 16K WAIT 3:28 0.00% irq96: fman0
21 root -68 -171 0K 16K WAIT 3:01 0.00% irq38: i2c0 i2c1
1853 root 40 0 13228K 5644K select 2:56 0.00% lmpd
1783 root 52 0 7168K 2596K select 2:06 0.00% inetd
9 root 8 0 0K 16K - 1:57 0.00% kqueue taskq
1782 root 40 0 58424K 28672K select 1:44 0.00% mgd
54 root 40 0 0K 16K select 1:38 0.00% jsr_kkcm
50 root 4 0 0K 16K select 1:34 0.00% devrt_kthread
28 root -80 -183 0K 16K WAIT 1:24 0.00% irq44: ehci0
34 root -52 -155 0K 16K WAIT 1:07 0.00% swi3: ip6opt ipopt
2141 root 40 0 74804K 41744K select 1:03 0.00% cli
1855 root 40 0 42212K 25888K select 0:59 0.00% dfwd
19 root -44 -147 0K 16K WAIT 0:52 0.00% swi5: cambio
40 root -4 0 0K 16K vlruwt 0:34 0.00% vnlru
1 root 8 0 1632K 736K wait 0:33 0.00% init
53 root -16 0 0K 16K psleep 0:32 0.00% vmuncachedaemon
39 root -16 0 0K 16K psleep 0:31 0.00% bufdaemon
1623 root 8 0 2324K 684K nanslp 0:22 0.00% cron
64 root 12 0 0K 16K no_rs 0:22 0.00% rtimeshare_thr
65 root -8 0 0K 16K mdwait 0:17 0.00% md0
490 root -8 0 0K 16K mdwait 0:12 0.00% md22
2152 root 40 0 59232K 13352K select 0:11 0.00% mgd
1789 root 8 0 2724K 1292K nanslp 0:10 0.00% getty
1790 root 8 0 2724K 1304K nanslp 0:10 0.00% getty
327 root -8 0 0K 16K mdwait 0:08 0.00% md14
1835 root 40 0 2328K 952K select 0:08 0.00% watchdog
37 root -16 0 0K 16K psleep 0:07 0.00% pagedaemon
1823 root 4 0 0K 16K select 0:07 0.00% ppt_11_80000010
1858 root 40 0 16336K 6376K select 0:06 0.00% rdd
58 root 8 0 0K 16K ifscli 0:03 0.00% ifsclientclosed
1911 root 40 0 2596K 1412K select 0:02 0.00% rlogind
356 root -8 0 0K 16K mdwait 0:02 0.00% md16
1910 root 5 0 2400K 1276K ttyin 0:01 0.00% rlogin
182 root -8 0 0K 16K mdwait 0:01 0.00% md4
32 root -8 0 0K 16K usbevt 0:01 0.00% usb1
29 root -8 0 0K 16K usbevt 0:01 0.00% usb0
153 root -8 0 0K 16K mdwait 0:00 0.00% md2
454 root -8 0 0K 16K mdwait 0:00 0.00% md21
5736 root -8 0 0K 16K mdwait 0:00 0.00% md34
641 root -8 0 0K 16K mdwait 0:00 0.00% md29
317 root -8 0 0K 16K mdwait 0:00 0.00% md13
22 root -64 -167 0K 16K WAIT 0:00 0.00% swi0: uart uart
71957 root 40 0 11704K 5552K select 0:00 0.00% sshd
269 root -8 0 0K 16K mdwait 0:00 0.00% md10
1913 root 8 0 8908K 3336K wait 0:00 0.00% login
211 root -8 0 0K 16K mdwait 0:00 0.00% md6
1868 root 4 0 14476K 5272K select 0:00 0.00% sendd
1866 root 4 0 47956K 5176K select 0:00 0.00% mplsoamd
72359 root 41 0 58464K 6160K select 0:00 0.00% mgd
71964 root 40 0 4888K 3180K pause 0:00 0.00% csh
1914 root 4 0 2400K 300K sbwait 0:00 0.00% rlogin
2103 root 64 0 5120K 3056K pause 0:00 0.00% csh
1837 root 4 0 4812K 2020K select 0:00 0.00% tnetd
415 root -8 0 0K 16K mdwait 0:00 0.00% md20
1856 root 4 0 8332K 2752K select 0:00 0.00% rtspd
1791 root 8 0 2956K 1492K wait 0:00 0.00% sh
346 root -8 0 0K 16K mdwait 0:00 0.00% md15
654 root -8 0 0K 16K mdwait 0:00 0.00% md30
0 root 12 0 0K 0K WAIT 0:00 0.00% swapper
5726 root -8 0 0K 16K mdwait 0:00 0.00% md33
628 root -8 0 0K 16K mdwait 0:00 0.00% md28
386 root -8 0 0K 16K mdwait 0:00 0.00% md18
288 root -8 0 0K 16K mdwait 0:00 0.00% md11
72363 root 44 0 23560K 1944K RUN 0:00 0.00% top
240 root -8 0 0K 16K mdwait 0:00 0.00% md8
589 root -8 0 0K 16K mdwait 0:00 0.00% md25
376 root -8 0 0K 16K mdwait 0:00 0.00% md17
667 root -8 0 0K 16K mdwait 0:00 0.00% md31
3223 root 40 0 58432K 2804K select 0:00 0.00% mgd
201 root -8 0 0K 16K mdwait 0:00 0.00% md5
405 root -8 0 0K 16K mdwait 0:00 0.00% md19
680 root -8 0 0K 16K mdwait 0:00 0.00% md32
602 root -8 0 0K 16K mdwait 0:00 0.00% md26
230 root -8 0 0K 16K mdwait 0:00 0.00% md7
1854 root 8 0 2400K 1228K nanslp 0:00 0.00% smartd
563 root -8 0 0K 16K mdwait 0:00 0.00% md23
143 root -8 0 0K 16K mdwait 0:00 0.00% md1
576 root -8 0 0K 16K mdwait 0:00 0.00% md24
172 root -8 0 0K 16K mdwait 0:00 0.00% md3
615 root -8 0 0K 16K mdwait 0:00 0.00% md27
259 root -8 0 0K 16K mdwait 0:00 0.00% md9
298 root -8 0 0K 16K mdwait 0:00 0.00% md12
51 root 4 0 0K 16K select 0:00 0.00% if_pic_listen0
56 root 12 0 0K 16K condsl 0:00 0.00% delayedexecd
31 root -80 -183 0K 16K WAIT 0:00 0.00% irq45: ehci1
52 root 4 0 0K 16K purge_ 0:00 0.00% kern_pir_proc
25 root -84 -187 0K 16K WAIT 0:00 0.00% irq16: bman0 qman0+
7 root 8 0 0K 16K - 0:00 0.00% mastership taskq
49 root 4 0 0K 16K pfenoc 0:00 0.00% if_pfe_listen
48 root 4 0 0K 16K dump_r 0:00 0.00% kern_dump_proc
59 root 8 0 0K 16K - 0:00 0.00% nfsiod 0
1449 root 8 0 0K 16K crypto 0:00 0.00% crypto
60 root 8 0 0K 16K - 0:00 0.00% nfsiod 1
61 root 8 0 0K 16K - 0:00 0.00% nfsiod 2
62 root 8 0 0K 16K - 0:00 0.00% nfsiod 3
1450 root 8 0 0K 16K crypto 0:00 0.00% crypto returns
55 root -20 0 0K 16K jsr_js 0:00 0.00% jsr_jsm
30 root 8 0 0K 16K usbtsk 0:00 0.00% usbtask
17 root -28 -131 0K 16K WAIT 0:00 0.00% swi9: Giant taskq
18 root -28 -131 0K 16K WAIT 0:00 0.00% swi9: task queue
16 root -32 -135 0K 16K WAIT 0:00 0.00% swi8: +
13 root -40 -143 0K 16K WAIT 0:00 0.00% swi6: vm
33 root -48 -151 0K 16K WAIT 0:00 0.00% swi4: ip6_mismatch+
44 root -56 -159 0K 16K WAIT 0:00 0.00% swi2: ndpisr-E
46 root -56 -159 0K 16K WAIT 0:00 0.00% swi2: ndpisr-I
20 root -68 -171 0K 16K WAIT 0:00 0.00% irq25: syspld0
15 root -84 -187 0K 16K WAIT 0:00 0.00% mpfe_drv_taskq16: +
24 root -84 -187 0K 16K WAIT 0:00 0.00% irq105: bman0
26 root -84 -187 0K 16K WAIT 0:00 0.00% irq104: qman0
Can someone clarify why pfex_junos appears twice in this output and what the state "ll_blo" means? I could not find any documentation explaining it. Could there be a hung process that is causing the additional utilization?
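One thing I noticed while comparing the two outputs: both pfex_junos lines carry PID 1787, and top reports THR 2 for that PID, so they may be two threads of one process rather than two separate processes. To check this, I grouped the pasted lines by PID. Below is a minimal Python sketch, assuming the column layout shown above; the function name wcpu_by_pid and the regular expression are my own and not part of Junos.

```python
import re
from collections import defaultdict

# A few rows copied from the "show system processes extensive" output above.
SAMPLE = """\
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU COMMAND
1787 root 76 0 564M 209M RUN 549.0H 39.01% pfex_junos
1787 root -52 -52 564M 209M ll_blo 549.0H 29.44% pfex_junos
1844 root 72 0 54488K 36532K RUN 20.6H 5.18% l2ald
45 root 12 0 0K 16K sleep 44.5H 4.25% netdaemon
1781 root 48 0 67352K 32676K RUN 36.9H 1.07% chassisd
"""

# Leading PID, a username, then the first "<number>%" field (WCPU),
# followed by the command name, which may contain spaces.
ROW_RE = re.compile(r"^\s*(\d+)\s+\S+\s+.*?(\d+\.\d+)%\s+(.+?)\s*$")


def wcpu_by_pid(text):
    """Sum the WCPU column per PID and count how many rows each PID has."""
    totals = defaultdict(lambda: {"command": "", "rows": 0, "wcpu": 0.0})
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match is None:
            continue  # header and summary lines do not match
        pid = int(match.group(1))
        entry = totals[pid]
        entry["command"] = match.group(3)
        entry["rows"] += 1
        entry["wcpu"] += float(match.group(2))
    return dict(totals)


if __name__ == "__main__":
    for pid, entry in sorted(wcpu_by_pid(SAMPLE).items(),
                             key=lambda item: -item[1]["wcpu"]):
        print(f"{pid:>6} {entry['command']:<12} "
              f"rows={entry['rows']} wcpu={entry['wcpu']:.2f}%")
```

With the rows above, PID 1787 shows up with two rows whose WCPU values (39.01% and 29.44%) add up to roughly 68.45%. That would be consistent with one multi-threaded pfex_junos process, but I would appreciate confirmation.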
Thanks in advance,
Julian