SRX


Ask questions and share experiences about the SRX Series, vSRX, and cSRX.
  • 1.  RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 09-15-2011 02:27

    Hi,

     

    how can I find out which process causes this problem?

    Sep 15 11:15:19  firewall-master PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=99
    Sep 15 11:16:03  firewall-master PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=99
    Sep 15 11:16:05  firewall-master PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=91
    Sep 15 11:16:22  firewall-master PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=99
    Sep 15 11:16:41  firewall-master PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=90
    Sep 15 11:16:44  firewall-master PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=94
    Sep 15 11:16:50  firewall-master PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=99
    Sep 15 11:17:26  firewall-master PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=99
    Sep 15 11:20:04  firewall-master last message repeated 2 times
    Sep 15 11:20:09  firewall-master PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=97
    Sep 15 11:23:04  firewall-master PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=99
    Sep 15 11:23:34  firewall-master last message repeated 2 times
    Sep 15 11:24:15  firewall-master last message repeated 2 times

    It is continuously logged in the messages file, every 1-3 minutes.

     

    JunOS 10.4R6.5
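
    Note that the log refers to the data-plane CPU on FPC 0 PIC 0 (the SPU running flowd), not the Routing Engine. As far as I know, the data-plane load can be checked directly on a branch SRX with:

    user@router> show security monitoring fpc 0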



  • 2.  RE: RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 09-15-2011 02:41

     

    user@router> set task accounting on

     

    user@router> show task accounting

     

     

    The commands below show the status of the Routing Engine and its processes:

     

    user@router>show chassis routing-engine

    user@router>show system processes extensive
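
    One caveat worth adding: task accounting itself consumes Routing Engine cycles, so it should be turned off again once the data has been collected:

    user@router> set task accounting off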



  • 3.  RE: RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 09-15-2011 02:51

    Thank you for the fast answer:

     

    {primary:node0}
    root@firewall-master> show task accounting           
    Task accounting is enabled.
    
    Task                       Started    User Time  System Time  Longest Run
    Scheduler                      171        0.019        0.050        0.000
    LMP Client                      46        0.010        0.027        0.001
    Memory                           4        0.000        0.000        0.000
    BFD I/O./var/run/bfdd_con       18        0.002        0.002        0.000
    KRT                             55        0.002        0.005        0.000
    Redirect                         1            0        0.000        0.000
    MGMT_Listen./var/run/rpd_        4        0.002        0.002        0.001
    SNMP Subagent./var/run/sn       31        0.009        0.051        0.005

     

    root@firewall-master> show chassis routing-engine    
    node0:
    --------------------------------------------------------------------------
    Routing Engine status:
        Temperature                 54 degrees C / 129 degrees F
        Total memory              1024 MB Max   686 MB used ( 67 percent)
          Control plane memory     560 MB Max   370 MB used ( 66 percent)
          Data plane memory        464 MB Max   316 MB used ( 68 percent)
        CPU utilization:
          User                      10 percent
          Background                 0 percent
          Kernel                    14 percent
          Interrupt                  0 percent
          Idle                      76 percent
        Model                          RE-SRX210H
        Serial ID                      AABS9483
        Start time                     2011-09-08 00:23:50 CEST
        Uptime                         7 days, 11 hours, 24 minutes, 53 seconds
        Last reboot reason             0x1000:reboot due to panic 
        Load averages:                 1 minute   5 minute  15 minute
                                           0.14       0.21       0.27
    
    node1:
    --------------------------------------------------------------------------
    Routing Engine status:
        Temperature                 54 degrees C / 129 degrees F
        Total memory              1024 MB Max   666 MB used ( 65 percent)
          Control plane memory     560 MB Max   431 MB used ( 77 percent)
          Data plane memory        464 MB Max   232 MB used ( 50 percent)
        CPU utilization:
          User                       8 percent
          Background                 0 percent
          Kernel                     8 percent
          Interrupt                  0 percent
          Idle                      84 percent
        Model                          RE-SRX210H
        Serial ID                      AACS1294
        Start time                     2011-08-26 15:30:39 CEST
        Uptime                         19 days, 20 hours, 18 minutes, 8 seconds
        Last reboot reason             0x200:chassis control reset 
        Load averages:                 1 minute   5 minute  15 minute
                                           0.05       0.17       0.15
    {primary:node0}
    root@firewall-master> show system processes extensive | no-more 
    node0:
    --------------------------------------------------------------------------
    last pid: 58878;  load averages:  0.11,  0.19,  0.26  up 7+11:26:09    11:49:29
    127 processes: 16 running, 100 sleeping, 11 waiting
    
    Mem: 125M Active, 78M Inact, 579M Wired, 22M Cache, 112M Buf, 166M Free
    Swap:
    
    
      PID USERNAME      THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
     1162 root            5  76    0   490M 47628K select 0 221.3H 108.94% flowd_octeon_hm
       22 root            1 171   52     0K    16K RUN    0 117.0H 65.38% idle: cpu0
    58878 root            1  80    0 24092K  1464K CPU0   0   0:00  0.73% top
       24 root            1 -20 -139     0K    16K RUN    0 353:38  0.00% swi7: clock
        5 root            1 -84    0     0K    16K rtfifo 0  97:28  0.00% rtfifo_kern_recv
       23 root            1 -40 -159     0K    16K WAIT   0  90:26  0.00% swi2: net
     1399 root            1  76    0     0K    16K select 0  60:28  0.00% peer proxy
     1168 root            1  76    0  4320K   684K select 0  53:45  0.00% license-check
     1356 root            1  76    0 16248K  5064K select 0  51:09  0.00% kmd
     1385 root            1  76    0 12564K  4384K select 0  41:37  0.00% nsd
     1165 root            1  76    0  9972K  2636K select 0  30:04  0.00% jsrpd
     1181 root            1  76    0  7900K  1996K select 0  27:10  0.00% ppmd
     1167 root            1  76    0 10468K  2368K select 0  27:08  0.00% rtlogd
     1354 root            1  76    0 14324K  2904K select 0  26:43  0.00% l2ald
     1366 root            1  76    0 13088K  3112K select 0  21:36  0.00% utmd
     1359 root            3  20    0 40664K  5232K sigwai 0  17:21  0.00% authd
     1159 root            2  76    0 20116K  2700K select 0  15:57  0.00% pfed
    <snip>
    node1:
    --------------------------------------------------------------------------
    last pid: 57219;  load averages:  0.16,  0.17,  0.16  up 19+20:19:20    11:49:29
    100 processes: 16 running, 73 sleeping, 11 waiting
    
    Mem: 129M Active, 123M Inact, 588M Wired, 24M Cache, 112M Buf, 105M Free
    Swap:
    
    
      PID USERNAME      THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
     1098 root            5  76    0   491M 48176K select 0 572.1H 102.25% flowd_octeon_hm
       22 root            1 171   52     0K    16K RUN    0 340.2H 76.66% idle: cpu0
       24 root            1 -20 -139     0K    16K RUN    0 730:57  0.00% swi7: clock
        5 root            1 -84    0     0K    16K rtfifo 0 250:24  0.00% rtfifo_kern_recv
       23 root            1 -40 -159     0K    16K WAIT   0 160:25  0.00% swi2: net
      114 root            1  76    0  7276K  3116K select 0 110:02  0.00% ksyncd
     1104 root            1  76    0  4220K  1192K select 0 104:08  0.00% license-check
     1101 root            1  76    0  9956K  4412K select 0  81:33  0.00% jsrpd
     1103 root            1  76    0 10076K  4392K select 0  55:19  0.00% rtlogd
    19760 root            1   4    0     0K    16K proxy_ 0  51:20  0.00% peer proxy
     1095 root            2  76    0 20116K  6216K select 0  38:10  0.00% pfed
       26 root            1 -16    0     0K    16K -      0  28:34  0.00% yarrow
     1085 root            1  76    0  2624K  1244K select 0  27:07  0.00% bslockd
     1089 root            1  76    0  7044K  2768K select 0  24:48  0.00% alarmd
       48 root            1 -16    0     0K    16K psleep 0  20:43  0.00% vmkmemdaemon
        3 root            1  -8    0     0K    16K -      0   8:28  0.00% g_up
     1151 root            1   4    0     0K    16K pslave 0   8:25  0.00% peer proxy
        4 root            1  -8    0     0K    16K -      0   8:15  0.00% g_down
        2 root            1  -8    0     0K    16K -      0   8:13  0.00% g_event
     1152 root            1   4    0     0K    16K pslave 0   7:41  0.00% peer proxy
      131 root            1  76    0 30168K 14192K select 0   5:46  0.00% chassisd
      868 root            1  76    0  7140K  2832K select 0   5:44  0.00% eventd
       40 root            1  20    0     0K    16K vnlrum 0   5:38  0.00% vnlru_mem
       41 root            1  20    0     0K    16K syncer 0   5:23  0.00% syncer
     1127 root            1  76    0 15400K  4300K select 0   4:44  0.00% bdbrepd
      130 root            1  76    0 12672K  6664K select 0   4:31  0.00% snmpd
      118 root            1  76    0 14792K  6960K select 0   4:16  0.00% mib2d

     

     node0 is the primary for all redundancy groups... node1 has nothing to do except wait for a failover.



  • 4.  RE: RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 09-15-2011 03:16

    What version are you running on the SRX210?

     

    Also, are IDP, AV, etc. running on the box?



  • 5.  RE: RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 09-15-2011 03:23

    Hi,

     

    it is running JunOS 10.4R6.5

     

    UTM is configured with feature-profiles, but no policy has an active utm-policy attribute, so at the moment it is not in use. I had deactivated all utm-policy attributes hoping that UTM was causing the performance problem, but the issue is still logged every 1 to 3 minutes.

     

    Yes, IDP is running for most policies. Is there a command to check whether the performance issue is caused by IDP scanning?
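
    For reference, IDP status and resource usage can be inspected with commands along these lines (available on SRX in this Junos era, to the best of my knowledge):

    user@router> show security idp status
    user@router> show security idp memory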



  • 6.  RE: RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 01-04-2012 07:59

    Hello!

     

    I have the same issue with SRX240 and JunOS 10.0R3.10

    No AV, IDP or AntiSpam features enabled.

     

    show system processes extensive:

    last pid:  1566;  load averages:  4.03,  3.97,  3.01  up 0+00:19:17    01:44:51
    116 processes: 17 running, 87 sleeping, 12 waiting
    
    Mem: 127M Active, 78M Inact, 520M Wired, 162M Cache, 112M Buf, 84M Free
    Swap:
    
    
      PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
     1006 root        6   8    0   457M 41672K nanslp 0  59:43 325.54% flowd_octeon_hm
       23 root        1 -40 -159     0K    16K WAIT   0   6:50 41.31% swi2: net
       54 root        1  -8    0     0K    16K mdwait 0   1:39  0.00% md0
       19 root        1 171   52     0K    16K RUN    3   1:33  0.00% idle: cpu3
       20 root        1 171   52     0K    16K RUN    2   1:33  0.00% idle: cpu2
       21 root        1 171   52     0K    16K RUN    1   1:33  0.00% idle: cpu1
       22 root        1 171   52     0K    16K RUN    0   0:39  0.00% idle: cpu0
       24 root        1 -20 -139     0K    16K WAIT   0   0:07  0.00% swi7: clock
     1028 root        1   4    0 11240K  5728K kqread 0   0:05  0.00% eswd
     1025 root        1   4    0  6772K  3292K kqread 0   0:03  0.00% mcsnoopd
      995 root        1   4    0 36196K 18992K kqread 0   0:03  0.00% rpd
     1040 root        1  76    0 50960K 18816K select 0   0:03  0.00% mgd
        5 root        1 -84    0     0K    16K rtfifo 0   0:02  0.00% rtfifo_kern_recv

     What is flowd_octeon_hm?

     



  • 7.  RE: RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 01-04-2012 08:03

    Hi,

     

    why don't you update to 10.4R7.5?

     

    It is normal for the flowd_octeon_hm process to stay at 100%, but not at 300%. And it is not normal for the unit to have a load average of 4; it should be below 0.30.

     

    I recommend updating to the newest recommended release first.
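
    For reference, a standalone upgrade would look roughly like this (the package filename is illustrative; a chassis cluster requires the cluster upgrade procedure instead):

    user@router> request system software add /var/tmp/junos-srxsme-10.4R7.5-domestic.tgz
    user@router> request system reboot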



  • 8.  RE: RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 01-09-2012 03:52

    Hi Coolblue,

     

    The issue here is the Packet Forwarding Engine receiving more packets than it can handle. Even an STP flood can cause PFE utilization to spike. You can enable the firewall filter described in the KB at http://kb.juniper.net/InfoCenter/index?page=content&id=KB21265&actp=search&viewlocale=en_US&searchid=1326109630860

     

    That KB only permits management traffic. Since you are running IDP, you need to add another filter term to permit communication between the SRX and the Juniper IDP signature server. I think permitting ports 80 and 443 should do that.
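
    A sketch of such a term, assuming the lo0 filter from the KB is named protect-re (filter and term names are illustrative; the reply traffic from the signature server arrives with source ports 80 and 443):

    set firewall family inet filter protect-re term allow-sig-updates from protocol tcp
    set firewall family inet filter protect-re term allow-sig-updates from source-port [ 80 443 ]
    set firewall family inet filter protect-re term allow-sig-updates then accept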

     

    Hope this helps.

     

    Regards,

    Visitor

    -------------------------------------------------------------------------------------------------------

    If this post was helpful, please mark this post as an "Accepted Solution". Kudos are always appreciated!