SRX Services Gateway

High CPU for PKID process

‎04-29-2019 07:27 AM

I am running an SRX 340 on Junos 18.4R1-S1 that constantly has high CPU usage for the PKID process. I have also seen this on version 17.4 and on a vSRX running Junos 19.1. I have very little traffic going through these devices. I configured the default root CA profiles on the various devices. My guess is that the pkid process goes to update the CRLs for the CA groups and then hangs. The only way to get the PKID process to return to normal is to restart the pkid service. I'm trying to figure out the cause of this issue, or whether it's a bug.

I have attached the output of the command show system processes extensive from both an SRX 340 running 18.4R1 and a vSRX running 19.1. Any help would be appreciated.

 

SRX 340 running 18.4R1

PID   USERNAME  PRI  NICE  SIZE    RES     STATE   C  TIME    WCPU    COMMAND
9318  root      123  0     1814M   1151M   CPU3    3  912:19  92.48%  flowd_octeon_hm
9318  root      123  0     1814M   1151M   CPU1    1  912:19  92.48%  flowd_octeon_hm
9318  root      123  0     1814M   1151M   CPU2    2  912:19  92.48%  flowd_octeon_hm
3258  root      123  0     39484K  19376K  RUN     0  36.5H   68.41%  pkid
9318  root      29   0     1814M   1151M   RUN     0  912:19  12.01%  flowd_octeon_hm
1687  root      24   0     36952K  11436K  RUN     0  382:16  5.22%   eventd
9318  root      20   0     1814M   1151M   ucondt  0  912:19  0.00%   flowd_octeon_hm
9318  root      20   0     1814M   1151M   RUN     0  912:19  0.00%   flowd_octeon_hm
21    root      155  52    0K      16K     RUN     0  543:04  0.00%   idle: cpu0
20    root      155  52    0K      16K     RUN     1  77:15   0.00%   idle: cpu1
19    root      155  52    0K      16K     RUN     2  53:00   0.00%   idle: cpu2
18    root      155  52    0K      16K     RUN     3  52:48   0.00%   idle: cpu3
2072  root      20   0     146M    46856K  select  0  46:00   0.00%   authd

 

 

vSRX running 19.1

PID    USERNAME  PRI  NICE  SIZE  RES     STATE   TIME   WCPU     COMMAND
52493  root      103  0     733M  21484K  RUN     52.0H  100.00%  pkid
11     root      155  ki31  0K    16K     RUN     94.8H  0.00%    idle
13     root      -60  -     0K    864K    WAIT    25:09  0.00%    intr{swi4: clock (0)}
7358   root      20   0     714M  3524K   select  23:45  0.00%    sysctlrelayd
13     root      -72  -     0K    864K    WAIT    17:00  0.00%    intr{swi1: netisr 0}
7325   root      20   0     732M  7884K   select  10:05  0.00%    license-check
2      root      -16  -     0K    16K     jfe_jo  8:01   0.00%    jfe_job_0_0
7230   root      20   0     745M  14452K  select  7:28   0.00%    pfed
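
For anyone who wants to watch for this condition instead of checking by hand, here is a minimal Python sketch that reads pasted show system processes extensive output and pulls out the WCPU value for pkid. The function names (wcpu_for, looks_stuck), the SAMPLE text, and the 50% threshold are all assumptions made up for illustration; nothing here comes from Juniper tooling, and it only parses text you have already collected from the CLI.

```python
# Sample rows copied from the SRX 340 output in this thread.
SAMPLE = """\
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
9318 root 123 0 1814M 1151M CPU3 3 912:19 92.48% flowd_octeon_hm
3258 root 123 0 39484K 19376K RUN 0 36.5H 68.41% pkid
21 root 155 52 0K 16K RUN 0 543:04 0.00% idle: cpu0
"""


def wcpu_for(output, command="pkid"):
    """Return the WCPU percentages of every row whose COMMAND equals `command`."""
    values = []
    for line in output.splitlines():
        fields = line.split()
        # The WCPU column is the only field ending in '%'; header rows have none.
        pct_idx = next((i for i, f in enumerate(fields) if f.endswith("%")), None)
        if pct_idx is None:
            continue
        # Everything after WCPU is the command, which may contain spaces.
        if " ".join(fields[pct_idx + 1:]) == command:
            values.append(float(fields[pct_idx].rstrip("%")))
    return values


def looks_stuck(output, command="pkid", threshold=50.0):
    """True if any matching row is at or above the (arbitrary) threshold."""
    return any(v >= threshold for v in wcpu_for(output, command))


if __name__ == "__main__":
    print(wcpu_for(SAMPLE), looks_stuck(SAMPLE))
```

Because it locates the WCPU column by the trailing percent sign rather than by position, the same function should handle both the SRX 340 layout (with the C column) and the vSRX layout (without it). A single high reading does not prove the process is hung, so it makes more sense to compare several samples taken over time.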

 


Re: High CPU for PKID process

‎04-29-2019 08:01 AM

Hello RRiley,

 

You might be experiencing the issue reported in PR-1336733.

[ https://prsearch.juniper.net/InfoCenter/index?page=prcontent&id=PR1336733 ]

 

As per the PR: -

 

This issue might be seen if both of the following conditions are met:
* IPsec is in use with PKI certificates
* The intermediate CA profiles are not present on the device

 

This issue is going to be fixed in future releases.

 

Release junos
17.1R3 x
15.1X49-D150 x
17.2R3 x
18.1R3 x
18.2R2 x
18.3R1 x
18.4R1 x
17.4R2 x
17.2X75-D102 x

 

Hope this helps.

 

Thanks!


Re: High CPU for PKID process

‎04-29-2019 08:25 AM

Thank you for your reply. I'm not sure this applies to my case, since I don't have IPsec configured. I do have SSL Forward Proxy configured with the default root CA groups loaded, so maybe it is related even though I don't specifically have IPsec configured. Also, it appears to be resolved in 18.4R1, which I'm running on the SRX 340.


Re: High CPU for PKID process

‎04-30-2019 12:17 PM

Hello RRiley,

 

Despite you not using any VPN/SSL-FP, I think you are observing the issue. Besides, 18.4R1 does NOT show the issue.

 

Thanks!


Re: High CPU for PKID process

‎05-01-2019 11:37 AM

I wanted to let you know what I am currently trying.

I removed the default root CAs that the Juniper device loads when you run the command:

request security pki ca-certificate ca-profile-group load ca-group-name ca-default filename default

 

Once I did that, the pkid service went idle.

Since I am using SSL-FP, I needed to load the root CAs again.

I have since loaded the current Firefox trusted CAs. I have not seen the pkid process spike since the change. I am going to monitor it for a few days and see if the issue pops up again.


Re: High CPU for PKID process

‎05-01-2019 06:24 PM

Hi,

 

Since certificate verification is involved, it is quite likely we are hitting the same issue as mentioned in the PR.

 

Were you seeing the high pkid utilization without any traffic?

 

Regards,

 

Vikas


Re: High CPU for PKID process

‎05-02-2019 05:59 AM

Hi Vikas,

No, I did not have traffic going through it. Maybe it is the PR that should have been fixed in 18.4, which I am running. I guess it's there in 19.1 as well. I do believe it has something to do with certificate verification, and the default Juniper root CA list has old entries compared to the current Firefox one.

Solution
Accepted by topic author RRiley
‎05-30-2019 01:02 PM

Re: High CPU for PKID process

‎05-30-2019 01:02 PM

The PKID process has been good ever since switching to the Mozilla certs. I got them from the certificate bundle at

https://curl.haxx.se/docs/caextract.html
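
The bundle from that page is a single PEM file with many certificates, each preceded by comment lines. The post does not say how the file was loaded onto the SRX, so the following is only a rough Python sketch for sanity-checking such a bundle before use: it splits the text into individual certificate blocks so you can count them or write them out one by one. The function name split_pem_bundle and the SAMPLE text are hypothetical, and the payloads in SAMPLE are placeholders, not real certificates.

```python
import re

# Matches one PEM certificate block, including its BEGIN/END marker lines.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)

# Placeholder bundle in the same general layout: a name, an underline,
# then the PEM block. The base64 payloads here are dummies.
SAMPLE = """\
Example Root A
==============
-----BEGIN CERTIFICATE-----
AAAA
-----END CERTIFICATE-----

Example Root B
==============
-----BEGIN CERTIFICATE-----
BBBB
-----END CERTIFICATE-----
"""


def split_pem_bundle(text):
    """Return each certificate block found in a PEM bundle, in file order."""
    return [match.group(0) for match in PEM_CERT.finditer(text)]


if __name__ == "__main__":
    certs = split_pem_bundle(SAMPLE)
    print(len(certs), "certificate blocks found")
```

To run it against the real download, read the file contents (for example with open("cacert.pem").read(), assuming that is the name you saved it under) and pass the text to split_pem_bundle. The sketch does no validation of the certificates themselves; it only separates the blocks.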

 
