SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

  • 1.  SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-16-2014 14:08

    Hi, all,

     

    We have been using SSG5s to build IPsec tunnels to our HQ for IPMI access to our data center servers. Occasionally we need to image servers over the IPsec VPN, and we never had any problem with the SSG5. Now that the SSG series is EoS and the SRX series seems to have far better performance than the SSGs, we purchased an SRX210H to replace the SSG5 in a brand new data center. We had no problem bringing up IPsec tunnels and running BGP to the head-end ISG 2000, but when we push production traffic through the SRX210 (the IPsec VPN traffic rate is far less than the specified 85 Mbps), the SRX210H has trouble keeping BGP sessions up due to hold-time expiration, and the following message keeps popping up:

     

    PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=97

     

    I opened a JTAC case. The JTAC engineer suggested we upgrade the software to 12.1X44-D35.5, but that did not resolve the problem: as long as there is over 20 Mbps of IPsec traffic, the BGP session over the IPsec tunnel flaps.

     

    Based on your experience, could this be a simple configuration issue, or can the SRX210 really not handle the same rate of IPsec throughput that an ancient SSG5 can?



  • 2.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-16-2014 20:11

    Hi oldcreek,

     

    PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=97

     

    This indicates that data plane CPU (traffic) utilization is very high.

     

    These alerts will be seen when any of the ALG resources are used up.

     

    Check whether any ALG is being triggered unnecessarily for the VPN traffic.

     

    BGP going down could be because of 2 things:

     

    1. High control plane CPU (show chassis routing-engine)

    2. Packet drops across VPN tunnel

     

    Try adjusting the TCP MSS for IPsec VPN traffic (to rule out MTU problems) and check the status:

     

    security {
        flow {
            tcp-mss {
                ipsec-vpn {
                    mss 1420;
                }
            }
        }
    }

    As per the datasheet, VPN throughput (large packets) is 85 Mbps.

     

    http://www.juniper.net/us/en/local/pdf/datasheets/1000281-en.pdf

     

    Regards
    rparthi
     

    Please Mark My Solution Accepted if it Helped, Kudos are Appreciated Too



  • 3.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-16-2014 21:29

    Hi, rparthi,

     

    Thank you for your reply. JTAC escalation has concluded that this is a limitation of the SRX210 and there is no workaround, so we will switch back to the SSG5.



  • 4.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-16-2014 22:09

    Hi ,

     

    May I have the JTAC case number for reference?

     

    Regards,

    rparthi



  • 5.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-16-2014 22:29

    2014-0814-0787

     

    Please do let me know if you have a different opinion; we do want to move to the SRX platform.



  • 6.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-16-2014 22:46

    Hi ,

     

    I went through the case.

    It looks like the SRX is overwhelmed by a very high packets-per-second rate.

    Is it legitimate VPN traffic? Did you capture the packets and confirm that legitimate traffic is causing the high data plane CPU?

    You could have tried static routing and tested it.

    If the traffic is legitimate, then the engineer's analysis holds.

     

    Regards
    rparthi
     

    Please Mark My Solution Accepted if it Helped, Kudos are Appreciated Too



  • 7.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-16-2014 23:01

    Hi oldcreek,

     

    What type of VPN are you using, e.g. SHA/AES, route-based/policy-based, etc.? Also, are you using the same encryption on the SRX as you were on the SSG5?



  • 8.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-16-2014 23:29

    ESP, 3DES/SHA, nothing out of the ordinary. Since I am running BGP on top of it, it has to be a route-based VPN. The encryption proposals are set on the head-end ISG2000, so yes, the SSG5 and SRX are using the same encryption.



  • 9.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-16-2014 23:40

    Of course it is legitimate VPN traffic. Maybe the SRX210 is having difficulty keeping up, but it should always prioritize control traffic and selectively drop data traffic, right? Dropping data traffic makes applications slow; dropping control traffic causes an outage.

     

    While debugging with JTAC, I pumped the same amount and rate of traffic over an SSG5 (by simultaneously copying a large core dump file from the same HQ server to 4 servers behind an SSG5 in a nearby data center). SSG5 admin access became sluggish but the BGP sessions never dropped, and the transfer rate was much better than on the SRX210. The SRX210 would keep up for about a minute or so and then totally stop transmitting or receiving: CPU utilization went to zero, but I could not ping the IP address on the other side of the tunnel. At this point, manually clearing the IPsec SA brings the network back; otherwise it recovers by itself after a longer period of time.

    I fail to understand why static routing would be a solution here, not to mention that it is absolutely impossible for us to manage the network with static routes.



  • 10.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-17-2014 00:17

    Thanks for the info.

     

    Is it actually the original 210H or the newer 210HE or even 210H2 model?



  • 11.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-17-2014 00:22

    RE-SRX210HE2

    2x GE, 6x FE, 1x 3G



  • 12.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-17-2014 03:11

    Okay, so I just set up the following in my lab:

     

    - SSG 5 256MB ScreenOS 6.3

    - SRX 100B 512MB JunOS 12.1X46 (the SRX210HE2 has a faster CPU and more memory so should be even quicker).

    - Another BGP Router

     

    I connected the SSG and SRX directly via 100 Mbit and then set up a route-based 3DES/SHA1 IPsec VPN between the two.

     

    I then connected both the SSG and the SRX (using a different port on each) to another router and set up some BGP sessions. Both the SSG and the SRX received about 300 routes via BGP.

     

    I then ran a secure copy (scp) from my MacBook Air on the SSG5 side to a low-end Asus Atom box on the SRX100 side. I copied a couple of different ISO files (so large files), about 5GB in total.


    I was able to just hit 7 megabytes/sec (about 56 Mbit/s) between the two machines via the IPsec VPN.
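
    To put that scp rate next to the other figures quoted in this thread, here is a quick unit conversion. It is only a sketch: it assumes decimal units (1 megabyte = 8 megabits), and note that the 85 Mbps datasheet figure is for the SRX210, while this lab test used an SRX100.

```python
# Convert the scp rate reported above (7 megabytes/sec) to megabits/sec.
# Assumption: decimal units, so 1 megabyte = 8 megabits.

def mbytes_to_mbits(mbytes_per_sec: float) -> float:
    """Return the rate in Mbit/s for a rate given in MB/s."""
    return mbytes_per_sec * 8

lab_rate_mbit = mbytes_to_mbits(7)  # scp rate seen in the lab test
problem_rate_mbit = 20              # rate at which the original poster's BGP session flaps
datasheet_mbit = 85                 # SRX210 large-packet VPN figure quoted in the thread

print(f"lab scp rate: {lab_rate_mbit:.0f} Mbit/s")
print(f"ratio to the 20 Mbit/s problem rate: {lab_rate_mbit / problem_rate_mbit:.1f}x")
print(f"share of the 85 Mbps datasheet figure: {lab_rate_mbit / datasheet_mbit:.0%}")
```

    Under these assumptions the lab transfer ran at nearly three times the rate at which the original poster sees trouble.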

     

    While testing I watched my external BGP router and both SSG and SRX kept the BGP session going without issues.

     

    I also checked the CPU load on both devices.

     

    The SSG 5 was running at 87% CPU (get perf cpu), the command line interface was slow and the status light no longer flashed evenly. The device was only just keeping up.

     

    The SRX 100 was running at 73% CPU (show chassis forwarding) (Real-time threads CPU utilization). So about 14 percentage points lower; the command line interface was also responsive and worked correctly.

     

    Therefore I believe there is probably a configuration issue or some other bug you are hitting; you should easily be able to push 20 Mbit/sec of image files through the SRX.

     

    Can you please post your SRX configuration for us to look at? Also check that you are getting full duplex on your Ethernet connection.

     

    Thanks!



  • 13.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-17-2014 08:34

    Hi,

     

    Thank you so much for spending time on this issue. I would also like to believe there is something wrong with my configuration that the JTAC engineers might have overlooked. The configuration is attached. The other side is a NetScreen ISG2000, which has 50 or so route-based IPsec VPN tunnels configured, and I have no reason to suspect the ISG2000 configuration because it does not treat this SRX peer differently.

     

    All interfaces are in full duplex; the uplink to the Internet is ge-0/0/0, and the downlinks are fe-0/0/2 and fe-0/0/3.

     

    My setup is a little different from yours: my BGP sessions run over the st0 interface bound to the IPsec tunnel, not over a separate interface. Regarding traffic rate, I did not collect real-time traffic information; all I know is that for the same amount of scp traffic, the SSG5 was able to sustain a higher transfer rate than the SRX210 and never broke the network under load.

    Attachment: nyj-srx.txt (14 KB)


  • 14.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-17-2014 11:42
    I see you're running a cluster with what looks like a gig module installed? I wonder if that has something to do with it.

    I doubt BGP is the cause unless you have thousands of routes.

    Things to do/test:
    1) Ensure 1350 MTU is set on both sides of the VPN
    2) Please disable the existing trace and logging you have enabled. These will slow the device down a lot and should only be on for debugging.
    3) After the above try disabling the cluster.
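
    As a rough check of why the MTU/MSS value matters, the size of the encrypted packet can be estimated. This is only a sketch under stated assumptions: ESP tunnel mode over IPv4 with 3DES-CBC and HMAC-SHA1-96 (the proposal mentioned in this thread), no NAT-T, no IP options, and a 1500-byte path MTU. The function names are made up for illustration, and 1350 and 1420 are treated as TCP MSS values.

```python
import math

# Assumed sizes for ESP tunnel mode over IPv4 with 3DES-CBC and HMAC-SHA1-96
# (no NAT-T, no IP options). Illustrative assumptions, not values read from
# either device.
OUTER_IP_HEADER = 20  # new IPv4 header added in tunnel mode
ESP_HEADER = 8        # SPI (4 bytes) + sequence number (4 bytes)
IV_3DES = 8           # 3DES-CBC initialization vector
BLOCK_3DES = 8        # 3DES-CBC block size; encrypted data is padded to this
ESP_TRAILER = 2       # pad length (1 byte) + next header (1 byte)
ICV_SHA1_96 = 12      # truncated HMAC-SHA1 integrity check value

def esp_packet_size(inner_packet_len: int) -> int:
    """Return the on-the-wire size of the ESP packet carrying one inner IP packet."""
    encrypted = math.ceil((inner_packet_len + ESP_TRAILER) / BLOCK_3DES) * BLOCK_3DES
    return OUTER_IP_HEADER + ESP_HEADER + IV_3DES + encrypted + ICV_SHA1_96

def inner_len_for_mss(mss: int) -> int:
    """Inner IPv4 packet size for a full TCP segment (20-byte IP + 20-byte TCP headers)."""
    return mss + 40

for mss in (1350, 1420):
    size = esp_packet_size(inner_len_for_mss(mss))
    verdict = "fits in" if size <= 1500 else "exceeds"
    print(f"MSS {mss}: ESP packet {size} bytes, {verdict} a 1500-byte MTU")
```

    Under these assumptions a full segment at MSS 1350 becomes a 1440-byte ESP packet, while MSS 1420 gives 1512 bytes, which would have to be fragmented on a 1500-byte link.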


  • 15.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-17-2014 11:53

    Thanks. If the problem is in forwarding, I am not sure how changing the MTU will have any effect, but I will try. Trace is disabled. Although it is a cluster configuration, the standby is not actually online. Do you see anything else that could be wrong in the configuration?



  • 16.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-17-2014 15:37

    Sorry I missed the line: deactivate security flow traceoptions

     

    It looks like a pretty basic config, MTU is really something you should test though.

     

    Nothing is jumping out at me as anything that should cause your issue.

     

    If I get a chance in the next couple of days I will load up your configuration on my device and test it.

     

    You can also check: show system processes extensive



  • 17.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

     
    Posted 08-17-2014 16:47
    You may also wish to try removing the IPsec vpn-monitor options, in case CPU or circuit saturation is causing it to tear down the tunnel.


  • 18.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-17-2014 19:42

    JTAC suggested adding the VPN monitor option so that when traffic stalls, the IPsec SA can be renegotiated faster.



  • 19.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-19-2014 18:17

    Don't mean to hijack your thread, but I had a case (2012-1129-0977), opened back in Nov 2012 and open for 10 months, with a very similar issue (a flood of PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED messages). Engineers were on the box for hours taking logs and checking the CPU, etc. Eventually, they closed the case and suggested I upgrade to 12.1X44. I just recently upgraded to this version and sure enough the problem is still there. We do not run BGP, just a simple IPsec tunnel that hits about 40 Mbps between 2 SRX220s replicating a storage array. It seemed to all start when we added a firewall policer, but even after removing it we were still getting those messages. Hope you find a fix, and please post if any updates come along.



  • 20.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-19-2014 21:59

    Not at all. We spent another solid four hours with JTAC because we experienced the same issue with an SRX210 we recently deployed in a remote office (again, we used to use SSGs for similar remote offices and never had a single issue). This remote office's uplink is capped at 10Mb/s, yet we lost connection out of nowhere during business hours.

     

    I don't think the problem has anything to do with BGP. When we send (via scp) a large core dump file over IPsec from a server behind the ISG2000 (the hub VPN gateway that all remote office and remote data center management networks connect to) to another server behind the SRX210, the transfer rate starts at around 1.5Mb/s, then continuously slows down until the transfer stops completely. At that point, IP connectivity between the two servers is completely lost, and it recovers by itself after 200 pings (~400 seconds). Manually clearing the IPsec SA recovers the connection immediately. It seems to me that the encryption/decryption engine on the SRX210 somehow just stops working under a small traffic load. Between exactly the same two servers, if we scp the file over the public Internet, the transfer rate is over 20Mb/s, so there is nothing wrong with the two servers.

     

     

     



  • 21.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-19-2014 23:10

    Hi oldcreek,

     

    I haven't yet had a chance to test your config, but I should this week.

     

    Are you able to disable the firewall filters to test?

     

    Thanks!



  • 22.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-20-2014 10:35

    Yes, tried that, no difference.



  • 23.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-23-2014 13:13

    Hi,

     

    I just checked; I have a similar issue on an SRX210H.

     

    JUNOS 12.1X46-D20.5 built 2014-05-14 20:00:03 UTC

     

    Routing Engine status:
    Temperature 57 degrees C / 134 degrees F
    Total memory 1024 MB Max 696 MB used ( 68 percent)
    Control plane memory 544 MB Max 479 MB used ( 88 percent)
    Data plane memory 480 MB Max 221 MB used ( 46 percent)
    CPU utilization:
    User 19 percent
    Background 0 percent
    Kernel 14 percent
    Interrupt 0 percent
    Idle 67 percent
    Model RE-SRX210H
    Serial ID X
    Start time 2014-07-01 21:59:17 CEST
    Uptime 53 days, 11 minutes, 47 seconds
    Last reboot reason 0x200:normal shutdown
    Load averages: 1 minute 5 minute 15 minute
    0.47 0.42 0.33

     

    The maximum is ~20-25 Mbit/s over IPsec, and it produces:

     

    Aug 23 22:03:46 srx210 PERF_MON: RTPERF_CPU_USAGE_OK: FPC 0 PIC 0 CPU utilization returns to normal, current value=20
    Aug 23 22:03:48 srx210 PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=100
    Aug 23 22:04:00 srx210 PERF_MON: RTPERF_CPU_USAGE_OK: FPC 0 PIC 0 CPU utilization returns to normal, current value=66
    Aug 23 22:04:02 srx210 PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=100
    Aug 23 22:04:31 srx210 PERF_MON: RTPERF_CPU_USAGE_OK: FPC 0 PIC 0 CPU utilization returns to normal, current value=6
    Aug 23 22:04:33 srx210 PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=100
    Aug 23 22:05:03 srx210 PERF_MON: RTPERF_CPU_USAGE_OK: FPC 0 PIC 0 CPU utilization returns to normal, current value=30
    Aug 23 22:05:05 srx210 PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=100
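
    To quantify how long the data plane sits above the threshold in this sample, the quoted lines can be paired up. This is a minimal sketch: the helper names are made up for illustration, the year 2014 is an assumption added only so the timestamps parse (syslog omits it), and an EXCEEDED event with no following USAGE_OK is not counted.

```python
from datetime import datetime

# The eight syslog lines quoted above, copied verbatim.
LOG = """\
Aug 23 22:03:46 srx210 PERF_MON: RTPERF_CPU_USAGE_OK: FPC 0 PIC 0 CPU utilization returns to normal, current value=20
Aug 23 22:03:48 srx210 PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=100
Aug 23 22:04:00 srx210 PERF_MON: RTPERF_CPU_USAGE_OK: FPC 0 PIC 0 CPU utilization returns to normal, current value=66
Aug 23 22:04:02 srx210 PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=100
Aug 23 22:04:31 srx210 PERF_MON: RTPERF_CPU_USAGE_OK: FPC 0 PIC 0 CPU utilization returns to normal, current value=6
Aug 23 22:04:33 srx210 PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=100
Aug 23 22:05:03 srx210 PERF_MON: RTPERF_CPU_USAGE_OK: FPC 0 PIC 0 CPU utilization returns to normal, current value=30
Aug 23 22:05:05 srx210 PERF_MON: RTPERF_CPU_THRESHOLD_EXCEEDED: FPC 0 PIC 0 CPU utilization exceeds threshold, current value=100
"""

def parse(line):
    """Return (timestamp, is_exceeded) for one PERF_MON line."""
    # The first 15 characters hold "Mon DD HH:MM:SS"; the year is assumed.
    stamp = datetime.strptime("2014 " + line[:15], "%Y %b %d %H:%M:%S")
    return stamp, "RTPERF_CPU_THRESHOLD_EXCEEDED" in line

def seconds_over_threshold(lines):
    """Sum the time between each EXCEEDED event and the next USAGE_OK event."""
    total, started = 0.0, None
    for stamp, exceeded in map(parse, lines):
        if exceeded and started is None:
            started = stamp
        elif not exceeded and started is not None:
            total += (stamp - started).total_seconds()
            started = None
    return total

events = LOG.splitlines()
span = (parse(events[-1])[0] - parse(events[0])[0]).total_seconds()
over = seconds_over_threshold(events)
print(f"{over:.0f} s of {span:.0f} s over threshold ({over / span:.0%})")
```

    For these eight lines the sketch reports 71 of 79 seconds above the threshold, with each return to normal lasting only about two seconds.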

     

    The traffic ends, and then I start traffic without IPsec between the same hosts: ~50 Mbit/s (as my WLAN can't handle more).

     

    Aug 23 22:05:33 srx210 PERF_MON: RTPERF_CPU_USAGE_OK: FPC 0 PIC 0 CPU utilization returns to normal, current value=27
    Aug 23 22:05:59 srx210 snmpd[1377]: SNMPD_HEALTH_MON_THRESH_CROSS: : Health Monitor: FWDD Real-Time threads total CPU Utilization crossed falling threshold 80 (value: 10), (variable: jnxFwddRtThreadsCPUUsage.0)

     

    I wonder, what can we do about it?

     

    Regards,



  • 24.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-24-2014 17:10

    The IPsec throughput you observed was about the same as what we saw (or better). Our problem is not just the CPU_THRESHOLD_EXCEEDED message; our problem is that after a while the SRX210 totally stops forwarding IPsec traffic. It is a very frustrating experience that so far even the highest-level JTAC engineers do not have the slightest clue what is going on. We don't even care about the 85 Mbps IPsec throughput figure Juniper put on the SRX210's spec sheet; we just need a stable IPsec tunnel that does not stop forwarding out of nowhere.

     

    This is driving me crazy. I would like to scrap all the SRXs (210, 220, 240) and move back to the SSG platform now, but I am also trying to persuade myself that the SSG platform is going to be EoL, that there must be something wrong with my configuration, and that JTAC can fix it. It has been over 10 days, hundreds of megs of debug data have been collected, and we are suffering from this problem every day.

     

    What would you guys do if you were in my shoes? Give up and move back to the SSG platform? Or have faith in JTAC and continue suffering until the problem is fixed?



  • 25.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 08-24-2014 18:45

    Hey,

     

    If you want I would be happy to SSH in and take a look myself. No idea why JTAC cannot solve the issue though.

     

    Email me at mdale@dalegroup.net if you're interested.

     

    Michael.



  • 26.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED
    Best Answer

    Posted 08-25-2014 20:37

    Problem resolved: we gave up and switched back to the SSG platform. We will think about a long-term strategy later, most probably an alternative platform; dynamic spoke-to-spoke communication has been a pain point for us so far.

     

    Thank you guys for your attention, I feel much relieved now.



  • 27.  RE: SSG5 vs SRX210H IPsec throughput performance, RTPERF_CPU_THRESHOLD_EXCEEDED

    Posted 09-22-2016 21:33

    Even now, in 2016, SRX devices suffer poor performance with IPsec VPN. I've had throughput issues on the SRX 220, 240, and 650. Even setting the recommended MTU and MSS sizes does not always fix the issue. Throughput in one direction is the expected 85% of line speed; in the opposite direction it is 10%.