SRX Services Gateway

Why traffic is very slow over ipsec

[ Edited ]
‎04-29-2019 12:25 AM

Hi all,

I am having a performance issue with IPsec traffic. I checked the onsite devices (SRX/EX switches) and they are fine. Can I please ask for any ideas about why traffic through the datacentre is very slow? Also, how can I properly verify whether resizing or fragmentation is happening between the two ends of the tunnel? I appreciate your help.

 

Briefly, the topology is:
ex2200---->SRXbranch1------>3rd party ISP(mpls)------->650SRX(high number of IPSec tunnels are being terminated here)----->core mpls network(all resources here).

 

SRXbranch>show configuration security | display set | match mss
set security flow tcp-mss all-tcp mss 1450


{master:0}
ex2200> traceroute y.y.y.y source Z.Z.Z.Z ------------>Z.Z.Z.Z is a WIFI l3 vlan interface that sits on ex2200, y.y.y.y is a remote tunnel on 650SRX.
traceroute to y.y.y.y (y.y.y.y) from Z.Z.Z.Z, 30 hops max, 40 byte packets
1 c.c.c.c (c.c.c.c) 3.888 ms 3.290 ms 3.860 ms
2 y.y.y.y(y.y.y.y) 25.513 ms 25.991 ms 37.813 ms
{master:0}
ex2200>

{master:0}
ex2200> ping y.y.y.y source Z.Z.Z.Z do-not-fragment size 1473
PING y.y.y.y (y.y.y.y): 1473 data bytes
ping: sendto: Message too long
ping: sendto: Message too long
ping: sendto: Message too long
....

......
--- y.y.y.y ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss

{master:0}
ex2200>


{master:0}
ex2200> ping y.y.y.y source Z.Z.Z.Z do-not-fragment size 1472
PING y.y.y.y (y.y.y.y): 1472 data bytes
1480 bytes from y.y.y.y: icmp_seq=1 ttl=63 time=30.230 ms
1480 bytes from y.y.y.y: icmp_seq=2 ttl=63 time=29.576 ms
1480 bytes from y.y.y.y: icmp_seq=3 ttl=63 time=28.392 ms
..
...
....
--- y.y.y.y ping statistics ---
100 packets transmitted, 83 packets received, 17% packet loss
round-trip min/avg/max/stddev = 28.392/41.628/82.196/15.626 ms
{master:0}
ex2200>


{master:0}
ex2200> ping y.y.y.y source z.z.z.z size 1400
PING y.y.y.y (y.y.y.y): 1400 data bytes
1408 bytes from y.y.y.y: icmp_seq=0 ttl=63 time=42.407 ms
1408 bytes from y.y.y.y: icmp_seq=1 ttl=63 time=44.052 ms
1408 bytes from y.y.y.y: icmp_seq=2 ttl=63 time=35.322 ms
1408 bytes from y.y.y.y: icmp_seq=3 ttl=63 time=29.590 ms
.
..
...
1408 bytes from y.y.y.y: icmp_seq=47 ttl=63 time=29.715 ms
^C
--- y.y.y.y ping statistics ---
48 packets transmitted, 46 packets received, 4% packet loss
round-trip min/avg/max/stddev = 29.106/41.144/73.689/13.243 ms

{master:0}

 

Erux..

17 REPLIES 17

Re: Why traffic is very slow over ipsec

[ Edited ]
‎04-29-2019 10:09 AM

I'd suggest setting the MSS for IPsec traffic to 1328 to account for the various sources of overhead. This makes TCP endpoints send smaller segments so that packets still fit after encryption, which should prevent fragmentation outside of the tunnel.

 

set security flow tcp-mss ipsec-vpn mss 1328

 

https://packetpushers.net/ipsec-bandwidth-overhead-using-aes/


Re: Why traffic is very slow over ipsec

‎04-30-2019 12:11 PM

Hello Arix,

 

Here is a breakdown of packet size in your network shown in the post.

 

Assuming your traffic is using TCP over IPv4:

 

TCP Header (20 bytes) + IP Header (20 bytes) + ESP Header (38 bytes) + External IPv4 header (20 bytes) + Ethernet Switching including VLAN (18 bytes) + MPLS header (4 bytes) =  120 bytes

 

In this case, an MTU of 1518 on SRX allows you to have 1398 bytes of payload.
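
The breakdown above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the byte counts are the figures quoted in this post (real ESP overhead varies with the cipher and padding), and the names `OVERHEAD_BYTES` and `max_tcp_payload` are made up for illustration.

```python
# Per-packet overhead figures quoted in the post above (assumed, not measured).
OVERHEAD_BYTES = {
    "TCP header": 20,
    "inner IPv4 header": 20,
    "ESP": 38,
    "outer IPv4 header": 20,
    "Ethernet incl. VLAN": 18,
    "MPLS label": 4,
}


def max_tcp_payload(mtu, overhead=OVERHEAD_BYTES):
    """Return the TCP payload left after subtracting every listed header."""
    return mtu - sum(overhead.values())


total_overhead = sum(OVERHEAD_BYTES.values())
payload = max_tcp_payload(1518)

print(f"total overhead: {total_overhead} bytes")
print(f"max TCP payload at MTU 1518: {payload} bytes")
```

Changing one entry in the dictionary (for example the ESP figure for a different cipher) shows how much room is left for the MSS.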

 

Note that the SRX MTU includes Ethernet switching header whereas other devices may only calculate it without Ethernet header and hence have a lower number.

 

I would suggest setting the MSS to around 1350 bytes.

 

If you simply want to see whether fragmentation is occurring, you can take a capture before the SRX and check whether any ESP packet has the "More Fragments" flag set.

 

Hopefully this helps!

 

Thanks!


Re: Why traffic is very slow over ipsec

‎05-06-2019 05:01 PM

Hi all,

Thanks for the replies. I have some questions relating to the same case:

  • Why do we need to capture the packets on the EX that is directly connected to the SRX? Why not on the SRX?
  • How can packets be easily captured on the EX or the SRX?
  • The 3rd-party ISP runs MPLS; how can we find out the ISP's MSS value?
  • After the ISP, how can we verify the packet size and MSS when the packet arrives at the other end, the SRX3400 in the datacentre (end-to-end MSS verification for IPsec traffic)?

Re: Why traffic is very slow over ipsec

‎05-08-2019 04:53 PM

Hi all,

Any chance someone could address the questions I posted previously?

 

I would really appreciate your ideas, techniques, and approaches.

Looking forward to your reply.

Thanks


Re: Why traffic is very slow over ipsec

‎05-10-2019 03:58 AM

Hello,

 

This is a very common issue we see with performance over IPSec VPN. I would therefore first try to set the tcp-mss value for VPN traffic as suggested by "CRM" earlier and check for any performance improvement.

 

set security flow tcp-mss ipsec-vpn mss 1328

 

Please ensure this is set on both sides of the VPN tunnel: at the branch and at the hub location.

 

Getting into packet captures can get messy and time-consuming. Fragmentation may not necessarily be happening on the firewall. Frag and de-frag anywhere along the path is a costly operation and can impact latency.

 

Regards,

 

Vikas


Re: Why traffic is very slow over ipsec

‎06-05-2019 07:48 PM

Hi all,

Just following up my previous post....

 

Digging further into the case, the "Packets dropped" and "Fragment packets" counters are increasing rapidly on the branch and hub SRX devices. After clearing the flow statistics, I collected the following output over a 10-minute window from one of the branches and from the hub SRX. More than 500 sites connect to the hub over IPsec VPN. Only the branches have been configured with mss 1450.

There are two concerning things in the output: one is the fragment packets and the other is the dropped packets. Are these two different issues or the same one? Their rate of increase is also nearly the same at the branch site. If a packet is fragmented, why does a drop happen? Or is it something different? How can I investigate these issues?

 

Before putting mss 1328 into the current configuration, I need some evidence from proper troubleshooting that shows fragmentation and drops are happening. Also, what is the impact of changing the MSS value, starting at 1328, during business hours?

Looking forward to your replies.

Note: I have received all your valuable ideas, techniques, and approaches previously, but this time I want to investigate more comprehensively.

   

Branch site:

>show security flow statistics
Current sessions: 877
Packets forwarded: 805455
Packets dropped: 18626
Fragment packets: 26961

 

set security flow tcp-mss all-tcp mss 1450

 

Hub site:

>show security flow statistics
Current sessions: 20662
Packets forwarded: 14079819
Packets dropped: 3851
Fragment packets: 258276
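
To compare the two sites, the counters above can be expressed as a share of forwarded packets. This is a rough sketch only: the numbers are copied from the two outputs, using "Packets forwarded" as the denominator is an assumption made for illustration, and the helper name `percent_of_forwarded` is hypothetical.

```python
# Counters copied from the two "show security flow statistics" outputs above.
SITES = {
    "branch": {"forwarded": 805455, "dropped": 18626, "fragments": 26961},
    "hub": {"forwarded": 14079819, "dropped": 3851, "fragments": 258276},
}


def percent_of_forwarded(count, forwarded):
    """Express a counter as a percentage of the forwarded-packet counter."""
    return 100.0 * count / forwarded


rates = {
    site: {
        "dropped_pct": percent_of_forwarded(c["dropped"], c["forwarded"]),
        "fragments_pct": percent_of_forwarded(c["fragments"], c["forwarded"]),
    }
    for site, c in SITES.items()
}

for site, r in rates.items():
    print(f"{site}: dropped {r['dropped_pct']:.2f}%, fragments {r['fragments_pct']:.2f}%")
```

The percentages by themselves do not show whether the drops and the fragments are related; they only make the two sites easier to compare.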

 

 

Thanks

Ar


Re: Why traffic is very slow over ipsec

‎06-06-2019 09:42 PM

Any reply to my previous post?


Re: Why traffic is very slow over ipsec

[ Edited ]
‎06-12-2019 12:19 AM

Arix, can you verify whether the "Replay errors" counter is incrementing by running the command "show security ipsec statistics" twice?


Re: Why traffic is very slow over ipsec

‎06-12-2019 07:36 PM

Hi all

 

I am not sure what the idea behind checking replay errors is, but I ran the command for you.

 

>show security ipsec statistics
ESP Statistics:
    Encrypted bytes: 258828544
    Decrypted bytes: 323126770
    Encrypted packets: 842164
    Decrypted packets: 800696
AH Statistics:
    Input bytes: 0
    Output bytes: 0
    Input packets: 0
    Output packets: 0
Errors:
AH authentication failures: 0, Replay errors: 0
ESP authentication failures: 0, ESP decryption failures: 0
Bad headers: 0, Bad trailers: 0

 

My aim here is to find a sign or evidence, from traceoptions or firewall filter logs, that fragmentation is happening.

I recently configured the following traceoptions on the SRX. I couldn't see any sign that fragmentation is happening; I only saw the "pre-frag not needed" lines shown below. Please see.

Why can't we see fragmentation happening to IPsec traffic, given that the only MSS configuration is "set security flow tcp-mss all-tcp mss 1450" and the fragment packets counter from "show security flow statistics" is huge and increasing rapidly?

 

>show security flow statistics
Current sessions: 225
Packets forwarded: 14444351807
Packets dropped: 162144762
Fragment packets: 864461746

 

1-) Is the SRX the correct place to capture packets with traceoptions, or must the capture be done on the EX switch?

2-) If we can't see fragmentation in the traceoptions files from the flow module on the SRX, is the fragmentation happening before packets reach the flow module? If so, where is it happening? On the physical interface? Which tool should be used: traceoptions or a firewall filter?

..

....

Jun 12 14:10:11 14:10:11.054473:CID-0:RT:pre-frag not needed: ipsize: 783, mtu: 9188, nsp2->pmtu: 9188

Jun 12 14:10:11 14:10:11.085986:CID-0:RT:pre-frag not needed: ipsize: 844, mtu: 1422, nsp2->pmtu: 1422

....

........

 

set security flow traceoptions file Fregmentation_Check files 3 size 5m world-readable
set security flow traceoptions flag basic-datapath
set security flow traceoptions packet-filter packet-filter1 source-prefix 10.108.103.246
set security flow traceoptions packet-filter packet-filter2 destination-prefix 10.108.103.246

Note: all traffic is routed to the IP address 10.108.103.246 on the SRX before it goes into the IPsec tunnel.

Thx.

Ar


Re: Why traffic is very slow over ipsec

‎06-13-2019 11:29 PM

Hi Arix,

 

I can see that you have the following command: set security flow tcp-mss all-tcp mss 1450

 

Note that if you care about the traffic passing over the VPN, the option you need is "set security flow tcp-mss ipsec-vpn mss [value]". This way you only affect VPN traffic, which is the traffic carrying extra overhead due to the ESP and new IP headers being added.

 

 

               If a TCP packet enters an IPsec VPN tunnel, the ipsec-vpn mss value takes priority over the all-tcp mss value, hence the ipsec-vpn mss value is applied.

              Ref: https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/security-...

 

Note the caveat of this command: it only affects traffic (TCP SYN messages) entering a VPN, not traffic arriving via a VPN. This is very important because, to affect traffic in both directions, we need to set the command on both ends of the tunnel.

 

Another consideration is that the regular MSS value for a TCP segment is 1460 bytes, so a value of 1450 does not change the final packet size much. Please review epaniagua's explanation of MSS in the following forum thread:

 

           https://forums.juniper.net/t5/SRX-Services-Gateway/Site-to-Site-VPN-TCP-MMS-Issue/td-p/444842

 

1460B of data + 20B of TCP header + 20B of IP header = an IP packet of 1500 bytes (the common MTU value, from which the common 1460B MSS is derived)

 

As stated in that thread, it's a best practice to use an MSS of 1350 for VPN tunnels on both ends. Maybe you can try it and let us know the results.
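
To see why 1450 changes little while 1350 or 1328 leaves headroom, here is a small Python sketch. It assumes the 38-byte ESP overhead and 20-byte outer IP header quoted earlier in this thread and a 1500-byte protocol MTU on the external interface; the real ESP overhead depends on the cipher and padding, and the function names are hypothetical.

```python
TCP_HEADER = 20    # bytes, no TCP options
IPV4_HEADER = 20   # bytes, no IP options
ESP_OVERHEAD = 38  # assumed: figure quoted earlier in this thread
OUTER_IPV4 = 20    # new IP header added by tunnel mode
PATH_MTU = 1500    # assumed protocol MTU of the external interface


def encrypted_size(mss):
    """Size of the outer IP packet carrying one full-sized TCP segment."""
    inner_packet = mss + TCP_HEADER + IPV4_HEADER
    return inner_packet + ESP_OVERHEAD + OUTER_IPV4


def fits(mss, mtu=PATH_MTU):
    """True if a full-sized segment still fits in the MTU after encryption."""
    return encrypted_size(mss) <= mtu


for mss in (1460, 1450, 1350, 1328):
    verdict = "fits" if fits(mss) else "exceeds MTU, fragmentation likely"
    print(f"MSS {mss}: encrypted packet {encrypted_size(mss)} bytes, {verdict}")
```

Under these assumptions both 1460 and 1450 produce encrypted packets larger than 1500 bytes, while 1350 and 1328 stay below it.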

 

Please mark this comment as the Solution if applicable

Re: Why traffic is very slow over ipsec

[ Edited ]
‎06-13-2019 11:37 PM

Regarding the ping test#1:

 

ex2200> ping y.y.y.y source Z.Z.Z.Z do-not-fragment size 1473
PING y.y.y.y (y.y.y.y): 1473 data bytes
ping: sendto: Message too long
ping: sendto: Message too long
ping: sendto: Message too long

 

This packet won't leave your EX device. By default Juniper uses an MTU of 1500 on its logical interfaces (units), meaning that the maximum size of a packet the interface can send is 1500 bytes. The packet you are trying to send has this size:

1473B of data + 8B of ICMP header + 20B of IP header = 1501B, hence exceeding the MTU of the sending interface and getting dropped because the DF bit is set.
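
The same sum as a quick Python check. This is a sketch only: it assumes an IPv4 header without options, and the helper names are made up for illustration.

```python
ICMP_HEADER = 8   # bytes, ICMP echo header
IPV4_HEADER = 20  # bytes, assuming no IP options


def ping_packet_size(data_bytes):
    """IP packet size produced by a ping with the given payload size."""
    return data_bytes + ICMP_HEADER + IPV4_HEADER


def largest_df_ping(protocol_mtu=1500):
    """Largest ping payload that fits in the MTU without fragmentation."""
    return protocol_mtu - IPV4_HEADER - ICMP_HEADER


for size in (1472, 1473):
    print(f"size {size}: IP packet of {ping_packet_size(size)} bytes")
print(f"largest payload with DF set at MTU 1500: {largest_df_ping()} bytes")
```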

 

However, ping test #2 gives us more valuable information:

 

{master:0}
ex2200> ping y.y.y.y source Z.Z.Z.Z do-not-fragment size 1472
PING y.y.y.y (y.y.y.y): 1472 data bytes
1480 bytes from y.y.y.y: icmp_seq=1 ttl=63 time=30.230 ms
1480 bytes from y.y.y.y: icmp_seq=2 ttl=63 time=29.576 ms
1480 bytes from y.y.y.y: icmp_seq=3 ttl=63 time=28.392 ms
..
...
....
--- y.y.y.y ping statistics ---
100 packets transmitted, 83 packets received, 17% packet loss
round-trip min/avg/max/stddev = 28.392/41.628/82.196/15.626 ms
{master:0}

 

The first thing to note is the 17% packet loss, which I believe could be the source of your slowness. The amount of data being sent (1472B) results in a final IP packet size of 1500B, which is the standard MTU size (no problem), but after encryption its size will increase. This increase in packet size can cause fragmentation on later hops/routers carrying those encrypted packets to the remote end of the VPN, where they will be decrypted and the packets will recover their normal size of 1500B. If you take a packet capture on the remote SRX's external interface, you can confirm whether you are receiving fragmented ESP packets.

 

Besides the fact that processing fragments is CPU intensive, if a single fragment is lost the whole original packet has to be retransmitted, generating even more fragmentation. This is why it is a best practice to avoid it.

 

Please mark this comment as the Solution if applicable

Re: Why traffic is very slow over ipsec

‎06-14-2019 12:06 AM

I also would like to clarify some points:

 

1) "Note that the SRX MTU includes Ethernet switching header whereas other devices may only calculate it without Ethernet header and hence have a lower number."

 

This statement is not entirely accurate, and I would like to avoid any confusion. Juniper handles two types of MTU values:

 

Protocol MTU (layer 3): this is the maximum size of an IP packet that can be sent/received on a logical interface/unit. Default value: 1500Bytes
Interface MTU (layer 2): this is the maximum size of an Ethernet frame that can be sent/received on a physical interface. Default value: 1514Bytes (IP packet + 14bytes of Ethernet header):

 

user@host> show interfaces fe-0/2/0 extensive
Physical interface: fe-0/2/0, Enabled, Physical link is Up
Interface index: 129, SNMP ifIndex: 23, Generation: 130
Link-level type: Ethernet, MTU: 1514, Speed: 100mbps, Loopback: Disabled,
Source filtering: Disabled, Flow control: Enabled
.
.
.
Logical interface fe-0/2/0.0 (Index 66) (SNMP ifIndex 46) (Generation 133)
Flags: SNMP-Traps Encapsulation: ENET2
Protocol inet, MTU: 1500, Generation: 142, Route table: 0
Flags: DCU, SCU-out


Fragmentation happens at Layer 3; the IP header is the header that carries the fields used for fragmentation. Because of this we care about the MTU at Layer 3: 1500B by default. We need to make sure that packets won't exceed 1500B in size, otherwise the sending interface will fragment them.
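
The relationship between the two MTU values can be written as a short Python sketch. It assumes an untagged Ethernet header of 14 bytes, as in the output above; the function names are hypothetical.

```python
ETHERNET_HEADER = 14  # destination MAC + source MAC + EtherType, untagged


def protocol_mtu(interface_mtu):
    """Layer 3 (protocol) MTU derived from the Layer 2 (interface) MTU."""
    return interface_mtu - ETHERNET_HEADER


def needs_fragmentation(ip_packet_size, interface_mtu=1514):
    """True if an IP packet is larger than the protocol MTU of the interface."""
    return ip_packet_size > protocol_mtu(interface_mtu)


print(protocol_mtu(1514))
print(needs_fragmentation(1500))
print(needs_fragmentation(1501))
```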

 

Regarding your questions:

 

+Why do we need to capture the packets on the EX that is directly connected to the SRX? Why not on the SRX?
R/ This is not needed; as stated, the pcap is needed on the remote SRX to determine whether we are receiving fragmented ESP packets.

 

+How can packets be easily captured on the EX or the SRX?
R/ Not needed.

 

+The 3rd-party ISP runs MPLS; how can we find out the ISP's MSS value?
R/ MSS is a TCP concept (the amount of data that can be carried in a TCP segment). Before the data reaches the MPLS cloud it has to be encapsulated in TCP, then IP, then ESP, then IP again. MSS is relevant on the sending host side, where we need to lower it if we want smaller packets to reach the MPLS cloud, where they will be encapsulated in MPLS and hence end up bigger.


+After the ISP, how can we verify the packet size and MSS when the packet arrives at the other end, the SRX3400 in the datacentre (end-to-end MSS verification for IPsec traffic)?
R/ You can take a pcap on the external interface of SRXbranch1, and there we will be able to see the size of the packet. Then we can add the 4 bytes of the MPLS header added by your ISP. I'm not sure what you meant by SRX3400; I thought the remote SRX was an SRX650 as per the topology.

 

Please mark this comment as the Solution if applicable

Re: Why traffic is very slow over ipsec

[ Edited ]
‎06-14-2019 12:43 AM

Regarding:

 

Branch site:

>show security flow statistics
Current sessions: 877
Packets forwarded: 805455
Packets dropped: 18626
Fragment packets: 26961

 

The fragments counter only means that IP fragments were received at the flow module; it doesn't necessarily mean that those fragments came over the tunnel.

 

In junos 15.1X49-90 two new fields were added to that output: Pre fragments and Post fragments. This is because when the SRX is about to send packets over a VPN it can either fragment the packets prior to encapsulating them (because of a low MTU value on the st0 interface) or fragment them after encrypting them (based on the physical interface MTU). Please check motd's reply on the following post:

 

https://forums.juniper.net/t5/SRX-Services-Gateway/VPN-Fragmentation/td-p/87642

 

Please share the MTU configured on your st0 interfaces. If it's something like ~9000, then it is more likely that the packets are fragmented after being encrypted, in which case the pcap for confirming fragmented ESP packets will help you. In fact, the traces tell you that pre-fragmentation wasn't needed:

 

Jun 12 14:10:11 14:10:11.054473:CID-0:RT:pre-frag not needed: ipsize: 783, mtu: 9188, nsp2->pmtu: 9188

Jun 12 14:10:11 14:10:11.085986:CID-0:RT:pre-frag not needed: ipsize: 844, mtu: 1422, nsp2->pmtu: 1422

 

Again, the fragmentation can be happening after the packets are encrypted. If you lower the MSS to 1350, smaller packets are encrypted, hence the encrypted packets are smaller and there is less chance that they will be fragmented.
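
If you have many of these trace lines, a short script can pull out the sizes. The sketch below is written against the two lines quoted above only: the regular expression, the function name `parse_pre_frag`, and the "headroom" value (mtu minus ipsize) are assumptions made for illustration, not a documented trace format.

```python
import re

# The two trace lines quoted above.
TRACE_LINES = [
    "Jun 12 14:10:11 14:10:11.054473:CID-0:RT:pre-frag not needed: "
    "ipsize: 783, mtu: 9188, nsp2->pmtu: 9188",
    "Jun 12 14:10:11 14:10:11.085986:CID-0:RT:pre-frag not needed: "
    "ipsize: 844, mtu: 1422, nsp2->pmtu: 1422",
]

# Pattern inferred from those two lines only (assumption).
PRE_FRAG = re.compile(
    r"pre-frag not needed: ipsize: (\d+), mtu: (\d+), nsp2->pmtu: (\d+)"
)


def parse_pre_frag(line):
    """Return the sizes from a 'pre-frag not needed' line, or None."""
    match = PRE_FRAG.search(line)
    if match is None:
        return None
    ipsize, mtu, pmtu = (int(value) for value in match.groups())
    return {"ipsize": ipsize, "mtu": mtu, "pmtu": pmtu, "headroom": mtu - ipsize}


parsed = [entry for entry in map(parse_pre_frag, TRACE_LINES) if entry is not None]

for entry in parsed:
    print(f"ipsize {entry['ipsize']} vs mtu {entry['mtu']}: "
          f"{entry['headroom']} bytes of headroom")
```

Both sample packets are well below the MTU reported in the trace, which is consistent with pre-fragmentation not being needed for them.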

 

+And what is the impact of changing the MSS value, starting at 1328, during business hours?

R/ MSS is set during the TCP 3-way handshake, hence the change will only affect new TCP connections established over the tunnel.

 

I really hope this information helps you.

 

Please mark this comment as the Solution if applicable

Re: Why traffic is very slow over ipsec

‎06-18-2019 03:46 AM

Hi stwardlp,

Thanks for your replies. I have read your posts. I will review them again and get back to you. There are some interesting tips you pointed out. I need some time to work through them.

Much appreciated....

 

Thanks


Re: Why traffic is very slow over ipsec

‎06-23-2019 09:09 AM

Hi Arix,

 

Here are two interesting documents; you might want to look at them as well for the DF bit and fragmentation issues with traffic over VPN.

 

https://rtodto.net/ipsec-tcp-mss-df-bit-and-fragmentation-in-srx/

https://kb.juniper.net/InfoCenter/index?page=content&id=KB25625&cat=OBSOLETE&actp=LIST

 

Thanks

Mahesh


Re: Why traffic is very slow over ipsec

‎07-05-2019 05:21 AM

hi all,

 

Can anyone explain what the final packet size will be when sending ICMP via ping through an IPsec tunnel?

The only related statement configured is: set security flow tcp-mss all-tcp mss 1460

The st0 interface sits on the external physical interface (VDSL), whose protocol MTU is 1500.

 

srx345>ping 10.10.10.10 source 20.20.20.1 ---->When pinging, what is the final packet size?

 

1460B + 20B internal IP + 20B internal TCP header + 38B ESP + 20B external IP + 20B ICMP IP + 8B ICMP header? Is this correct?

 

 


Re: Why traffic is very slow over ipsec

‎07-06-2019 08:17 PM

Any idea about my previous post?