SRX Services Gateway

SRX550 High Memory strange issue

‎01-18-2019 01:22 AM

Hi All!

I have a strange issue with an SRX550 High Memory.

It is connected to 2 ISPs via BGP (full view, filtered to /24).

Some time after the BGP sessions come up, the log shows:

Jan 17 22:53:06 srx550-1 fto_new: failed to allocate fto
Jan 17 22:53:06 srx550-1 RT: IPv4:0 - 205.253/16 (RT: Failed to allocate object for flow)
Jan 17 22:53:06 srx550-1 RT-HAL,rt_entry_add_msg_proc,3405: rt_halp_vectors->rt_create failed
Jan 17 22:53:06 srx550-1 RT-HAL,rt_entry_add_msg_proc,3466: proto ipv4,len 16 prefix 205.253/16 nh 1342
Jan 17 22:53:06 srx550-1 RT-HAL,rt_msg_handler,688: route process failed
Jan 17 22:53:06 srx550-1 fto_new: failed to allocate fto
Jan 17 22:53:06 srx550-1 RT: IPv4:0 - 49.40.33/24 (RT: Failed to allocate object for flow)
Jan 17 22:53:06 srx550-1 RT-HAL,rt_entry_add_msg_proc,3405: rt_halp_vectors->rt_create failed
Jan 17 22:53:06 srx550-1 RT-HAL,rt_entry_add_msg_proc,3466: proto ipv4,len 24 prefix 49.40.33/24 nh 1342
Jan 17 22:53:06 srx550-1 RT-HAL,rt_msg_handler,688: route process failed
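To see which prefixes are affected and how often, the log can be summarised with a few lines of Python. This is only a rough sketch: the regular expression is modelled on the messages pasted above, and the function name `failed_prefixes` is made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the "Failed to allocate object for flow" lines above.
FAIL_RE = re.compile(
    r"RT: IPv4:\d+ - (\S+) \(RT: Failed to allocate object for flow\)"
)

def failed_prefixes(log_text):
    """Count allocation-failure messages per prefix (hypothetical helper)."""
    return Counter(m.group(1) for m in FAIL_RE.finditer(log_text))

log = """\
Jan 17 22:53:06 srx550-1 fto_new: failed to allocate fto
Jan 17 22:53:06 srx550-1 RT: IPv4:0 - 205.253/16 (RT: Failed to allocate object for flow)
Jan 17 22:53:06 srx550-1 RT: IPv4:0 - 49.40.33/24 (RT: Failed to allocate object for flow)
"""

for prefix, count in failed_prefixes(log).items():
    print(prefix, count)
```

Running it over the full messages file would show whether the failures are spread across the whole table or concentrated on a few prefixes.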

 

The commands "show chassis routing-engine" and "show security flow" do not return any suspicious info.

I tried filtering routes to /23; still the same.

Moreover, another SRX of the same model with the same config works like a charm.

I assumed that something was wrong with the DRAM, so I took the DRAM modules from the "working" SRX and put them into the "bad" one, but it did not help.

I would really appreciate any help, thank you!

 


Re: SRX550 High Memory strange issue

‎01-18-2019 02:49 AM

The memory issue may be because you are exceeding the maximum BGP route table size on the SRX550, which is only 712k routes. With dual full tables you will have a very large RIB and FIB that is likely over this limit.

 

https://www.juniper.net/us/en/local/pdf/datasheets/1000281-en.pdf

 

Steve Puluka BSEET - Juniper Ambassador
IP Architect - DQE Communications Pittsburgh, PA (Metro Ethernet & ISP)
http://puluka.com/home

Re: SRX550 High Memory strange issue

‎01-18-2019 02:57 AM

Thx, but that is not the reason, as:

1. another device of the same model works with the same configuration

2. reducing the maximum prefix length to /23 (that's less than 712k routes) makes no difference


Re: SRX550 High Memory strange issue

‎01-18-2019 11:34 PM

Hello Dmytro,

 

Can you share the "show route summary" output from both SRXs?

 

Pura Vida from Costa Rica - Mark as Resolved if it applies.
Kudos are appreciated too!

Re: SRX550 High Memory strange issue

‎01-19-2019 06:03 AM

Hi there!

This is what it looks like on the "good" working SRX:

inet.0: 739887 destinations, 1463164 routes (735990 active, 0 holddown, 5753 hidden)
Direct: 4 routes, 4 active
Local: 4 routes, 4 active
BGP: 1463152 routes, 735978 active
Static: 3 routes, 3 active
Aggregate: 1 routes, 1 active
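To compare these numbers with the 712k figure quoted from the datasheet, the counters can be pulled out of the inet.0 line. A minimal Python sketch follows; the function name and the choice to check both the total and the active count are my own assumptions, since the thread has not yet settled which of the two the limit applies to.

```python
import re

# Datasheet figure quoted earlier in this thread (712k BGP routes).
DATASHEET_LIMIT = 712_000

# Modelled on: "inet.0: N destinations, N routes (N active, N holddown, N hidden)"
SUMMARY_RE = re.compile(
    r"^(?P<table>\S+): (?P<dest>\d+) destinations, (?P<routes>\d+) routes "
    r"\((?P<active>\d+) active, (?P<holddown>\d+) holddown, (?P<hidden>\d+) hidden\)"
)

def parse_route_summary(line):
    """Return the counters from one table line of 'show route summary'."""
    m = SUMMARY_RE.match(line.strip())
    if m is None:
        raise ValueError("not a route summary table line: %r" % line)
    out = {"table": m.group("table")}
    for key in ("dest", "routes", "active", "holddown", "hidden"):
        out[key] = int(m.group(key))
    return out

line = ("inet.0: 739887 destinations, 1463164 routes "
        "(735990 active, 0 holddown, 5753 hidden)")
stats = parse_route_summary(line)
for key in ("routes", "active"):
    state = "over" if stats[key] > DATASHEET_LIMIT else "within"
    print(f"{key}: {stats[key]} ({state} the 712k figure)")
```

On the output above, both the total route count and the active count come out above 712,000.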

 

Unfortunately, I can't provide the same info for the "bad" one, as it is not connected to any upstream right now.

But even when I cut the number of routes roughly in half (to something like 300-400K), I still see the following messages in the log:

 /kernel: RT_PFE: RT msg op 2 (PREFIX DELETE) failed, err 5 (Invalid)
 RT-HAL,rt_msg_handler,673: route check failed 22
last message repeated 1257 times
 /kernel: RT_PFE: RT msg op 2 (PREFIX DELETE) failed, err 5 (Invalid)
/kernel: last message repeated 1255 times
 /kernel: RT_PFE: RT msg op 2 (PREFIX DELETE) failed, err 5 (Invalid)
 RT-HAL,rt_msg_handler,673: route check failed 22
 last message repeated 1167 times
 /kernel: RT_PFE: RT msg op 2 (PREFIX DELETE) failed, err 5 (Invalid)
 /kernel: last message repeated 1181 times
 /kernel: RT_PFE: RT msg op 2 (PREFIX DELETE) failed, err 5 (Invalid)

and I don't see messages like (RT: Failed to allocate object for flow).

 


Re: SRX550 High Memory strange issue

‎01-19-2019 07:50 AM

I think the first step will be to identify the process(es) responsible.

Confirm the overall memory usage status during the issue.

show chassis routing-engine

 

Look for the processes associated with high memory usage.

show system processes extensive

 

If this is not conclusive you can drop to the shell and use top interactively

start shell

top -H

 

Once you know the responsible daemons we can figure out what functions are responsible.

 

Steve Puluka BSEET - Juniper Ambassador
IP Architect - DQE Communications Pittsburgh, PA (Metro Ethernet & ISP)
http://puluka.com/home

Re: SRX550 High Memory strange issue

‎01-19-2019 07:28 PM

Hello Dmytro,

 

Yes, definitely looks like a capacity issue. Have you tried bringing up one ISP at a time?

 

Can you get the following command outputs whenever you get a chance to connect the device back in the problem state?

> request pfe execute target fwdd command "show usp fto stats"

> request pfe execute target fwdd command "show route summary"

> show route summary

> show system memory

> show version

> show chassis hardware

 

Regards,

 

Vikas

JTAC - CFTS


Re: SRX550 High Memory strange issue

‎01-20-2019 06:16 AM

Hello there,

Do you have NetFlow sampling configured on the SRX550 that has the memory allocation failures?

HTH

Thx
Alex


Re: SRX550 High Memory strange issue

‎01-21-2019 01:12 PM

Hi All!

Sorry for the delay, but I can only perform testing late in the evening.

So, regarding netflow the answer is no.

Regarding all debugging info - attached.

24.txt - 2 ISPs connected, accepting routes up to /24

Then I changed the route filter to /23:

23.txt - 2 ISPs connected, accepting routes up to /23

Then I disconnected 1 ISP:

1isp23.txt - 1 ISP connected, accepting routes up to /23

 

show version

Model: srx550m
Junos: 15.1X49-D110.4
JUNOS Software Release [15.1X49-D110.4]


show chassis hardware
Hardware inventory:
Item             Version  Part number  Serial number     Description
Chassis                                xxxxxxxxxxxxx     SRX550M
Midplane         REV 10   750-063950   ACPW0869
Routing Engine   REV 06   711-062269   ACPT0458          RE-SRX550M
FPC 0                                                    FPC
  PIC 0                                                  6x GE, 4x GE SFP Base PIC
Power Supply 0   Rev 04   740-024283   BF95913           PS 645W AC

 

With 1 ISP connected I don't see the flow errors, but I still see these errors:

 /kernel: RT_PFE: RT msg op 2 (PREFIX DELETE) failed, err 5 (Invalid)
 /kernel: last message repeated 4 times
 /kernel: RT_PFE: RT msg op 2 (PREFIX DELETE) failed, err 5 (Invalid)
 RT-HAL,rt_msg_handler,673: route check failed 22



Re: SRX550 High Memory strange issue

‎01-21-2019 07:00 PM

Hello,

 

Thanks for sharing the outputs.

> There are FTO allocation failures, and I see that they have increased.

FTO_NEW_FAIL 190943 191045

FTO_ROUTE_UPDATE_CB_MEM 190943 191045

 

> In the case with 2 ISPs and the /24 filter, the active routes are over 712K, which is more than what is mentioned in the datasheet.

> In all three cases, it is a bit strange that there is a big difference between the two "show route summary" outputs, the one from the RE and the one from the fwdd.

> It's possible that these messages are normal on the SRX550, considering the size of the Internet routing table.

> Perhaps you are not seeing the issue on the other SRX simply because of the logging level. Is the logging level the same? (show configuration system syslog)

> I would also compare the "show usp fto stats" output on the working SRX to see if there is a similar number of failures.

 

Regards,

 

Vikas


Re: SRX550 High Memory strange issue

‎01-22-2019 11:18 AM

Hi!

Thx for your reply.

The log level is absolutely the same; moreover, they have identical configurations.

On the working SRX, "show usp fto stats" returns:

================ master ================
SENT: Ukern command: show usp fto stats

Tesla/USP Flow Tracking Object Statistics; currently 736246 FTOs
FTO_ALLOCATES 1595556
FTO_DESTROYS 859310
FTO_FLOW_ADDS 32912650
FTO_FLOW_ADD_OVERWRITE_FTO 0
FTO_FLOW_ADD_FREEING_FTO 0
FTO_FLOW_DELS 32909787
FTO_FLOW_DEL_ALREADY 0
FTO_ROUTE_UPDATES 2379477
FTO_ROUTE_OIF_NOT_CHANGED 0
FTO_ROUTE_DELETES 859310
FTO_ROUTE_DELETE_NULL 0
FTO_NH_WORD_GETS 0
FTO_MORE_SPEC_ROUTE 1595549
FTO_ROUTE_RESET_CB_FTO_ALLOC 0
FTO_ROUTE_UPDATE_CB_FTO_ALLOC 1595556

FTO errors:
FTO_GET_1ST_BAD_PARAM 0
FTO_NEW_FAIL 0
FTO_DESTROY_FAIL 0
FTO_ROUTE_UPDATE_BAD_PARAM 0
FTO_ROUTE_UPDATE_NO_NH 0
FTO_FLOW_ADD_BAD_PARAM 0
FTO_FLOW_DEL_BAD_PARAM 0
FTO_ROUTE_RESET_CB_MEM 0
FTO_ROUTE_UPDATE_CB_MEM 0

Route lookup statistics:
Route table read lock not acquired 1508
rtt_spinlock acquired 2454873

L2FTO:
FTO_MAC_RESET_CB_FTO_ALLOC 0
FTO_MAC_UPDATE_CB_FTO_ALLOC 0
FTO_MAC_UPDATE_BAD_PARAM 0
FTO_MAC_UPDATE_NO_NH 0
FTO_MAC_RESET_CB_MEM 0
FTO_MAC_UPDATE_CB_MEM 0

MCAST FTO:
FTO_MCAST_ROUTE_CHANGE 0
FTO_PIM_ROUTE_CHANGE 0
FTO_MCAST_ROUTE_NOT_READY 0
FTO_ROUTE_UPDATE_MCAST 0
FTO_ROUTE_UPDATE_MCAST_RESET_PARENT 0
FTO_ROUTE_UPDATE_MCAST_CHANGED 0
FTO_ROUTE_UPDATE_MCAST_NOT_CHANGED 0
FTO_ROUTE_UPDATE_MCAST_NO_IFL 14
FTO_ROUTE_MCAST_FANOUT_INIT 0
FTO_ROUTE_MCAST_LIST_ALLOC_FAIL 0
FTO_ROUTE_MCAST_NO_IFL 0
FTO_ROUTE_MCAST_FANOUT_NEW_FTO 14
FTO_ROUTE_MCAST_FANOUT_OLD_FTO 0
FTO_ROUTE_MCAST_FANOUT_REINIT 0
FTO_ROUTE_MCAST_FTRPC_FAIL 0
FTO_ROUTE_MCAST_SKIP_NOROUTE_SESSION 0

As you can see, all FTO error counters are zero.
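For comparing this output between the two boxes, here is a small Python sketch that separates the "FTO errors:" section from the other counters and lists any non-zero error counter. The parser is only modelled on the layout pasted above (the function name is hypothetical), and the relation "allocates minus destroys equals the current FTO count" is just something that happens to hold for these numbers, not documented behaviour as far as I know.

```python
import re

def parse_fto_stats(text):
    """Return (current_ftos, counters, errors) from 'show usp fto stats' text."""
    current = None
    counters, errors = {}, {}
    in_errors = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            in_errors = False          # a blank line ends the error section
            continue
        if line == "FTO errors:":
            in_errors = True
            continue
        m = re.search(r"currently (\d+) FTOs", line)
        if m:
            current = int(m.group(1))
            continue
        m = re.fullmatch(r"([A-Z0-9_]+)\s+(\d+)", line)
        if m:
            target = errors if in_errors else counters
            target[m.group(1)] = int(m.group(2))
    return current, counters, errors

# Shortened copy of the output above.
sample = """\
Tesla/USP Flow Tracking Object Statistics; currently 736246 FTOs
FTO_ALLOCATES 1595556
FTO_DESTROYS 859310

FTO errors:
FTO_NEW_FAIL 0
FTO_ROUTE_UPDATE_CB_MEM 0

Route lookup statistics:
rtt_spinlock acquired 2454873
"""

current, counters, errors = parse_fto_stats(sample)
live = counters["FTO_ALLOCATES"] - counters["FTO_DESTROYS"]
nonzero = {name: value for name, value in errors.items() if value}
print("allocates - destroys:", live, "| reported FTOs:", current)
print("non-zero error counters:", nonzero)
```

Feeding it the output from the non-working SRX should make FTO_NEW_FAIL and FTO_ROUTE_UPDATE_CB_MEM show up in the last line.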

 


Re: SRX550 High Memory strange issue

‎01-22-2019 11:03 PM

Hi,

 

Can you get the following command outputs from both the working and the non-working SRX?

 

> request pfe execute target fwdd command "show arena"

> request pfe execute target fwdd command "show heap"

> request pfe execute target fwdd command "show services mum"

> request pfe execute target fwdd command "show memory"

 


Re: SRX550 High Memory strange issue

‎01-23-2019 12:17 PM

Hi!

Please find the requested info attached.



Re: SRX550 High Memory strange issue

‎01-23-2019 12:47 PM

Dmytro,

 

Thanks for the info. Could you reboot the non-working SRX and gather the output of request pfe execute target fwdd command "show services mum" right after bootup and again a few minutes later? I am trying to confirm whether there is a memory problem:

 

Faulty SRX:

 

request pfe execute target fwdd command "show services mum"
================ master ================
SENT: Ukern command: show services mum

Memory usage manager:  gsm
Total free space to start with: 127898960
Active customers: 1
Max customers: 12
Yellow zone limit: 31974740
Orange zone limit: 23021812
Red zone limit: 11510906
Operational zone:      Red

cust_id      in use       limit
-------------------------------
      0    44153889    63949480
      1         512    63949480

-------------------------------

actual free space =      95048
est. free space =        95016

 

Working SRX:

 

request pfe execute target fwdd command "show services mum"
================ master ================
SENT: Ukern command: show services mum

Memory usage manager:  gsm
Total free space to start with: 127898960
Active customers: 1
Max customers: 12
Yellow zone limit: 31974740
Orange zone limit: 23021812
Red zone limit: 11510906
Operational zone:      Red

cust_id      in use       limit
-------------------------------
      0    58731113    63949480
      1         512    63949480

-------------------------------

actual free space =    5695552
est. free space =      5695552
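To put the two "actual free space" values next to the zone limits, here is a rough Python sketch. Please treat the rule itself as an ASSUMPTION: I am guessing that the operational zone is picked by comparing the free space with each limit in turn, which matches both outputs above but is not taken from any Juniper documentation.

```python
# Limits copied from the "show services mum" outputs above (bytes).
YELLOW_LIMIT = 31974740
ORANGE_LIMIT = 23021812
RED_LIMIT = 11510906

def guess_zone(free_bytes):
    """ASSUMPTION: zone is chosen by comparing free space with each limit."""
    if free_bytes < RED_LIMIT:
        return "Red"
    if free_bytes < ORANGE_LIMIT:
        return "Orange"
    if free_bytes < YELLOW_LIMIT:
        return "Yellow"
    return "Green"

samples = {"faulty SRX": 95048, "working SRX": 5695552}
for name, free in samples.items():
    print(f"{name}: {free} bytes free -> {guess_zone(free)}")
```

Under that assumption both boxes land in Red, as reported, but the faulty one has almost nothing left while the working one still has about 5.7 MB of headroom.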

 


Re: SRX550 High Memory strange issue

‎01-24-2019 05:19 AM

Hi!

After being idle for about 15 hours it shows:

 

actual free space =   31977208
est. free space =     31977208

and just after reboot:

================ master ================
SENT: Ukern command: show services mum

Memory usage manager:  gsm
Total free space to start with: 127898960
Active customers: 1
Max customers: 12
Yellow zone limit: 31974740
Orange zone limit: 23021812
Red zone limit: 11510906
Operational zone:      Green

cust_id      in use       limit
-------------------------------
      0     2771085    63949480
      1         512    63949480

-------------------------------

actual free space =   43659768
est. free space =     43659768

Proactive reclaim              : ENABLED
Seconds between reclaims       : 4
Number of proactive reclaims   : 0

 


Re: SRX550 High Memory strange issue

‎01-24-2019 12:51 PM

Dmytro,

 

Can you gather the following information on the non-working SRX? Make sure that the problem is happening and that the BGP routes shown in "show route summary" number fewer than 712k (anything higher than that is not recommended for the SRX550, as shown in the datasheet. I know the working SRX has twice that number of BGP routes, but that doesn't mean it is a recommended implementation):

 

> show log messages (to confirm the issue)
> show route summary
> show chassis routing-engine
> show security monitoring fpc 0
> request pfe execute target fwdd command "show services mum"
> request pfe execute target fwdd command "show arena"
> request pfe execute target fwdd command "show heap"
> request pfe execute target fwdd command "show mbuf host"
> request pfe execute target fwdd command "show services objcache"
> request pfe execute target fwdd command "show usp fto stats"

 

If possible, gather the same commands on the working SRX for comparison purposes.

If you leave the non-working SRX running for around 1 day (connected to both ISPs), does it become stable?

 


Re: SRX550 High Memory strange issue

‎01-25-2019 12:55 PM

Hi!

Don't you think that the 712k limit is for active routes only?

The problem is that I can't reduce the total number of routes, as I'm getting a full view table from both ISPs.

So filtering routes on my side affects only the active routes, not the total number of them.
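To illustrate what I mean, here is a toy Python sketch of a maximum-prefix-length filter. It only models the counting: the number of received prefixes stays the same whatever the filter is, and only the accepted share changes. The helper name and the third prefix (10.10.0.0/23) are made up for illustration; the first two are the prefixes from my log, written out in full.

```python
import ipaddress

def apply_max_prefix_len(prefixes, max_len):
    """Split received prefixes into (accepted, rejected) by prefix length."""
    accepted, rejected = [], []
    for text in prefixes:
        net = ipaddress.ip_network(text)
        (accepted if net.prefixlen <= max_len else rejected).append(net)
    return accepted, rejected

# 10.10.0.0/23 is a hypothetical prefix added only to have a /23 in the list.
received = ["205.253.0.0/16", "49.40.33.0/24", "10.10.0.0/23"]

for max_len in (24, 23, 22):
    accepted, rejected = apply_max_prefix_len(received, max_len)
    print(f"/{max_len}: received={len(received)} "
          f"accepted={len(accepted)} rejected={len(rejected)}")
```

Whether the rejected routes still cost memory on the SRX is exactly my question; the sketch does not answer that, it just shows why the total stays put.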

Anyway, I collected requested info from both SRXs (please see attached files).

For non-working SRX /22 filter is currently applied for incoming BGP prefixes.
For working SRX /24 filter is applied.

With the /22 filter on the non-working SRX there are no errors in the messages log.

I will leave the non-working SRX connected to the ISPs and will monitor the logs over the weekend.

If errors come back I will collect the same set of info once again.

 

P.S. Actually I have 2 branches connected with SRX550s the same way (2 ISPs, full view /24, etc.) and 3 SRX550 routers.

2 routers work like a charm and only one has these problems.



Re: SRX550 High Memory strange issue

‎01-29-2019 10:32 PM

Dmytro,

 

The limit is not for the active routes but for the BGP routes database, so with 1.4 million BGP routes you are exceeding the capacity of the SRX and you could expect problems:

 

 show route summary
Autonomous system number: xxxxx
Router ID: x.x.x.x

inet.0: 740559 destinations, 1464998 routes (241210 active, 0 holddown, 985427 hidden)
              Direct:      4 routes,      4 active
               Local:      4 routes,      4 active
                 BGP: 1464986 routes, 241198 active
              Static:      3 routes,      3 active
           Aggregate:      1 routes,      1 active

 request pfe execute target fwdd command "show services mum"
================ master ================
SENT: Ukern command: show services mum

Memory usage manager:  gsm
Total free space to start with: 127898960
Active customers: 1
Max customers: 12
Yellow zone limit: 31974740
Orange zone limit: 23021812
Red zone limit: 11510906
Operational zone:      Yellow

request pfe execute target fwdd command "show services objcache"
================ master ================
SENT: Ukern command: show services objcache

                                            objs    objs
                      obj   obj     objs  in cpu      in     total       total
obj cache name       size align   in use  caches   depot   objects       bytes
------------------------------------------------------------------------------
ADVPN Trigger Pool     88     4        0       0       0         0           0
ALG PST NAT BINDING POOL 16     4        0       0       0         0           0
Client Group Name      72     4        0       0       0         0           0
DIP IN pool            76     4        0       0       0         0           0
FTO pool               76     4        241261      19       0    241280    18337280

 

Maybe this specific SRX has extra firewall features enabled or is processing more traffic, which would consume more memory than on the other firewalls and make the excess BGP routes affect it more? How has the SRX behaved when you left it for several days with the problem? Does it eventually become stable?

 

Note that this problem could eventually trigger high CPU utilization as well. The numbers mentioned in the datasheet are those which have been calculated when running BGP alone on the device and nothing else. 
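To give a feel for the scale, the FTO pool line above can be turned into a quick estimate. This is a back-of-the-envelope Python sketch only: it multiplies the object count by the 76-byte object size from the objcache output and ignores any allocator overhead. The helper name is illustrative, and how this pool is charged against the memory zones is not something these outputs show.

```python
FTO_OBJ_SIZE = 76  # bytes, "obj size" column of the FTO pool line above

def fto_pool_bytes(objects, obj_size=FTO_OBJ_SIZE):
    """Bytes used by a pool of fixed-size objects (no allocator overhead)."""
    return objects * obj_size

# /22 filter on the non-working SRX: 241280 total objects in the pool,
# which reproduces the "total bytes" column (18337280).
print(fto_pool_bytes(241280))

# /24 filter: 736246 FTOs were reported on the working SRX earlier in the thread.
print(fto_pool_bytes(736246))
```

So going from the /22 filter to the /24 filter roughly triples the memory needed for FTOs alone, from about 18 MB to about 56 MB.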

 

It looks like you might need to work with your ISP to limit the prefix lengths they send, to try to keep the route database smaller than the specified limit. If all you need is routing with multiple BGP routers sending the complete Internet routing table, then I suggest looking at Juniper's routing platforms (like the MX series), which are better suited to such requirements.

 

I don't like to deliver bad news, but Juniper will not support scenarios where the specified limits are exceeded.

 


Re: SRX550 High Memory strange issue

‎01-30-2019 02:28 AM

Hi!

Thx for the update.

I can't keep it in an unstable state, as I can't leave staff without access to the Internet :(

So, I will leave it with the /22 filter and keep my fingers crossed.

Btw, the "working" SRX with the /24 filter was in the red zone and went to green as soon as I changed the filter to /22.

Anyway, I really appreciate the time you spent and the info you provided; it was very useful for me.

Thank you very much!

 

Solution
Accepted by topic author Dmytro Vartanian
‎01-31-2019 12:19 PM

Re: SRX550 High Memory strange issue

‎01-30-2019 06:16 AM
Dmytro,

You are very welcome, I am glad that the information was helpful. Please mark the post as Resolved if it applies.