SRX Services Gateway
Trusted Expert
Posts: 784
Registered: ‎11-01-2007
0

Re: SRX650 10.1R1.8... just stops

"Only problem with this release so far has been the DNS alg, which has been blocking DNS replys with CNAME pointing to the base domain address (IE www.asp.net CNAME asp.net, etc.) so I disabled the DNS alg."

 

@BenR

 

I think we give you a nerd-knob in 10.1R2 to tweak the default DNS response packet size.

The 10.1R2 release notes include the following:

 

DNS doctoring support—This feature is supported on all SRX Series and J Series devices.

Domain Name System (DNS) ALG functionality has been extended to support static NAT. You should configure static NAT for the DNS server first. Then if the DNS ALG is enabled, public-to-private and private-to-public static address translation can occur for A-records in DNS replies.

The DNS ALG also now includes a maximum-message-length command option with a value range of 512 to 8192 bytes and a default value of 512 bytes. The DNS ALG will now drop traffic if the DNS message length exceeds the configured maximum, if the domain name is more than 255 bytes, or if the label length is more than 63 bytes. The ALG will also decompress domain name compression pointers and retrieve their related full domain names, and check for the existence of compression pointer loops and drop the traffic if one exists.

Note that the DNS ALG can translate the first 32 A-records in a single DNS reply. A-records after the first 32 will not be handled. Also note that the DNS ALG supports only IPv4 addresses and does not support VPN tunnels.
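
For anyone wondering what those checks amount to in practice, here is a rough Python sketch of expanding a compressed DNS name while enforcing the limits the release note lists. It is not Juniper's implementation: the function names, the exception class, and the choice to count the 255-byte limit against the wire-format length are all assumptions made for illustration.

```python
MAX_NAME_LEN = 255  # release note: names longer than 255 bytes are dropped


class DnsNameError(ValueError):
    """Hypothetical error type for a name these checks would reject."""


def within_max_length(msg: bytes, max_len: int = 512) -> bool:
    """Check a message against a maximum-message-length style limit.

    The 512..8192 range and the 512 default come from the release note.
    """
    if not 512 <= max_len <= 8192:
        raise ValueError("max_len must be between 512 and 8192")
    return len(msg) <= max_len


def read_name(msg: bytes, offset: int) -> str:
    """Expand a possibly compressed domain name starting at `offset`."""
    labels = []
    seen_targets = set()  # pointer targets already followed for this name
    wire_len = 0          # length bytes plus label bytes seen so far
    pos = offset
    while True:
        if pos >= len(msg):
            raise DnsNameError("name runs past end of message")
        length = msg[pos]
        if length & 0xC0 == 0xC0:
            # Compression pointer: 14-bit offset from the start of the message.
            if pos + 1 >= len(msg):
                raise DnsNameError("truncated compression pointer")
            target = ((length & 0x3F) << 8) | msg[pos + 1]
            if target in seen_targets:
                raise DnsNameError("compression pointer loop")
            seen_targets.add(target)
            pos = target
            continue
        if length & 0xC0:
            raise DnsNameError("reserved label type")
        if length == 0:
            break  # root label ends the name
        # With the top two bits clear, a length byte cannot exceed 63,
        # so the 63-byte label limit is enforced by the encoding itself.
        end = pos + 1 + length
        if end > len(msg):
            raise DnsNameError("label runs past end of message")
        wire_len += length + 1
        if wire_len + 1 > MAX_NAME_LEN:  # +1 for the terminating root byte
            raise DnsNameError("name longer than 255 bytes")
        labels.append(msg[pos + 1:end].decode("latin-1"))
        pos = end
    return ".".join(labels)
```

With a 12-byte header followed by `asp.net` at offset 12 and `www` plus a pointer back to offset 12 at offset 21, `read_name` expands the second name to the full `www.asp.net`, which is the kind of reply discussed earlier in the thread. A pointer that points at itself raises `DnsNameError` instead of looping forever.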

Visitor
Posts: 4
Registered: ‎05-04-2010

Re: SRX650 10.1R1.8... just stops

As a followup to my previous post, our SRX 240 has been running fine since I did two things:

 

1) Upgraded to 10.0R3

2) Added a firewall rule to block incoming multicast traffic (because of what Oldtimer mentioned about multicast traffic traversing multiple interfaces)

 

It's been chugging along since the 9th. Unfortunately I can't say whether the multicast block or the upgrade fixed the problem, and I am loath to disturb a 'stable' system to find out...

Trusted Contributor
Posts: 89
Registered: ‎03-18-2010
0

Re: SRX650 10.1R1.8... just stops

@KB_Fan, the DNS issue is PR #527294, where the DNS ALG does not parse a compressed DNS response properly. The workaround is to disable the DNS ALG and either wait for 10.1R3, which will have the fix, or get a special build from JTAC. In this case I will be waiting for the tested build.

 

@versello, so far so good. It hasn't locked up yet or cored flowd. I will post back if it happens again.

Contributor
Posts: 60
Registered: ‎12-21-2009
0

Re: SRX650 10.1R1.8... just stops

JTAC informed me 10.0S5.2 is out. URL: https://download.juniper.net/software/junos/regressed/10.0/service/10.0S5.2/junos-srxsme-10.0S5.2-do...

 

My IDP issues may be fixed in this release, but JTAC can't confirm it. I may upgrade my SRX650 later this week. I just applied it to my SRX210 without any problems.

Trusted Contributor
Posts: 330
Registered: ‎01-08-2010
0

Re: SRX650 10.1R1.8... just stops

SRX650 cluster running 10.0R3 + Webfiltering

 

Spent the weekend migrating our ISG to the SRX650s; they STOPPED this morning once some real load was put on them.

 

We have a ticket open with JTAC but rolled back to our ISG in the interim. Let's just say no one is very happy here today.

Contributor
Posts: 14
Registered: ‎03-23-2010
0

Re: SRX650 10.1R1.8... just stops

For what it's worth my SRX has frozen a couple of times on 10.0R3 without any IDP features.  At this point I'm rebooting it on a weekly basis to (hopefully) avoid downtime during operating hours.

Trusted Contributor
Posts: 330
Registered: ‎01-08-2010
0

Re: SRX650 10.1R1.8... just stops

Well, our SRX650 cluster went back into production and failed a second time, even with JTAC reviewing the config. This time, however, we had them on the line while it was down.

 

Turns out log rollover was NOT WORKING AT ALL. The box just filled with logs and then died. We cleared them and set tighter rollovers, and they still exceeded the limits.

 

A workaround is to set logging to almost nothing, but at this point that leaves us without traffic logs, which is a big problem.

 

We are still running 10.0R3.10...

Visitor
Posts: 4
Registered: ‎01-21-2010
0

Re: SRX650 10.1R1.8... just stops

May I know the JTAC case number?

Trusted Contributor
Posts: 89
Registered: ‎03-18-2010
0

Re: SRX650 10.1R1.8... just stops

My SRX240 core dumped flowd again yesterday, and there is no reasonable answer from JTAC yet as to why. The first recommendation was to update to 10.2R1 (which isn't even available yet). Both PRs referenced by JTAC are supposedly fixed in the release that I am running, and also shouldn't apply to my box because of its configuration, so I don't think they are telling me the truth...

 

@mxk - Not happy with the SRX or JTAC at the moment. Most of the time when I talk to someone in India, I get the feeling that all they want to do is give an excuse to close the case so they will get their statistics up instead of actually solving the problem.  Maybe Juniper management needs to look at how they are managing the support center as well as all the bugs in the SRX software?  It might help get the bugs fixed faster at least, and maybe a little more customer satisfaction. Right now I don't think anyone who has bought an SRX branch series to use as a UTM device will ever buy another Juniper product.

Contributor
Posts: 60
Registered: ‎12-21-2009
0

Re: SRX650 10.1R1.8... just stops

The response I got from JTAC was that the issue was caused by "the synchronization between the flow daemon and IDP daemon" and they are working on scheduling a fix for 10.1R3.

Trusted Contributor
Posts: 330
Registered: ‎01-08-2010
0

Re: SRX650 10.1R1.8... just stops

As a follow-up on our issue: we have only 10 rules with "log on close" enabled and rollover set to two 200k files for testing. It turns out rollover only takes place every 15 minutes on an SRX, as a cron job. With our current log level, our logs grow to 30 MB by the time the 15-minute rollover happens.

 

In other words, it is impossible to do ANY sort of real traffic logging on an SRX at the moment. Our ScreenOS devices could handle 100+ rules with logging without filling their storage and dying!
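
To put rough numbers on that, here is a small Python sketch using only the figures from this post. It assumes the logs grow at a steady rate across the 15-minute window, that "200k" means 200 KB per file, and that 1 MB is 1024 KB; the function name is made up for illustration.

```python
def log_growth_summary(observed_mb=30, interval_min=15, file_kb=200, file_count=2):
    """Compare observed log growth with the configured rollover limits.

    Assumes linear growth over the interval, which is a simplification.
    """
    observed_kb = observed_mb * 1024
    rate_kb_per_sec = observed_kb / (interval_min * 60)
    cap_kb = file_kb * file_count              # total space the limits imply
    seconds_to_cap = cap_kb / rate_kb_per_sec  # time until the cap is exceeded
    overshoot = observed_kb / cap_kb           # how far past the cap logs get
    return rate_kb_per_sec, seconds_to_cap, overshoot


rate, seconds, overshoot = log_growth_summary()
print(f"growth rate: {rate:.1f} KB/s")
print(f"configured cap reached after: {seconds:.1f} s")
print(f"size at rollover vs cap: {overshoot:.1f}x")
```

Under those assumptions the logs grow at about 34 KB/s, pass the 400 KB the two files should hold in under 12 seconds, and sit at roughly 77 times the configured cap by the time a 15-minute rollover runs.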

Trusted Contributor
Posts: 236
Registered: ‎06-11-2010
0

Re: SRX650 10.1R1.8... just stops


SomeITGuy wrote:

As a follow-up on our issue: we have only 10 rules with "log on close" enabled and rollover set to two 200k files for testing. It turns out rollover only takes place every 15 minutes on an SRX, as a cron job. With our current log level, our logs grow to 30 MB by the time the 15-minute rollover happens.

 

In other words, it is impossible to do ANY sort of real traffic logging on an SRX at the moment. Our ScreenOS devices could handle 100+ rules with logging without filling their storage and dying!


Yikes!  That would be a good way to DoS an SRX box, especially if denies are being logged.  Time to get working on that syslog server...

Trusted Contributor
Posts: 330
Registered: ‎01-08-2010
0

Re: SRX650 10.1R1.8... just stops

We have an NSM, but the basic config to start logging still produces logs on the box, so the problem still exists. This also means that trying to do a simple temporary debug flow will kill the box as well.

We have learned, however, that the new dual-root partitioning allocates over 5x more space for logging, so we are looking at repartitioning the boxes, but that means a fairly lengthy outage.
Trusted Contributor
Posts: 330
Registered: ‎01-08-2010
0

Re: SRX650 10.1R1.8... just stops

I am starting to think that no one has an SRX in production under any serious load. Our cluster went down again today... It appears that clustering provides no redundancy, as it so far has never failed over from one node to the other on its own. After manually restarting the failed node, we found that the UTM daemon was not even running on the other node.

Sent a core dump, hoping they find something. This is ridiculous: the 650s should be overkill for our environment, yet they can't remain running for more than a few days at a time without issue.
Trusted Contributor
Posts: 236
Registered: ‎06-11-2010
0

Re: SRX650 10.1R1.8... just stops

 


SomeITGuy wrote:
I am starting to think that no one has an SRX in production under any serious load. Our cluster went down again today... It appears that clustering provides no redundancy, as it so far has never failed over from one node to the other on its own. After manually restarting the failed node, we found that the UTM daemon was not even running on the other node.

Sent a core dump, hoping they find something. This is ridiculous: the 650s should be overkill for our environment, yet they can't remain running for more than a few days at a time without issue.

Have you considered updating to 10.1R2.8 when/if you transition to dual-partitions?  Although my SRX has a light load it used to freeze every few days on 10.0R3.10 and now on 10.1R2.8 it's been running like a champ for two weeks.

 

Trusted Contributor
Posts: 330
Registered: ‎01-08-2010
0

Re: SRX650 10.1R1.8... just stops

Yes, we have; however, 10.1R2 only became supported under NSM about 3 days ago.

We are waiting for Juniper to look at our dumps and tell us whether they think the issue is truly fixed in 10.1R2, because if we take the 20-40 minute outage to upgrade and the issue persists, we will have caused even more downtime for nothing.
Trusted Contributor
Posts: 89
Registered: ‎03-18-2010
0

Re: SRX650 10.1R1.8... just stops

At least on 10.1R2 we only get 6 minutes of downtime while flowd restarts, instead of the SRX needing someone to restart it before it works again, which is some improvement. Also, I get 2-3 weeks between issues now, which is better than every few days with 10.0R3 and before. If you're having to manually restart your SRXs, I would definitely upgrade.

Trusted Contributor
Posts: 236
Registered: ‎06-11-2010
0

Re: SRX650 10.1R1.8... just stops


BenR wrote:

At least on 10.1R2 we only get 6 minutes of downtime while flowd restarts, instead of the SRX needing someone to restart it before it works again, which is some improvement. Also, I get 2-3 weeks between issues now, which is better than every few days with 10.0R3 and before. If you're having to manually restart your SRXs, I would definitely upgrade.


Was a watchdog added for flowd in 10.1R2.8 or were you not able to successfully restart the daemon in 10.0R3.10?

Trusted Contributor
Posts: 89
Registered: ‎03-18-2010
0

Re: SRX650 10.1R1.8... just stops

I think a bunch of watchdog timers were added. I could restart flowd before if I could get to the console, but it was easier to just hit the power switch by then.

Trusted Contributor
Posts: 236
Registered: ‎06-11-2010
0

Re: SRX650 10.1R1.8... just stops


BenR wrote:

I think a bunch of watchdog timers were added. I could restart flowd before if I could get to the console, but it was easier to just hit the power switch by then.


Hopefully they won't stop at adding watchdog timers for a fix to these issues.  I'll have to keep an eye on my logs to see if it has been rebooting.