Today we encountered an interesting problem: our SRX3400 (software version 12.1X46-D25.7) stopped passing traffic on all of its ports.
We could not work out why it happened. The symptoms were as follows:
- We can ping the Juniper SRX from the internal network
- We cannot ping the Juniper SRX from the DMZ (which we should be able to do)
- We cannot reach the Juniper via SSH or web management; only the console works
- We cannot ping local devices from other local devices connected to different ports of the Juniper SRX
- When we try to connect via SSH, the connection is accepted but hangs for a while and then drops
- From the Juniper SRX itself, we can ping and reach every device connected to it
- The uplink interface is up, but we cannot ping the peering IP
- The routing engine reports a load of 0.01 with 50% memory usage, and everything looks OK
- There are no chassis alarms
- There are no system alarms
- There have been no config changes
- We restarted the device and the problem disappeared (!)
- The system had been up for 750 days
I suspect a hardware failure, but I am not sure.
What do you think the problem is?
While investigating, I found thousands of "SIP ALG decode packet error" messages coming from the same IP address. When I searched for it on Google, I found this KB: https://prsearch.juniper.net/InfoCenter/index?page=prcontent&id=PR1193679
I believe this caused a denial of service (DoS).
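For anyone who wants to check whether a single source is behind a flood like this, here is a minimal Python sketch that counts "SIP ALG decode packet error" lines per IP address. The log line layout in the sample is made up for illustration and is not the actual Junos log format; the function name and the rule "the first IPv4 address on the line is the source" are assumptions too, so adjust them to match your real logs.

```python
import re
from collections import Counter

# Message text taken from the post above.
MESSAGE = "SIP ALG decode packet error"

# Loose IPv4 matcher; it does not validate that each octet is <= 255.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def count_sip_alg_errors(lines):
    """Count matching log lines per IP address.

    Assumption: the first IPv4 address found on a matching line
    is the source of the offending packet.
    """
    counts = Counter()
    for line in lines:
        if MESSAGE not in line:
            continue
        match = IPV4.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


# Hypothetical sample lines (NOT real Junos output), using documentation IPs.
sample_log = [
    "msg=SIP ALG decode packet error src=203.0.113.10",
    "msg=SIP ALG decode packet error src=203.0.113.10",
    "msg=session created src=198.51.100.7",
    "msg=SIP ALG decode packet error src=198.51.100.7",
]

# Print the sources, noisiest first.
for ip, hits in count_sip_alg_errors(sample_log).most_common():
    print(ip, hits)
```

The function accepts any iterable of strings, so an open file handle on a saved copy of the log can be passed to it directly. If one address accounts for nearly all of the hits, that would support the DoS theory.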
Since you mentioned that console access was available, did you happen to check the RE CPU usage during the issue (the kernel/user percentages, not the load average)?
Also, did you check whether any particular process was using a lot of CPU?