SRX


Ask questions and share experiences about the SRX Series, vSRX, and cSRX.
  • 1.  Help needed diagnosing SRX reboot

    Posted 11-14-2012 13:38

    Hi,

     

    Our SRX220 started to exhibit strange behavior this afternoon (VPN tunnels dropping, etc.).  I tried to SSH into it but was unable to.  After about 15 minutes (and a lot of panicking), it came back online, and everything is back to normal.  So my questions are:

    1. What log files, if any, are available to diagnose this type of thing?  Here is my syslog config (which will be improved):
      archive size 100k files 3;
      user * {
          any emergency;
      }
      file messages {
          any critical;
          authorization info;
      }
      file interactive-commands {
          interactive-commands error;
      }

       

      2. show chassis routing-engine shows:

       Last reboot reason             0x1:power cycle/failure

       What does this mean?  (The device is locked away in a datacenter, so I am very confident that no one power cycled it.)
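
    Note on the syslog config above: with "any critical" on the messages file, most events leading up to a reboot are never written to disk. Below is a minimal sketch of a more verbose stanza under [edit system], assuming there is enough flash space for larger archives and that a remote collector exists. The address 192.0.2.10 is a placeholder, not part of this setup, and the severities and archive sizes are illustrative, not recommended values.

```
syslog {
    /* Assumption: larger archives than the original 100k x 3 */
    archive size 1m files 10;
    user * {
        any emergency;
    }
    /* Assumption: 192.0.2.10 is a hypothetical remote syslog collector */
    host 192.0.2.10 {
        any notice;
    }
    file messages {
        any notice;
        authorization info;
    }
    file interactive-commands {
        interactive-commands any;
    }
}
```

    Sending a copy to a remote host means the last messages before a crash or power loss may still be available even if the local file system was not cleanly unmounted.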



  • 2.  RE: Help needed diagnosing SRX reboot

    Posted 11-15-2012 02:54

    power cycle/failure—Reboot due to the switching off of the power button behind the Routing Engine, not the power button on the chassis.

     

    What does user@srx> show system core-dumps reveal?

     

    What does user@srx> show log messages reveal?

     

     

    What version of JUNOS are you running?
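
    A few other operational-mode commands are commonly checked after an unexplained reboot. This is a sketch only; availability and output can vary by platform and JUNOS release, and user@srx is a placeholder prompt:

```
user@srx> show system uptime
user@srx> show chassis routing-engine
user@srx> show system boot-messages
user@srx> show chassis alarms
user@srx> show chassis environment
user@srx> file list /var/log detail
```

    The first two give the boot time and the recorded last reboot reason, the boot messages show what the kernel logged at startup, the chassis commands report alarms and temperature/power status, and the file listing shows which log files exist on the device.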

     



  • 3.  RE: Help needed diagnosing SRX reboot

    Posted 11-15-2012 07:45

    Thanks for your response.

     

     

    "Reboot due to the switching..."  I came across this same description, but I am confident no one physically did anything to the device.

     

    1. asdf@fw1> show system core-dumps
      /var/crash/*core*: No such file or directory
      /var/tmp/*core*: No such file or directory
      /var/crash/kernel.*: No such file or directory
      /tftpboot/corefiles/*core*: No such file or directory
      

       

      I am running JUNOS 10.4R4.5.

       

      Things started to go bad around 15:00.  The reboot started around 15:25, and the device was back online at about 15:32.

       

      show log messages

      Nov 14 14:00:01  fw1 sshd[15292]: Failed password for root from 218.29.228.34 port 51898 ssh2
      Nov 14 14:00:01  fw1 sshd[15293]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:06  fw1 sshd[15294]: Failed password for root from 218.29.228.34 port 54687 ssh2
      Nov 14 14:00:06  fw1 sshd[15298]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:11  fw1 sshd[15299]: Failed password for root from 218.29.228.34 port 57818 ssh2
      Nov 14 14:00:11  fw1 sshd[15300]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:14  fw1 sshd[15301]: Failed password for root from 218.29.228.34 port 60566 ssh2
      Nov 14 14:00:15  fw1 sshd[15302]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:19  fw1 sshd[15303]: Failed password for root from 218.29.228.34 port 35269 ssh2
      Nov 14 14:00:19  fw1 sshd[15304]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:22  fw1 sshd[15305]: Failed password for root from 218.29.228.34 port 38429 ssh2
      Nov 14 14:00:23  fw1 sshd[15306]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:27  fw1 sshd[15307]: Failed password for root from 218.29.228.34 port 40565 ssh2
      Nov 14 14:00:28  fw1 sshd[15308]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:31  fw1 sshd[15309]: Failed password for root from 218.29.228.34 port 43430 ssh2
      Nov 14 14:00:31  fw1 sshd[15310]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:35  fw1 sshd[15311]: Failed password for root from 218.29.228.34 port 45417 ssh2
      Nov 14 14:00:35  fw1 sshd[15312]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:39  fw1 sshd[15313]: Failed password for root from 218.29.228.34 port 47800 ssh2
      Nov 14 14:00:40  fw1 sshd[15314]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:43  fw1 sshd[15315]: Failed password for root from 218.29.228.34 port 50389 ssh2
      Nov 14 14:00:44  fw1 sshd[15316]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:48  fw1 sshd[15317]: Failed password for root from 218.29.228.34 port 52934 ssh2
      Nov 14 14:00:48  fw1 sshd[15318]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:52  fw1 sshd[15319]: Failed password for root from 218.29.228.34 port 55159 ssh2
      Nov 14 14:00:52  fw1 sshd[15320]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:00:55  fw1 sshd[15321]: Failed password for root from 218.29.228.34 port 57770 ssh2
      Nov 14 14:00:56  fw1 sshd[15322]: Received disconnect from 218.29.228.34: 11: Bye Bye
      Nov 14 14:18:53  fw1 login: LOGIN_INFORMATION: User asdf logged in from host xxx on device ttyp0
      Nov 14 14:59:31  fw1 init: can not access /usr/sbin/jdiameterd: No such file or directory
      Nov 14 14:59:31  fw1 init: diameter-service (PID 0) started
      Nov 14 14:59:31  fw1 init: can not access /usr/sbin/ipmid: No such file or directory
      Nov 14 14:59:31  fw1 init: ipmi (PID 0) started
      Nov 14 15:07:02  fw1 init: can not access /usr/sbin/jdiameterd: No such file or directory
      Nov 14 15:07:02  fw1 init: diameter-service (PID 0) started
      Nov 14 15:07:02  fw1 init: can not access /usr/sbin/ipmid: No such file or directory
      Nov 14 15:07:02  fw1 init: ipmi (PID 0) started
      Nov 14 15:07:22  fw1 login: LOGIN_INFORMATION: User asdf logged in from host xxxx on device ttyp0
      Nov 14 15:09:20  fw1 sshd[15838]: fatal: Read from socket failed: Connection reset by peer
      Nov 14 15:10:26  fw1 login: LOGIN_INFORMATION: User asdf logged in from host xxxx on device ttyp0
      Nov 14 15:13:47  fw1 login: LOGIN_INFORMATION: User asdf logged in from host xxxx on device ttyp0
      Nov 14 15:15:51  fw1 login: LOGIN_INFORMATION: User asdf logged in from host xxxx on device ttyp1
      Nov 14 15:19:43  fw1 sshd[16120]: Connection closed by xxxx
      Nov 14 15:19:59  fw1 init: can not access /usr/sbin/jdiameterd: No such file or directory
      Nov 14 15:19:59  fw1 init: diameter-service (PID 0) started
      Nov 14 15:19:59  fw1 init: can not access /usr/sbin/ipmid: No such file or directory
      Nov 14 15:19:59  fw1 init: ipmi (PID 0) started
      Nov 14 15:22:05  fw1 init: can not access /usr/sbin/jdiameterd: No such file or directory
      Nov 14 15:22:05  fw1 init: diameter-service (PID 0) started
      Nov 14 15:22:05  fw1 init: can not access /usr/sbin/ipmid: No such file or directory
      Nov 14 15:22:05  fw1 init: ipmi (PID 0) started
      Nov 14 15:25:44  fw1 init: can not access /usr/sbin/jdiameterd: No such file or directory
      Nov 14 15:25:44  fw1 init: diameter-service (PID 0) started
      Nov 14 15:25:44  fw1 init: can not access /usr/sbin/ipmid: No such file or directory
      Nov 14 15:25:44  fw1 init: ipmi (PID 0) started
      Nov 14 15:30:43  fw1 /kernel: getmemsize: msgbufp[size=32768] = 0x8000cfe4
      Nov 14 15:30:43  fw1 /kernel: Copyright (c) 1996-2011, Juniper Networks, Inc.
      Nov 14 15:30:43  fw1 /kernel: All rights reserved.
      Nov 14 15:30:43  fw1 /kernel: Copyright (c) 1992-2006 The FreeBSD Project.
      Nov 14 15:30:43  fw1 /kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
      Nov 14 15:30:43  fw1 /kernel:     The Regents of the University of California. All rights reserved.
      Nov 14 15:30:43  fw1 /kernel: JUNOS 10.4R4.5 #0: 2011-05-06 06:14:23 UTC
      Nov 14 15:30:43  fw1 /kernel:     builder@warth.juniper.net:/volume/build/junos/10.4/release/10.4R4.5/obj-octeon/bsd/sys/compile/JSRXNLE
      Nov 14 15:30:43  fw1 /kernel: JUNOS 10.4R4.5 #0: 2011-05-06 06:14:23 UTC
      Nov 14 15:30:43  fw1 /kernel:     builder@warth.juniper.net:/volume/build/junos/10.4/release/10.4R4.5/obj-octeon/bsd/sys/compile/JSRXNLE
      Nov 14 15:30:43  fw1 /kernel: real memory  = 1073741824 (1024MB)
      Nov 14 15:30:43  fw1 /kernel: avail memory = 527044608 (502MB)
      Nov 14 15:30:43  fw1 /kernel: cpuid: 0, btlb_cpumap:0xffffffff
      Nov 14 15:30:43  fw1 /kernel: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
      Nov 14 15:30:43  fw1 /kernel: Initializing watchdog interupt
      Nov 14 15:30:43  fw1 /kernel: Loading RT Fifo module.....
      Nov 14 15:30:43  fw1 /kernel: Loaded RT Fifo module
      Nov 14 15:30:43  fw1 /kernel: pmap_helper loaded (interface version 6, syscall 210)
      Nov 14 15:30:43  fw1 /kernel: cpu0 on motherboard
      Nov 14 15:30:43  fw1 /kernel: : CAVIUM's Octeon CPU Rev. 0.1 with no FPU implemented
      Nov 14 15:30:43  fw1 /kernel:         L1 Cache: I size 32kb(128 line), D size 8kb(128 line), sixty four way.
      Nov 14 15:30:43  fw1 /kernel:         L2 Cache: Size 128kb, ? way
      Nov 14 15:30:43  fw1 /kernel: obio0 on motherboard
      Nov 14 15:30:43  fw1 /kernel: uart0: <Octeon-16550 channel 0> on obio0
      Nov 14 15:30:43  fw1 /kernel: uart0: console (9600,n,8,1)
      Nov 14 15:30:43  fw1 /kernel: twsi0 on obio0
      Nov 14 15:30:43  fw1 /kernel: dwc0: <Synopsis DWC OTG Controller Driver> on obio0
      Nov 14 15:30:43  fw1 /kernel: usb0: DWC OTG Controller
      Nov 14 15:30:43  fw1 /kernel: Using DMA mode
      Nov 14 15:30:43  fw1 /kernel: Init: Port Power? op_state=1
      Nov 14 15:30:43  fw1 /kernel: Init: Power Port (0)
      Nov 14 15:30:43  fw1 /kernel: usb0: <USB Bus for DWC OTG Controller> on dwc0
      Nov 14 15:30:43  fw1 /kernel: usb0: USB revision 2.0
      Nov 14 15:30:43  fw1 /kernel: uhub0: vendor 0x0000 DWC OTG root hub, class 9/0, rev 2.00/1.00, addr 1
      Nov 14 15:30:43  fw1 /kernel: uhub0: 1 port with 1 removable, self powered
      Nov 14 15:30:43  fw1 /kernel: uhub1: vendor 0x0409 product 0x005a, class 9/0, rev 2.00/1.00, addr 2
      Nov 14 15:30:43  fw1 /kernel: uhub1: single transaction translator
      Nov 14 15:30:43  fw1 /kernel: uhub1: 3 ports with 2 removable, self powered
      Nov 14 15:30:43  fw1 /kernel: pcib0: <Cavium on-chip PCI bridge> on obio0
      Nov 14 15:30:43  fw1 /kernel: Disabling Octeon big bar support
      Nov 14 15:30:43  fw1 /kernel: PCI Status: PCI 32-bit: 0xc041b
      Nov 14 15:30:43  fw1 /kernel: pcib0: Initialized controller
      Nov 14 15:30:43  fw1 /kernel: pci0: <PCI bus> on pcib0
      Nov 14 15:30:43  fw1 /kernel: pci0: <simple comms> at device 1.0 (no driver attached)
      Nov 14 15:30:43  fw1 /kernel: atapci0: <SiI 0680 UDMA133 controller> port 0x8-0xb,0x10-0x17,0x18-0x1b,0x20-0x2f mem 0x8020000-0x80200ff irq 0 at device 2.0 on pci0
      Nov 14 15:30:43  fw1 /kernel: ata2: <ATA channel 0> on atapci0
      Nov 14 15:30:43  fw1 /kernel: ata3: <ATA channel 1> on atapci0
      Nov 14 15:30:43  fw1 /kernel: cpld0 on obio0
      Nov 14 15:30:43  fw1 /kernel: gblmem0 on obio0
      Nov 14 15:30:43  fw1 /kernel: octpkt0: <Octeon RGMII> on obio0
      Nov 14 15:30:43  fw1 /kernel: cfi0: <AMD/Fujitsu - 8MB> on obio0
      Nov 14 15:30:43  fw1 /kernel: platform_cookie_read not implemented
      Nov 14 15:30:43  fw1 /kernel: Timecounter "mips" frequency 700000000 Hz quality 0
      Nov 14 15:30:43  fw1 /kernel: Timecounters tick every 1.000 msec
      Nov 14 15:30:43  fw1 /kernel: Loading the NETPFE ethernet module
      Nov 14 15:30:43  fw1 /kernel: Loading E1/T1/J1 driver
      Nov 14 15:30:43  fw1 /kernel: Loading the DS1/E1 Media Layer; Attaching to media services layer
      Nov 14 15:30:43  fw1 /kernel: Loading common multilink module.
      Nov 14 15:30:43  fw1 /kernel: Loading the NETPFE PPPoE module
      Nov 14 15:30:43  fw1 /kernel: Loading the netpfe services driver
      Nov 14 15:30:43  fw1 /kernel: Loading the NETPFE docsis module
      Nov 14 15:30:43  fw1 /kernel: Loading DS0 driver
      Nov 14 15:30:43  fw1 /kernel: Loading the DS0 Media Layer; Attaching to media services layer
      Nov 14 15:30:43  fw1 /kernel: Loading the XDSL Media Layer; Attaching to media services layer
      Nov 14 15:30:43  fw1 /kernel: Loading the IPSec driver
      Nov 14 15:30:43  fw1 /kernel:  Loading the PTM driver
      Nov 14 15:30:43  fw1 /kernel: Loading the ISDN driver
      Nov 14 15:30:43  fw1 /kernel: Loading the ISDN BRI Media Layer; Attaching to media services layer
      Nov 14 15:30:43  fw1 /kernel: Loading Link Services PICs module.
      Nov 14 15:30:43  fw1 /kernel: IPsec: Initialized Security Association Processing.
      Nov 14 15:30:43  fw1 /kernel: ad0: Device does not support APM
      Nov 14 15:30:43  fw1 /kernel: ad0: 1006MB <CF 1GB 20080112> at ata2-master WDMA2
      Nov 14 15:30:43  fw1 /kernel: SMP: AP CPU #1 Launched!
      Nov 14 15:30:43  fw1 /kernel: Trying to create bootdev, rootpartition ad0s2a
      Nov 14 15:30:43  fw1 /kernel: Trying to mount root from ufs:/dev/ad0s2a
      Nov 14 15:30:43  fw1 /kernel: WARNING: / was not properly dismounted
      Nov 14 15:30:46  fw1 /kernel: Loading the DIALER driver
      Nov 14 15:30:56  fw1 init: watchdog (PID 1077) started
      Nov 14 15:30:56  fw1 init: bslockd (PID 1078) started
      Nov 14 15:30:56  fw1 init: tnp-process (PID 1079) started
      Nov 14 15:30:56  fw1 init: interface-control (PID 1080) started
      Nov 14 15:30:56  fw1 init: chassis-control (PID 1081) started
      Nov 14 15:30:56  fw1 init: alarm-control (PID 1082) started
      Nov 14 15:30:56  fw1 init: craft-control (PID 1083) started
      Nov 14 15:30:56  fw1 init: ntp (PID 1084) started
      Nov 14 15:30:56  fw1 init: management (PID 1085) started
      Nov 14 15:30:56  fw1 init: snmp (PID 1086) started
      Nov 14 15:30:56  fw1 init: mib-process (PID 1087) started
      Nov 14 15:30:56  fw1 init: routing (PID 1088) started
      Nov 14 15:30:56  fw1 init: l2-learning (PID 1089) started
      Nov 14 15:30:56  fw1 init: inet-process (PID 1090) started
      Nov 14 15:30:56  fw1 init: pfe (PID 1091) started
      Nov 14 15:30:56  fw1 init: remote-operations (PID 1092) started
      Nov 14 15:30:56  fw1 init: class-of-service (PID 1093) started
      Nov 14 15:30:56  fw1 init: ipsec-key-management (PID 1094) started
      Nov 14 15:30:56  fw1 init: periodic-packet-services (PID 1095) started
      Nov 14 15:30:57  fw1 init: firewall (PID 1096) started
      Nov 14 15:30:57  fw1 init: internal-routing-service (PID 1097) started
      Nov 14 15:30:57  fw1 init: neighbor-liveness (PID 1098) started
      Nov 14 15:30:57  fw1 init: forwarding (PID 1099) started
      Nov 14 15:30:57  fw1 init: dhcp (PID 1100) started
      Nov 14 15:30:57  fw1 init: usb-control (PID 856) started
      Nov 14 15:30:57  fw1 init: ppp (PID 1101) started
      Nov 14 15:30:57  fw1 init: event-processing (PID 871) started
      Nov 14 15:30:58  fw1 init: lacp (PID 1102) started
      Nov 14 15:30:58  fw1 init: general-authentication-service (PID 1103) started
      Nov 14 15:30:58  fw1 init: mpls-traceroute (PID 1104) started
      Nov 14 15:30:58  fw1 init: database-replication (PID 1105) started
      Nov 14 15:30:58  fw1 init: secure-neighbor-discovery (PID 1106) started
      Nov 14 15:30:58  fw1 init: wireless-wan-service (PID 1107) started
      Nov 14 15:30:58  fw1 init: relay-process (PID 1108) started
      Nov 14 15:30:58  fw1 init: jsrp-service (PID 1109) started
      Nov 14 15:30:58  fw1 init: network-security (PID 1110) started
      Nov 14 15:30:58  fw1 init: pki-service (PID 1111) started
      Nov 14 15:30:58  fw1 init: web-management (PID 1112) started
      Nov 14 15:30:58  fw1 init: can not access /usr/sbin/jdiameterd: No such file or directory
      Nov 14 15:30:58  fw1 init: diameter-service (PID 0) started
      Nov 14 15:30:59  fw1 init: idp-policy (PID 1113) started
      Nov 14 15:30:59  fw1 init: network-security-trace (PID 1116) started
      Nov 14 15:30:59  fw1 init: firewall-authentication-service (PID 1117) started
      Nov 14 15:30:59  fw1 init: security-log (PID 1118) started
      Nov 14 15:30:59  fw1 init: utmd (PID 1119) started
      Nov 14 15:31:00  fw1 init: simple-mail-client-service (PID 1120) started
      Nov 14 15:31:00  fw1 init: can not access /usr/sbin/ipmid: No such file or directory
      Nov 14 15:31:00  fw1 init: ipmi (PID 0) started
      Nov 14 15:31:00  fw1 init: wireless-lan-service (PID 1121) started
      Nov 14 15:31:00  fw1 init: multicast-snooping (PID 1122) started
      Nov 14 15:31:00  fw1 init: license-service (PID 1123) started
      Nov 14 15:31:01  fw1 init: service-deployment (PID 1127) started
      Nov 14 15:31:01  fw1 init: ethernet-switching (PID 1128) started
      Nov 14 15:31:01  fw1 init: Starting of initial processes complete
      Nov 14 15:31:03  fw1 snmpd[1086]: SNMPD_TRAP_COLD_START: trap_generate_cold: SNMP trap: cold start
      Nov 14 15:31:24  fw1 init: can not access /usr/sbin/jdiameterd: No such file or directory
      Nov 14 15:31:24  fw1 init: diameter-service (PID 0) started
      Nov 14 15:31:24  fw1 init: can not access /usr/sbin/ipmid: No such file or directory
      Nov 14 15:31:24  fw1 init: ipmi (PID 0) started
      Nov 14 15:31:30  fw1 /kernel: RT_PFE: RT msg op 1 (PREFIX ADD) failed, err 5 (Invalid)
      Nov 14 15:31:30  fw1 /kernel: GENCFG: op 2 (Gencfg Blob) failed; err 5 (Invalid)
      Nov 14 15:31:30  fw1 /kernel: GENCFG: op 2 (Gencfg Blob) failed; err 7 (Doesn't Exist)
      Nov 14 15:31:30  fw1 last message repeated 3 times
      Nov 14 15:32:21  fw1 sshd[1150]: Accepted password for asdf from xxxx port 18870 ssh2
      Nov 14 15:38:55  fw1 checklogin[1212]: warning: can't get client address: Bad file descriptor
      Nov 14 15:38:55  fw1 checklogin[1212]: WEB_AUTH_SUCCESS: Authenticated httpd client (username asdf)
      Nov 14 16:00:59  fw1 sshd[1989]: Could not write ident string to UNKNOWN

       

       

       

     



  • 4.  RE: Help needed diagnosing SRX reboot

    Posted 11-15-2012 08:59

    Is the power plugged into the SRX securely, with a cable tie to hold it in place? AC cord pushed all the way into the power wart? Try wiggling each when you can risk an outage. Perhaps a failing power supply - do you have a spare or can you swap with a less critical 220?



  • 5.  RE: Help needed diagnosing SRX reboot
    Best Answer

    Posted 11-15-2012 09:01

    Odd, looks like it just disappeared.  Has anything like this happened before? As said above, is the power cable secure?  Was anyone working near your cab? There is a code upgrade available to the current version, JUNOS 11.4R5.5.  It may be worth considering.



  • 6.  RE: Help needed diagnosing SRX reboot

    Posted 11-15-2012 09:32

    Unless someone from the datacenter staff (Equinix) took it upon themselves to go poking around inside the cabinet, no one was near it.

     

    The power cord is secure, but I will certainly confirm the next time I go out there.  The last time anyone was even in the cab was months ago. 

     

    This is the first time this has happened.  Prior to this, the system uptime was almost a year, I think.

     

    I will do the upgrade the next time I go out there.

     

    Thanks to everyone for taking a look.



  • 7.  RE: Help needed diagnosing SRX reboot

    Posted 03-19-2013 20:13

    I just had the same thing happen to my SRX 240 last night.

     

    Running the latest OS: 

    JUNOS Software Release [12.1X44-D10.4]

     

    This device is in a datacentre, no power issues (I have other servers in the rack that were fine).

     

    Hopefully it is a once off!