Junos Space Developer

SPACE logging EPS stats stopped

‎08-10-2017 02:07 PM

Has anybody ever run into this? I have SPACE 14.1R3 with 1 LC (Log Collector) managing 6 SRX firewalls.

I'm having a problem with SPACE NMP logging stats. The EPS graph stopped on Aug 7 (3 days ago), and the live EPS and log counts all show "0". It should always be over 100 EPS. Other than that, I don't see anything wrong; I can still query and report on recent traffic logs.

I have already rebooted the LC node, then the SPACE node. No change.

8 REPLIES

Re: SPACE logging EPS stats stopped

‎08-11-2017 05:29 AM

Hi,

 

Please check the following on the Log Collector node:

df -kh

grep -i error /var/log/messages

status nwlogdecoder

 

-PL

If this worked for you please flag my post as an "Accepted Solution" so others can benefit. Kudos are always appreciated!

Re: SPACE logging EPS stats stopped

‎08-11-2017 06:58 AM

[root@NWAPPLIANCE4344 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             4.0G  448M  3.4G  12% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/mapper/VolGroup00-usr
                      2.0G  934M  980M  49% /usr
/dev/mapper/VolGroup00-var
                     1008M  178M  780M  19% /var
/dev/mapper/VolGroup00-log
                     1008M  640M  317M  67% /var/log
/dev/mapper/VolGroup00-tmp
                     1008M   34M  924M   4% /tmp
/dev/mapper/VolGroup00-nwhome
                       47G  476M   47G   1% /var/netwitness
/dev/mapper/VolGroup01-decoroot
                      493G  183G  285G  40% /var/netwitness/logdecoder
[root@NWAPPLIANCE4344 ~]#  status nwlogdecoder
nwlogdecoder start/running, process 1654

[root@NWAPPLIANCE4344 ~]# grep -i error /var/log/messages

<snip>

Aug  1 14:38:40 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug  1 14:38:40 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding description for error from Error to Errors
Aug  1 14:38:40 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding max values for error from 0 to 50000
Aug  1 14:38:40 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding flags for error from 0 to 1024
Aug  2 14:39:20 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug  2 14:39:20 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding description for error from Error to Errors
Aug  2 14:39:20 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding max values for error from 0 to 50000
Aug  2 14:39:20 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding flags for error from 0 to 1024
Aug  3 14:39:45 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug  3 14:39:45 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding description for error from Error to Errors
Aug  3 14:39:45 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding max values for error from 0 to 50000
Aug  3 14:39:45 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding flags for error from 0 to 1024
Aug  4 14:40:10 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug  4 14:40:10 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding description for error from Error to Errors
Aug  4 14:40:10 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding max values for error from 0 to 50000
Aug  4 14:40:10 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding flags for error from 0 to 1024
Aug  5 14:40:16 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug  5 14:40:16 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding description for error from Error to Errors
Aug  5 14:40:16 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding max values for error from 0 to 50000
Aug  5 14:40:16 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding flags for error from 0 to 1024
Aug  6 14:40:23 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug  6 14:40:23 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding description for error from Error to Errors
Aug  6 14:40:23 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding max values for error from 0 to 50000
Aug  6 14:40:23 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding flags for error from 0 to 1024
Aug  7 14:40:29 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug  7 14:40:29 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding description for error from Error to Errors
Aug  7 14:40:29 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding max values for error from 0 to 50000
Aug  7 14:40:29 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding flags for error from 0 to 1024
Aug  8 14:40:36 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug  8 14:40:36 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding description for error from Error to Errors
Aug  8 14:40:36 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding max values for error from 0 to 50000
Aug  8 14:40:36 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding flags for error from 0 to 1024
Aug  9 14:40:44 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug  9 14:40:44 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding description for error from Error to Errors
Aug  9 14:40:44 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding max values for error from 0 to 50000
Aug  9 14:40:44 NWAPPLIANCE4344 nw[27003]: [Index] [info] Overriding flags for error from 0 to 1024
Aug 10 14:22:42 NWAPPLIANCE4344 nw[1655]: [Engine] [failure] Failed to connect to SNMP AgentX Master due to error No such file or directory
Aug 10 14:22:43 NWAPPLIANCE4344 nw[1654]: [Network] [info] Log file found an error: Do you want to remove the FILE_NOT_CLOSED_FLAG from the header?. Fixing.
Aug 10 14:22:43 NWAPPLIANCE4344 nw[1654]: [Network] [info] Log file found an error: Do you want to remove the FILE_NOT_CLOSED_FLAG from the header?. Fixing.
Aug 10 14:22:43 NWAPPLIANCE4344 nw[1654]: [Engine] [failure] Failed to connect to SNMP AgentX Master due to error No such file or directory
Aug 10 14:22:44 NWAPPLIANCE4344 nw[1655]: [Appliance] [failure] Failed to read sensor /sys/class/hwmon/hwmon0/device/temp2_input due to error basic_filebuf::underflow error reading the file
Aug 10 14:22:44 NWAPPLIANCE4344 nw[1655]: [Appliance] [failure] Failed to read sensor /sys/class/hwmon/hwmon1/device/temp2_input due to error basic_filebuf::underflow error reading the file
Aug 10 14:22:47 NWAPPLIANCE4344 nw[1654]: [Database] [info] Database file /var/netwitness/logdecoder/sessiondb/session-000000069.nwsdb found an error: Do you want to remove the FILE_NOT_CLOSED_FLAG from the header?.  Fixing.
Aug 10 14:24:10 NWAPPLIANCE4344 nw[1654]: [Database] [info] Database file /var/netwitness/logdecoder/packetdb/packet-000000045.nwpdb found an error: Do you want to remove the FILE_NOT_CLOSED_FLAG from the header?.  Fixing.
Aug 10 14:25:37 NWAPPLIANCE4344 nw[1654]: [Database] [info] Database file /var/netwitness/logdecoder/metadb/meta-000000075.nwmdb found an error: Do you want to remove the FILE_NOT_CLOSED_FLAG from the header?.  Fixing.
Aug 10 14:54:18 NWAPPLIANCE4344 nw[1654]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug 10 14:54:18 NWAPPLIANCE4344 nw[1654]: [Index] [info] Overriding description for error from Error to Errors
Aug 10 14:54:18 NWAPPLIANCE4344 nw[1654]: [Index] [info] Overriding max values for error from 0 to 50000
Aug 10 14:54:18 NWAPPLIANCE4344 nw[1654]: [Index] [info] Overriding flags for error from 0 to 1024
[root@NWAPPLIANCE4344 ~]#
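
Most of the matches above are `[Index] [info]` lines, which hit only because the index key itself is named `error`. A minimal sketch for narrowing the output to entries the engine tags as `[failure]` or `[warning]`; the function name `nw_problems` is made up for illustration, and the tag format is assumed to be the one shown in this thread:

```shell
# Keep only lines whose severity tag is [failure] or [warning].
# Reads syslog-style text on stdin and writes matching lines to stdout.
nw_problems() {
    grep -E '\[(failure|warning)\]'
}

# Usage on the Log Collector (path as used in this thread):
#   nw_problems < /var/log/messages | tail -n 50
```

This drops the daily `Overriding ... for error` entries and leaves lines such as the SNMP AgentX and sensor failures.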


Re: SPACE logging EPS stats stopped

‎08-11-2017 07:06 AM

Try the following:

stop nwlogdecoder

rm -rf /var/netwitness/logdecoder/sessiondb/session-000000069.nwsdb

rm -rf /var/netwitness/logdecoder/packetdb/packet-000000045.nwpdb

rm -rf /var/netwitness/logdecoder/metadb/meta-000000075.nwmdb

start nwlogdecoder

Wait a few minutes, then check the SD GUI Event Viewer.
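
The three files in the steps above are the ones the earlier `/var/log/messages` output reported with `FILE_NOT_CLOSED_FLAG`. A small sketch for pulling that list out of the log instead of copying paths by hand; the helper name `nw_flagged_db_files` is hypothetical, and the pattern assumes the message format and the `/var/netwitness/` path layout shown in this thread:

```shell
# List each database file that the log reports with FILE_NOT_CLOSED_FLAG.
# Reads syslog-style text on stdin and prints one unique path per line.
nw_flagged_db_files() {
    grep 'FILE_NOT_CLOSED_FLAG' \
        | grep -oE '/var/netwitness/[^ ]+\.nw[a-z]db' \
        | sort -u
}

# Usage on the Log Collector:
#   nw_flagged_db_files < /var/log/messages
```

The list covers every flagged file in the whole log, including older entries, so review it before deleting anything.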

 

-PL

Re: SPACE logging EPS stats stopped

‎08-11-2017 08:57 AM

I just completed the steps, as shown below. However, nothing seems to have changed; the EPS graph is still the same.

 

#history

<snip>

 1024  stop nwlogdecoder
 1025
 1026  ls /var/netwitness/logdecoder/sessiondb/  -l
 1027  rm -rf /var/netwitness/logdecoder/sessiondb/session-000000069.nwsdb
 1028  ls -l /var/netwitness/logdecoder/packetdb/packet
 1029  ls -l /var/netwitness/logdecoder/packetdb/packet*
 1030  rm -rf /var/netwitness/logdecoder/packetdb/packet-000000045.nwpdb
 1031  ls -l /var/netwitness/logdecoder/metadb/*
 1032  rm -rf /var/netwitness/logdecoder/metadb/meta-000000075.nwmdb
 1033  #start nwlogdecoder
 1034  start nwlogdecoder


Re: SPACE logging EPS stats stopped

‎08-11-2017 09:04 AM

#the process restarted at

[root@NWAPPLIANCE4344 ~]# grep -i start  /var/log/messages

<snip>

Aug 11 15:19:14 NWAPPLIANCE4344 nw[1654]: [Engine] [info] Starting server shutdown
Aug 11 15:26:13 NWAPPLIANCE4344 nw[3942]: [Engine] [info] RSA Security Analytics Engine 10.3.3.2522-4 (May  1 2014) 64 bit Starting
Aug 11 15:26:13 NWAPPLIANCE4344 nw[3942]: [Thread] [info] Starting thread: Engine Stats  id: 3943
Aug 11 15:26:13 NWAPPLIANCE4344 nw[1655]: [Appliance] [info] logdecoder started on port 50002

 

#output after the process restart

[root@NWAPPLIANCE4344 ~]# grep -i error  /var/log/messages

<snip>

Aug 10 14:54:18 NWAPPLIANCE4344 nw[1654]: [Index] [info] Overriding flags for error from 0 to 1024
Aug 11 15:14:31 NWAPPLIANCE4344 nw[1654]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug 11 15:14:31 NWAPPLIANCE4344 nw[1654]: [Index] [info] Overriding description for error from Error to Errors
Aug 11 15:14:31 NWAPPLIANCE4344 nw[1654]: [Index] [info] Overriding max values for error from 0 to 50000
Aug 11 15:14:31 NWAPPLIANCE4344 nw[1654]: [Index] [info] Overriding flags for error from 0 to 1024
Aug 11 15:26:13 NWAPPLIANCE4344 nw[3942]: [Network] [info] Log file found an error: Do you want to remove the FILE_NOT_CLOSED_FLAG from the header?. Fixing.
Aug 11 15:26:13 NWAPPLIANCE4344 nw[3942]: [Network] [info] Log file found an error: Do you want to remove the FILE_NOT_CLOSED_FLAG from the header?. Fixing.
Aug 11 15:26:13 NWAPPLIANCE4344 nw[3942]: [Engine] [failure] Failed to connect to SNMP AgentX Master due to error No such file or directory
Aug 11 15:26:14 NWAPPLIANCE4344 nw[3942]: [Database] [info] Database file /var/netwitness/logdecoder/sessiondb/session-000000076.nwsdb found an error: Do you want to remove the FILE_NOT_CLOSED_FLAG from the header?.  Fixing.
Aug 11 15:26:15 NWAPPLIANCE4344 nw[3942]: [session] [warning] ERROR: Database session is missing objects from 505415914 to 512788584.  The gap exists between object store "/var/netwitness/logdecoder/sessiondb/session-000000068.nwsdb" and "/var/netwitness/logdecoder/sessiondb/session-000000070.nwsdb".
Aug 11 15:27:04 NWAPPLIANCE4344 nw[3942]: [Database] [info] Database file /var/netwitness/logdecoder/metadb/meta-000000079.nwmdb found an error: Do you want to remove the FILE_NOT_CLOSED_FLAG from the header?.  Fixing.
Aug 11 15:29:08 NWAPPLIANCE4344 nw[3942]: [Database] [info] Database file /var/netwitness/logdecoder/packetdb/packet-000000046.nwpdb found an error: Do you want to remove the FILE_NOT_CLOSED_FLAG from the header?.  Fixing.
Aug 11 15:37:23 NWAPPLIANCE4344 nw[3942]: [ObjectStore] [warning] There was an error reading the object at position 207906537195208702, truncating at last good position
Aug 11 15:37:30 NWAPPLIANCE4344 nw[3942]: [session] [warning] ERROR: Database session is missing objects from 505415914 to 512788584.  The gap exists between object store "/var/netwitness/logdecoder/sessiondb/session-000000068.nwsdb" and "/var/netwitness/logdecoder/sessiondb/session-000000070.nwsdb".
Aug 11 15:37:31 NWAPPLIANCE4344 nw[3942]: [Index] [info] Overriding level for error from IndexNone to IndexValues
Aug 11 15:37:31 NWAPPLIANCE4344 nw[3942]: [Index] [info] Overriding description for error from Error to Errors
Aug 11 15:37:31 NWAPPLIANCE4344 nw[3942]: [Index] [info] Overriding max values for error from 0 to 50000
Aug 11 15:37:31 NWAPPLIANCE4344 nw[3942]: [Index] [info] Overriding flags for error from 0 to 1024
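
The repeated `[session] [warning]` lines above describe one and the same gap, and it appears to line up with the `session-000000069.nwsdb` file removed in the earlier step, since it sits between the two object stores named in the message. A quick sketch for collapsing those warnings into one line per distinct range; the function name `nw_session_gaps` is invented for this example and the pattern assumes the exact wording shown above:

```shell
# Print each distinct "missing objects" range as "<first> <last>".
# Reads syslog-style text on stdin.
nw_session_gaps() {
    sed -n 's/.*is missing objects from \([0-9][0-9]*\) to \([0-9][0-9]*\)\..*/\1 \2/p' \
        | sort -u
}

# Usage on the Log Collector:
#   nw_session_gaps < /var/log/messages
```

For the output in this post it prints the single range `505415914 512788584`, which makes it easier to see whether new gaps appear after a later restart.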


Re: SPACE logging EPS stats stopped

‎08-11-2017 09:09 AM

No data after Aug 7; EPS count is 0.

Solution
Accepted by topic author fraserchen
‎09-11-2017 08:02 AM

Re: SPACE logging EPS stats stopped

‎08-11-2017 09:11 AM
It looks like the log DB is corrupted. Whatever packet, session, and index files you see flagged in the messages log file are corrupted.

Follow the same process as above for those files; otherwise you will have to clear all the DBs to resolve the issue.
-PL

Re: SPACE logging EPS stats stopped

‎09-11-2017 08:15 AM

FYI.

 

Recently I tested SD 17.1R1 with the LC on a VM with 8 GB of memory.

At the beginning it managed only a small SRX, and the report and LC node were good (green). Soon after I added a bigger firewall logging at about 150 EPS, the LC node slowed down, then became 'not reachable', and the logging statistics graph stopped. In the end, I found an error message on the VM console like "Out of memory: Kill process 1266 (java)...".

The solution was to add 8 GB more memory, after which the LC worked. Checking the deployment document again, I found I had missed that the 17.1R1 LC requires 16 GB of memory.
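
The memory requirement above can be checked before adding log sources. A rough sketch, assuming a Linux guest that exposes `/proc/meminfo`; the function names are made up for this example, and 16 is the 17.1R1 LC figure mentioned above. The total is rounded to the nearest GiB because the kernel reports slightly less than the memory assigned to the VM.

```shell
# Print MemTotal from a /proc/meminfo-style file, rounded to the nearest GiB.
mem_total_gib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1048576 + 0.5 }' "${1:-/proc/meminfo}"
}

# Compare installed memory with a required size in GiB.
# $1 = required GiB, $2 = optional meminfo file (defaults to /proc/meminfo).
# Returns non-zero when the node has less memory than required.
lc_mem_check() {
    required_gib=$1
    have_gib=$(mem_total_gib "${2:-/proc/meminfo}")
    if [ "$have_gib" -lt "$required_gib" ]; then
        echo "LOW: ${have_gib} GiB < ${required_gib} GiB required"
        return 1
    fi
    echo "OK: ${have_gib} GiB"
}

# Usage on the LC node:
#   lc_mem_check 16
#   dmesg | grep -i 'out of memory'   # look for OOM-killer messages
```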

 

 

CLI output:

[root@LOG-COLLECTOR log]# healthcheckOSLC
         --Pre Checks in Progress--

 Jingest network check OK
 Jingest process is active
ERROR: Elastic Service not running
ERROR: Check ES Cluster configuration
ERROR: No Index found for current hour