SYSLOG analysis - help desired

All questions related to installation, configuration and maintenance of Advanced Host Monitor (including additional tools such as RMA for Windows, RMA Manager, Web Service, RCC).
JuergenF
Posts: 331
Joined: Sun Jan 26, 2003 6:00 pm
Location: Germany, North Rhine-Westphalia

SYSLOG analysis - help desired

Post by JuergenF »

Dear all,

I need some ideas for the following.
I have a SYSLOG file on a Linux machine that collects a lot of messages from different devices, like this:

Code: Select all

Aug 30 15:19:10 192.168.167.190 2424: Aug 30 15:18:44: %PM-4-ERR_DISABLE: psecure-violation error detected on Fa0/3, putting Fa0/3 in err-disable state
Aug 30 15:19:13 192.168.167.190 2427: Aug 30 15:18:46: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to down
Aug 30 15:21:10 192.168.167.190 2428: Aug 30 15:20:33: %PM-4-ERR_RECOVER: Attempting to recover from psecure-violation err-disable state on Fa0/3
Aug 30 15:21:15 192.168.167.190 2429: Aug 30 15:20:37: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
Aug 30 15:24:34 192.168.167.190 2431: Aug 30 15:23:40: %PM-4-ERR_DISABLE: psecure-violation error detected on Fa0/3, putting Fa0/3 in err-disable state
Aug 30 15:24:38 192.168.167.190 2434: Aug 30 15:23:42: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to down
Aug 30 15:26:30 192.168.167.190 2435: Aug 30 15:25:24: %PM-4-ERR_RECOVER: Attempting to recover from psecure-violation err-disable state on Fa0/3
Aug 31 09:22:25 195.232.212.11 1300: 	destaddr=194.175.52.150, prot=50, spi=0x30D78D20(819432736), srcaddr=82.154.119.53
Aug 31 07:02:48 dcdw0015.wetter.dematic.de 13727: Aug 31 07:02:47.901: %LINK-3-UPDOWN: Interface FastEthernet4/3, changed state to down
Aug 30 15:26:34 192.168.167.190 2436: Aug 30 15:25:28: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
Aug 30 15:27:18 192.168.167.190 2438: Aug 30 15:26:09: %PM-4-ERR_DISABLE: psecure-violation error detected on Fa0/3, putting Fa0/3 in err-disable state
Aug 30 15:27:21 192.168.167.190 2441: Aug 30 15:26:11: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to down
Aug 30 15:29:26 192.168.167.190 2442: Aug 30 15:28:05: %PM-4-ERR_RECOVER: Attempting to recover from psecure-violation err-disable state on Fa0/3
Aug 30 15:29:30 192.168.167.190 2443: Aug 30 15:28:09: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
Aug 30 15:29:44 192.168.167.190 2445: Aug 30 15:28:22: %PM-4-ERR_DISABLE: psecure-violation error detected on Fa0/3, putting Fa0/3 in err-disable state
Aug 30 15:29:47 192.168.167.190 2448: Aug 30 15:28:24: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to down
Aug 30 15:31:56 192.168.167.190 2449: Aug 30 15:30:21: %PM-4-ERR_RECOVER: Attempting to recover from psecure-violation err-disable state on Fa0/3
Aug 30 15:32:00 192.168.167.190 2450: Aug 30 15:30:25: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
Aug 30 15:32:01 192.168.167.190 2452: Aug 30 15:30:27: %PM-4-ERR_DISABLE: psecure-violation error detected on Fa0/3, putting Fa0/3 in err-disable state
Aug 30 15:32:03 192.168.167.190 2455: Aug 30 15:30:29: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to down
Aug 30 15:33:57 192.168.167.190 2456: Aug 30 15:32:12: %PM-4-ERR_RECOVER: Attempting to recover from psecure-violation err-disable state on Fa0/3
Aug 30 15:34:01 192.168.167.190 2457: Aug 30 15:32:16: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
Aug 30 15:38:46 192.168.167.190 2459: Aug 30 15:36:43: %PM-4-ERR_DISABLE: psecure-violation error detected on Fa0/3, putting Fa0/3 in err-disable state
Aug 30 15:38:49 192.168.167.190 2461: Aug 30 15:36:45: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to down
Aug 30 15:40:47 192.168.167.190 2462: Aug 30 15:38:43: %PM-4-ERR_RECOVER: Attempting to recover from psecure-violation err-disable state on Fa0/3
Aug 30 15:40:51 192.168.167.190 2463: Aug 30 15:38:47: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
Aug 30 15:41:44 192.168.167.190 2464: Aug 30 15:39:41: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to down
Aug 30 15:42:00 192.168.167.190 2465: Aug 30 15:39:56: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
Aug 30 15:42:09 192.168.167.190 2466: Aug 30 15:40:05: %PM-4-ERR_DISABLE: psecure-violation error detected on Fa0/3, putting Fa0/3 in err-disable state
Aug 30 15:42:11 192.168.167.190 2468: Aug 30 15:40:07: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to down
Aug 30 15:43:59 192.168.167.190 2469: Aug 30 15:41:55: %PM-4-ERR_RECOVER: Attempting to recover from psecure-violation err-disable state on Fa0/3
Aug 30 15:44:03 192.168.167.190 2470: Aug 30 15:41:59: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
Aug 30 15:44:11 192.168.167.190 2471: Aug 30 15:42:08: %PM-4-ERR_DISABLE: psecure-violation error detected on Fa0/3, putting Fa0/3 in err-disable state
Aug 30 15:44:14 192.168.167.190 2473: Aug 30 15:42:10: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to down
Aug 30 15:45:58 192.168.167.190 2474: Aug 30 15:43:54: %PM-4-ERR_RECOVER: Attempting to recover from psecure-violation err-disable state on Fa0/3
Aug 30 15:46:02 192.168.167.190 2475: Aug 30 15:43:58: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
I'd like to focus on messages of this kind:

Code: Select all

Aug 30 15:19:10 192.168.167.190 2424: Aug 30 15:18:44: %PM-4-ERR_DISABLE: psecure-violation error detected on Fa0/3, putting Fa0/3 in err-disable state
Aug 30 15:46:02 192.168.167.190 2475: Aug 30 15:43:58: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
HM should raise an alert when there is no "interface up" message after a psecure-violation has occurred.

In other words for this example:
- If there is only a message for Switch 192.168.167.190 Port Fa0/3 that the port is "in err-disable state", then bad condition.
- If there is a message "Interface FastEthernet0/3, changed state to up" too, then all is OK (interface has recovered)

Keep in mind:
- There are multiple Switches and Interfaces.

Is it better to have a HM Agent on that Linux machine, or can HM do that from the W2K3 Server where it is running?

Any hints or ideas are very welcome

Regards

Juergen
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Re: SYSLOG analysis - help desired

Post by KS-Soft Europe »

JuergenF wrote:In other words for this example:
- If there is only a message for Switch 192.168.167.190 Port Fa0/3 that the port is "in err-disable state", then bad condition.
- If there is a message "Interface FastEthernet0/3, changed state to up" too, then all is OK (interface has recovered)
Well, I think we resolved a similar request in the following topic: http://www.ks-soft.net/cgi-bin/phpBB/vi ... 39&start=0
The only difference is that your log is on a Linux system.
JuergenF wrote:Is it better to have a HM Agent on that Linux or can HM do that from the W2K3 Server where it is running.
I think the best solution is to write a .sh script that does the same as the .bat script from that topic, and to run it with the "Shell Script" test method performed by RMA on Linux. On the other hand, if the log file is accessible (using Samba or whatever) from the machine where HM is running, you may use the script or the texteventscheck.exe utility posted in that topic.
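
For readers following along, here is a minimal sketch of what such a .sh script could look like. It assumes the log lines have the format shown in the first post and that the result is reported as a single ScriptRes:Status:Reply line, the same output format used by the awk script posted further down in this thread. The names check_err_disable and short_host are made up for this illustration; they are not part of HostMonitor or RMA.

```shell
#!/bin/sh
# Sketch only. check_err_disable is a hypothetical name, not a HostMonitor/RMA command.
# Usage: check_err_disable /path/to/syslog-file
check_err_disable() {
    awk '
    # Shorten an FQDN to its first label; leave IPv4 addresses untouched.
    function short_host(h) {
        if (h ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) return h
        sub(/\..*/, "", h)
        return h
    }
    # "... putting Fa0/3 in err-disable state": the port follows the word "putting".
    /%PM-4-ERR_DISABLE:/ {
        for (i = 1; i < NF; i++)
            if ($i == "putting") { state[short_host($4) "_" $(i + 1)] = "Bad"; break }
    }
    # "... Interface FastEthernet0/3, changed state to up": the port follows "Interface".
    /%LINK-3-UPDOWN:/ && /changed state to up/ {
        for (i = 1; i < NF; i++)
            if ($i == "Interface") {
                port = $(i + 1)
                sub(/,$/, "", port)
                sub(/^FastEthernet/, "Fa", port)
                sub(/^GigabitEthernet/, "Gi", port)
                state[short_host($4) "_" port] = "Ok"
                break
            }
    }
    END {
        status = "Ok"; reply = ""
        for (k in state)
            if (state[k] == "Bad") {
                status = "Bad"
                reply = (reply == "" ? k : reply " " k)
            }
        print "ScriptRes:" status ":" reply
    }' "$1"
}
```

Called as check_err_disable /var/log/warn, it keeps one state per switch and port, and the last matching message wins. A port is therefore reported only if its most recent ERR_DISABLE message has not been followed by a "changed state to up" message. The interface is located by keyword rather than by field position, which is a design choice of this sketch so that lines with a slightly different prefix do not shift the fields.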

Regards,
Max
JuergenF
Posts: 331
Joined: Sun Jan 26, 2003 6:00 pm
Location: Germany, North Rhine-Westphalia

Post by JuergenF »

Following Max's hint, I wrote a shell script. Here it is; it seems to work for my problem when I start it directly on the Linux system while logged in.
But I need help getting it to run within HostMonitor.
I have installed the agent on the Linux machine and I can connect to the agent.
I opened Script Manager and copied a script. I named the new script "SYSLOG: ERR_Disable Interfaces".
I inserted the script as posted here.
I need some help with testing.
What do I have to specify in "Lets try"?
- I chose the agent
- Set /var/log/warn as Params and clicked test.
[13:00:36] Agent: wersv090 is going to execute "SYSLOG: ERR_Disable Interfaces" script ...
[13:00:37] Script started, no results received

Code: Select all

#! /bin/awk -f
# Script returns Bad Status when there is no corresponding "Interface UP" message for each "err-disable state" message
# Script examines a syslog file with messages from Cisco switches. Lines look like:
# Aug 30 15:44:11 192.168.167.190 2471: Aug 30 15:42:08: %PM-4-ERR_DISABLE: psecure-violation error detected on Fa0/3, putting Fa0/3 in err-disable state
# Aug 30 15:46:02 192.168.167.190 2475: Aug 30 15:43:58: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
#
# Aug 31 15:44:11 dcdw0015.wetter.dematic.de 3471: Aug 31 15:42:08.354: %PM-4-ERR_DISABLE: psecure-violation error detected on Gi0/15, putting Gi0/15 in err-disable state
# Aug 31 15:46:02 dcdw0015.wetter.dematic.de 3475: Aug 31 15:43:58.543: %LINK-3-UPDOWN: Interface GigabitEthernet0/15, changed state to up
#
# Possible Output (assumed that the "state to up" lines are missing):
# ScriptRes:Bad:192.168.167.190_Fa0/3 dcdw0015_Gi0/15
#
# ./HostMon.awk /var/log/warn
#
BEGIN { # BEGIN rule is executed once only, before the first input record is read
       # nothing to do
      }
      { # This is done for each Input-Line
        # $0 = represents the whole input record, $1 = 1st Parameter, $2 =2nd Parameter ...
        if (length ($0) > 0)
          {
           if ( (match($0, "192.168.16")) || (match($0, "dcdw")) )             # Messages from my Switches
             {
              if ( (match($0, "state to up")) || (match($0, "ERR_DISABLE")) )  # Interesting Interface changes
                {
                 gsub("  ", " ", $0)                           # eliminate double-spaces
                 if ($9 == "%PM-4-ERR_DISABLE:")               # error disable message
                   {
                    $14 = substr ($14,1,length ($14) -1)       # Extract Interface, get rid of ","
                    if (index ($4, ".") > 4)                   # If Fully Qualified Domain Name (dcdw0015.wetter.dematic.de)
                       $4 = substr ($4,1,index ($4, ".") -1)   # Shorten to hostname (dcdw0015)
                    ind = ($4 "_" $14)                         # Create unique Index (dcdw0015_Gi0/15)
                    dev_arr[ind] = "Bad"                       # Set current Interface state "Bad"
#                   print ind, dev_arr[ind]                    # For debugging: Display Values
                   }
                 if ($9 == "%LINK-3-UPDOWN:")                  # This is a "Link state changed to up" message
                   {
                    gsub("FastEthernet", "Fa", $11)            # Shorten Interface name to Fa0/3
                    gsub("GigabitEthernet", "Gi", $11)         # ... or to Gi0/15 
                    $11 = substr ($11,1,length ($11) -1)       # Extract Interface, get rid of ","
                    if (index ($4, ".") > 4)                   # If Fully Qualified Domain Name (dcdw0015.wetter.dematic.de)
                       $4 = substr ($4,1,index ($4, ".") -1)   # Shorten to hostname (dcdw0015)
                    ind = ($4 "_" $11)                         # Create unique Index (dcdw0015_Gi0/15)
                    dev_arr[ind] = "Ok"                        # Set current Interface state "Ok"
#                   print ind, dev_arr[ind]                    # For debugging: Display Values
                   }
                }
             }
          }
      }
END   { # an END rule is executed once only, after all the input is read
        StatusString = "Ok"                     # Prepare for ScriptRes
        Reply = ""                              # Prepare for ScriptRes
        for (ind in dev_arr)                    # For all Elements of Array dev_arr
          if (dev_arr[ind] == "Bad")            # If Interface is ERR_DISABLE
            {
             StatusString = "Bad"               # Script has to report Bad Status
             Reply = (Reply ind " ")            # Add Bad Interface to Reply-String
#             print ind, dev_arr[ind]           # For debugging: Display Values
            }
                                # ScriptRes:Bad:192.168.167.190_Fa0/3 dcdw0015_Gi0/15
                                # ScriptRes:Ok:
        print "ScriptRes:" StatusString ":" Reply 
      }
Last edited by JuergenF on Sun Sep 02, 2007 2:18 pm, edited 1 time in total.
JuergenF
Posts: 331
Joined: Sun Jan 26, 2003 6:00 pm
Location: Germany, North Rhine-Westphalia

Post by JuergenF »

It's working now, sorry for that.
I don't know exactly what the problem was, but now it's OK.

Many thanks

Juergen
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

JuergenF wrote:It's working now, sorry for that.
I don't know exactly what the problem was, but now it's OK.
Glad it works. Great job! I think you should post the script in the "Library" branch: http://www.ks-soft.net/cgi-bin/phpBB/viewforum.php?f=7

Regards,
Max
gusdude
Posts: 1
Joined: Mon Sep 24, 2007 8:00 am

Alternate Method for Syslog Reporting

Post by gusdude »

Even though you already have a solution, here is a method we use that can be quite flexible.

We have the Kiwi Syslog Daemon (Windows) running on a server and we push all syslogs to this server. We use its MySQL advanced log to save all syslogs to MySQL. If we need a new check in HostMonitor, we use a MySQL query to produce the result. This may not work for what you need, but it works well as a cheap and dirty threat analyzer.

An example query would be:
SELECT COUNT(*) FROM syslogd
WHERE DATE_SUB(NOW(), INTERVAL 10 MINUTE) <= DATE_ADD(syslogd.msgdate, INTERVAL syslogd.msgtime HOUR_SECOND)
  AND msgtext LIKE '%inside%' AND msgtext NOT LIKE '%Access%'

This counts the matching syslog messages received in the last 10 minutes. If the query reports back more than X events, we send an alert.
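
If such a count were fed into a shell-script test rather than evaluated by the query check itself, the threshold step could look roughly like this. It is only a sketch: it assumes the count has already been obtained from the query above by some other means, and it borrows the ScriptRes:Status:Reply output format from the awk script earlier in this thread. threshold_status is a hypothetical helper name, not something provided by HostMonitor or Kiwi.

```shell
#!/bin/sh
# Sketch only. threshold_status is a hypothetical helper name.
# $1 = number of events returned by the query
# $2 = maximum tolerated number of events (the "X" in the post above)
threshold_status() {
    count=$1
    limit=$2
    # Anything that is not a non-negative integer is reported as Bad.
    case $count in
        ''|*[!0-9]*)
            echo "ScriptRes:Bad:invalid count"
            return 0
            ;;
    esac
    if [ "$count" -gt "$limit" ]; then
        echo "ScriptRes:Bad:$count"
    else
        echo "ScriptRes:Ok:$count"
    fi
}
```

For example, threshold_status 12 5 reports Bad, while threshold_status 5 5 reports Ok, because only counts strictly above the limit raise the alert.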

The drawback is that it relies on SQL, which you may not have experience with or support for.