correlate entries in (log-)files. Find pairs of messages

Post by JuergenF »

Dear all,

Max encouraged me to post my Script here.
I've received so much support and so many ideas from this forum that maybe I can give a bit back.

The situation:
I have a SYSLOG file on a Linux machine that picks up a lot of messages from different devices.
HostMonitor (HM) should raise an alert when no "interface up" message follows a "port-secure-violation" (after which the switch shuts the interface down).

In other words:
- If there is only a message that switch port Fa0/3 is "in err-disable state", raise the "Bad" condition.
- If there is also a message "Interface FastEthernet0/3, changed state to up" for that port, all is OK (the interface has recovered).
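
The rule above boils down to "the last message per switch and port wins". Here is a minimal sketch of just that idea, assuming a simplified, made-up input of three columns (host, event, port) instead of real syslog lines. The function name `unrecovered` and the `down`/`up` keywords are hypothetical and only serve the illustration.

```shell
# Sketch only: list every host_port whose last event was "down".
# Input format (hypothetical): <host> <down|up> <port>
unrecovered() {
    awk '
        $2 == "down" { state[$1 "_" $3] = "Bad" }   # violation seen, port disabled
        $2 == "up"   { state[$1 "_" $3] = "Ok"  }   # port has recovered
        END {
            for (key in state)                      # one entry per host and port
                if (state[key] == "Bad")
                    print key
        }
    ' "$@" | sort                                   # sort for a stable order
}

# sw1 recovers, sw2 does not:
printf '%s\n' 'sw1 down Fa0/3' 'sw1 up Fa0/3' 'sw2 down Gi0/15' | unrecovered
# prints: sw2_Gi0/15
```

Because the array key combines host and port, the same port number on two different switches is tracked separately.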

Please note: There are multiple switches and interface ports.
Here is my Script:

#! /bin/awk -f
# Script returns Bad status when there is no corresponding "Interface up" message for each "err-disable state" message
# Script examines a syslog file with messages from Cisco switches. Lines look like
# Aug 30 15:44:11 2471: Aug 30 15:42:08: %PM-4-ERR_DISABLE: psecure-violation error detected on Fa0/3, putting Fa0/3 in err-disable state
# Aug 30 15:46:02 2475: Aug 30 15:43:58: %LINK-3-UPDOWN: Interface FastEthernet0/3, changed state to up
# Aug 31 15:44:11 3471: Aug 31 15:42:08.354: %PM-4-ERR_DISABLE: psecure-violation error detected on Gi0/15, putting Gi0/15 in err-disable state
# Aug 31 15:46:02 3475: Aug 31 15:43:58.543: %LINK-3-UPDOWN: Interface GigabitEthernet0/15, changed state to up
# Note: the field numbers used below ($4 = host, $9 = message type) imply that the real lines
# carry the sending host (name or IP address) as 4th field, which the samples above do not show.
# Possible output (assuming that the "state to up" lines are missing):
# ScriptRes:Bad:dcdw0015_Gi0/15
# Usage: ./HostMon.awk /var/log/warn
BEGIN { # BEGIN rule is executed once only, before the first input record is read
        # nothing to do
}
      { # This is done for each input line
        # $0 = represents the whole input record, $1 = 1st field, $2 = 2nd field ...
        # index() compares literally; in a regex the dots of the IP prefix would match any character
        if (length ($0) > 0)
           if ( (index($0, "192.168.16") > 0) || (index($0, "dcdw") > 0) )     # Messages from my switches
              if ( (match($0, "state to up")) || (match($0, "ERR_DISABLE")) ) { # Interesting interface changes
                 gsub("  ", " ", $0)                           # eliminate double-spaces
                 if ($9 == "%PM-4-ERR_DISABLE:") {             # error disable message
                    $14 = substr ($14,1,length ($14) -1)       # Extract interface, get rid of ","
                    if (index ($4, ".") > 4)                   # If fully qualified domain name (first dot after position 4, so IP addresses stay intact)
                       $4 = substr ($4,1,index ($4, ".") -1)   # Shorten to hostname (dcdw0015)
                    ind = ($4 "_" $14)                         # Create unique index (dcdw0015_Gi0/15)
                    dev_arr[ind] = "Bad"                       # Set current interface state "Bad"
#                   print ind, dev_arr[ind]                    # For debugging: display values
                 }
                 if ($9 == "%LINK-3-UPDOWN:") {                # This is a "Link state changed to up" message
                    gsub("FastEthernet", "Fa", $11)            # Shorten interface name to Fa0/3
                    gsub("GigabitEthernet", "Gi", $11)         # ... or to Gi0/15
                    $11 = substr ($11,1,length ($11) -1)       # Extract interface, get rid of ","
                    if (index ($4, ".") > 4)                   # If fully qualified domain name (first dot after position 4, so IP addresses stay intact)
                       $4 = substr ($4,1,index ($4, ".") -1)   # Shorten to hostname (dcdw0015)
                    ind = ($4 "_" $11)                         # Create unique index (dcdw0015_Gi0/15)
                    dev_arr[ind] = "Ok"                        # Set current interface state "Ok"
#                   print ind, dev_arr[ind]                    # For debugging: display values
                 }
              }
      }
END   { # an END rule is executed once only, after all the input is read
        StatusString = "Ok"                     # Prepare for ScriptRes
        Reply = ""                              # Prepare for ScriptRes
        for (ind in dev_arr)                    # For all elements of array dev_arr
          if (dev_arr[ind] == "Bad") {          # If interface is ERR_DISABLE
             StatusString = "Bad"               # Script has to report Bad status
             Reply = (Reply ind " ")            # Add bad interface to reply string
#             print ind, dev_arr[ind]           # For debugging: display values
          }
                                # ScriptRes:Bad:dcdw0015_Gi0/15
                                # ScriptRes:Ok:
        print "ScriptRes:" StatusString ":" Reply
}

In addition, I use the "Use warning status" option in HostMonitor for the first 3 Bad recurrences to reduce the number of false positives.

Feel free to use or modify the Script to fit your requirements.
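
One point to watch when modifying it for other port types: the long name "TenGigabitEthernet" contains "GigabitEthernet", so an unanchored replacement would turn it into "TenGi...". Below is a minimal sketch of an anchored variant. The helper name `short_if` is hypothetical, and the abbreviation "Te" for TenGigabitEthernet is an assumption about how such ports appear in the err-disable messages, so please check it against your own logs.

```shell
# Sketch only: shorten a long Cisco interface name and drop a trailing ",".
# "Te" as the short form of TenGigabitEthernet is an assumption.
short_if() {
    printf '%s\n' "$1" | awk '{
        sub(/^TenGigabitEthernet/, "Te")   # ^ anchors the match at the start of the name,
        sub(/^GigabitEthernet/, "Gi")      # so this rule cannot hit inside TenGigabitEthernet
        sub(/^FastEthernet/, "Fa")
        sub(/,$/, "")                      # get rid of the trailing ","
        print
    }'
}

short_if 'GigabitEthernet0/15,'      # prints: Gi0/15
short_if 'TenGigabitEthernet1/0/1,'  # prints: Te1/0/1
```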

Best regards

