test does not log (primary and backup)

jbarrellon
Posts: 23
Joined: Wed Oct 08, 2008 2:13 am

Post by jbarrellon »

Hello KS Soft team,

We have a problem with some tests that do not seem to be logged in either the primary log (database) or the backup log (file).

Example :

; ------- Test #01 -------

Method = ShellScript
;--- Common properties ---
;DestFolder = Root\SURVEILLANCE\BERNER-St_Julien_du_Sault\ORACLE\LMPROD\
RMAgent = BERNER-ST_JULIEN_DU_SAULT.FRNT05
Title = LMPROD temps de reponse
Comment = 236
RelatedURL =
ScheduleMode= OneTestPerDay
ScheduleTime= 08:01:00
Alerts = Recheck
ReverseAlert= No
UnknownIsBad= No
WarningIsBad= No
UseCommonLog= Yes
PrivLogMode = Default
CommLogMode = Default
;--- Test specific properties ---
Script = Scripts Oracle||Windows
Params = "C:\asis\RMA-Win\Scripts\temps_reponse.pl" "LMPROD" "manager"
Timeout = 15
UseMacros = No


This does not happen every day: on some days it works (the test is logged correctly in the database), on other days it does not (no entry in the primary log or the backup log). HM is running at that moment (other tests are logged in the same second), and we don't see any error in the HTML system log file.
When we look at the Test Info window for this test, the number of total checks is the same as the number of occurrences of this test in our database (select count(*) where testid='XXX'), so it really seems that this test (and some other tests in the same situation) is not executed on some days. Is this possible?
This is pretty annoying for us, since we use this data to produce reports and statistics.

HM v9.40 running on Windows Server 2008 R2.

Please let me know if you need additional information.

Thank you for your help
Julien
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

jbarrellon wrote: When we look at the Test Info window for this test, the number of total checks is the same as the number of occurrences of this test in our database (select count(*) where testid='XXX')
This doesn't look like a logging problem. Number of checks == number of records in the DB.
I assume you are using FULL mode for Primary & Backup logging?
jbarrellon wrote: so it really seems that this test (and some other tests in the same situation) is not executed on some days. Is this possible?
It's possible if the test has been disabled or paused, or if all monitoring has been stopped or paused, etc.
Could you try setting up a FULL mode Private log for this test?
jbarrellon
Posts: 23
Joined: Wed Oct 08, 2008 2:13 am

Post by jbarrellon »

Yes, we're using FULL mode for primary and backup logging.

The test has not been disabled, paused or anything else (otherwise we would see this information in the Quick Log window). As I said, monitoring was running correctly, since other tests are logged in the database at the same time (before, after and also in the exact same second).

I have setup FULL private log for this test (in addition to common log). I'll let you know if the problem appears again and if an entry is made in this new logfile.
KS-Soft
Posts: 12869
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Maybe the test returns a Reply string that is too long and the database returns an error.
Please check Auditing Tools (menu View -> Auditing Tool) and the system log file (HostMonitor log file, default name syslog.htm) for errors.

Regards
Alex
jbarrellon
Posts: 23
Joined: Wed Oct 08, 2008 2:13 am

Post by jbarrellon »

I don't think the test returns a reply string that is too long, since it is configured to return just "XXX ms". Also, we can't see any error in the database logs.

As already said, there is no error in the HostMonitor system log file syslog.htm.

There is no error in Auditing Tools either. However, we can see that HM is running 4.1 tests/s. Would it be possible that some tests are not performed because HM is overloaded? (Even though it says "Conclusion: system is able to perform given tests without significant load".)

Anyway, the problem did not appear this morning. I'll let you know the next time it appears, and I'll check:
- whether there is a corresponding entry in the private log file;
- whether the Last test time field in the Test Info window is correct.
KS-Soft
Posts: 12869
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

jbarrellon wrote: There is no error in Auditing Tools either. However, we can see that HM is running 4.1 tests/s. Would it be possible that some tests are not performed because HM is overloaded? (Even though it says "Conclusion: system is able to perform given tests without significant load".)
4 tests per second? Overloaded? No way.
You can easily check whether the test was performed: just look at the Last test time field.
You can also set up a private log file for this test. If you see all records in the file log, some records missing from the database, and no errors in the system log, then we can assume the ODBC driver or the database is not working correctly. Which ODBC driver exactly do you use?

Regards
Alex
jbarrellon
Posts: 23
Joined: Wed Oct 08, 2008 2:13 am

Post by jbarrellon »

We use the "Oracle dans OraClient11g_home1_32bit" driver, version 11.02.00.03.

The problem appeared again. Here is the private log file:
[08/23/2013 8:02:47] GENTRAN temps de reponse Ok 625 ms Shell Script 64573
[08/24/2013 8:02:58] GENTRAN temps de reponse Ok 391 ms Shell Script 64573
[08/27/2013 8:01:11] GENTRAN temps de reponse Ok 375 ms Shell Script 64573

As you can see, the tests for 08/25/2013 and 08/26/2013 are missing.
They are scheduled to run every day.
We see the same records in our database.
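
A check like the one above can be automated. The following is only a minimal sketch: it assumes every private log record starts with a timestamp in the form `[MM/DD/YYYY H:MM:SS]`, as in the three lines quoted above, and the function names are hypothetical, not part of HostMonitor.

```python
import re
from datetime import datetime, timedelta

# Private log lines copied from the post above.
LOG_LINES = [
    "[08/23/2013 8:02:47] GENTRAN temps de reponse Ok 625 ms Shell Script 64573",
    "[08/24/2013 8:02:58] GENTRAN temps de reponse Ok 391 ms Shell Script 64573",
    "[08/27/2013 8:01:11] GENTRAN temps de reponse Ok 375 ms Shell Script 64573",
]

# Assumption: each record starts with "[MM/DD/YYYY H:MM:SS]".
STAMP = re.compile(r"^\[(\d{2}/\d{2}/\d{4}) \d{1,2}:\d{2}:\d{2}\]")


def logged_days(lines):
    """Collect the calendar days that have at least one log record."""
    days = set()
    for line in lines:
        match = STAMP.match(line)
        if match:
            days.add(datetime.strptime(match.group(1), "%m/%d/%Y").date())
    return days


def missing_days(lines):
    """Return the days between the first and last record that have no record."""
    days = logged_days(lines)
    if not days:
        return []
    first, last = min(days), max(days)
    span = (last - first).days
    candidates = (first + timedelta(days=offset) for offset in range(span + 1))
    return [day for day in candidates if day not in days]


for day in missing_days(LOG_LINES):
    print(day.isoformat())
```

For the three lines above this reports 2013-08-25 and 2013-08-26, the two days mentioned in the post.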
KS-Soft
Posts: 12869
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Could you please send your configuration files to support@ks-soft.net?
We need the HML file with the tests, plus the *.LST and *.INI files (you may skip connlist.lst, which contains passwords).

Regards
Alex
KS-Soft
Posts: 12869
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

You have scheduled 262 test items to be performed at 08:01,
and you have set a limit of 10 tests per second.
There are also some limitations on simultaneous test execution by agents...
If some scripts take a long time, HostMonitor may not be able to perform all tests within the specified time frame and may not start some of them. This may be the cause of the problem.
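
As a rough back-of-the-envelope illustration of these numbers (a sketch only, not HostMonitor's actual scheduling logic): 262 items at 10 tests per second need at least 27 seconds just to be started, before any script run time or agent limit is counted. The agent figures in the second function call are purely hypothetical and only show how such limits add up.

```python
import math

TESTS_AT_0801 = 262    # items scheduled at 08:01 (from the post)
TESTS_PER_SECOND = 10  # configured limit (from the post)


def min_seconds_to_start(tests, per_second):
    """Lower bound on the time needed just to start every test."""
    return math.ceil(tests / per_second)


def agent_batch_seconds(scripts, concurrent, seconds_each):
    """Time for one agent to run `scripts` scripts, `concurrent` at a time.

    Simplifying assumption: every script takes exactly `seconds_each` seconds.
    """
    return math.ceil(scripts / concurrent) * seconds_each


print(min_seconds_to_start(TESTS_AT_0801, TESTS_PER_SECOND))

# Hypothetical agent: 40 scripts, 4 at a time, 15 seconds each.
print(agent_batch_seconds(scripts=40, concurrent=4, seconds_each=15))
```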

Try to schedule some items at a different time.
If you need to run a test once a day but the execution time does not have to fall within a window of a few minutes, there is a better solution: use a regular schedule for the test items. Just set a long test interval (e.g. 2 hours) and assign a schedule with a shorter window (e.g. from 8:00 till 9:00). In this case HostMonitor will perform each test once a day, within the hour between 8:00 and 9:00...
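
For the first suggestion (different times for some items), the start times could be generated rather than picked by hand. This is a minimal sketch under assumptions: the helper name `staggered_times` is hypothetical, the 8:00 to 9:00 window is just the example window mentioned above, and the output reuses the `ScheduleTime` parameter shown in the test definition earlier in the thread.

```python
from datetime import datetime


def staggered_times(count, start="08:00:00", end="09:00:00"):
    """Return `count` HH:MM:SS strings spread evenly over [start, end)."""
    if count <= 0:
        return []
    fmt = "%H:%M:%S"
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    step = (t1 - t0) / count  # timedelta between consecutive start times
    return [(t0 + i * step).strftime(fmt) for i in range(count)]


# Example: four items spread over the hour.
for t in staggered_times(4):
    print(f"ScheduleTime= {t}")
```

With 262 items in a one-hour window the step is a little under 14 seconds, so the start times no longer pile up at 08:01.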

Regards
Alex
jbarrellon
Posts: 23
Joined: Wed Oct 08, 2008 2:13 am

Post by jbarrellon »

We have followed your advice. So far the problem has not appeared again.

Thank you for your help.
KS-Soft
Posts: 12869
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

You are welcome :)

Regards
Alex