test does not log (primary and backup)
-
- Posts: 23
- Joined: Wed Oct 08, 2008 2:13 am
Hello KS Soft team,
We have a problem with some tests that seem to be logged neither in the primary log (database) nor in the backup log (file).
Example :
; ------- Test #01 -------
Method = ShellScript
;--- Common properties ---
;DestFolder = Root\SURVEILLANCE\BERNER-St_Julien_du_Sault\ORACLE\LMPROD\
RMAgent = BERNER-ST_JULIEN_DU_SAULT.FRNT05
Title = LMPROD temps de reponse
Comment = 236
RelatedURL =
ScheduleMode= OneTestPerDay
ScheduleTime= 08:01:00
Alerts = Recheck
ReverseAlert= No
UnknownIsBad= No
WarningIsBad= No
UseCommonLog= Yes
PrivLogMode = Default
CommLogMode = Default
;--- Test specific properties ---
Script = Scripts Oracle||Windows
Params = "C:\asis\RMA-Win\Scripts\temps_reponse.pl" "LMPROD" "manager"
Timeout = 15
UseMacros = No
This does not happen every day: some days it works (the test is logged correctly in the database), other days it does not (no entry in the primary log or the backup log). HM is running at those times (other tests are logged in the same second), and we don't see any error in the HTML system log file.
When we look at the Test Info window for this test, the number of total checks is the same as the number of occurrences of this test in our database (select count(*) where testid='XXX'), so it really seems that this test (and some other tests in the same situation) is not executed on some days. Is this possible?
This is pretty annoying for us since we use those data to make reports and statistics.
HM v9.40 running on Windows Server 2008 R2.
Please let me know if you need additional information.
Thank you for your help
Julien
-
- Posts: 2832
- Joined: Tue May 16, 2006 4:41 am
- Contact:
"When we look at the Test Info window for this test, the number of total checks is the same as the number of occurrences of this test in our database (select count(*) where testid='XXX')"
This doesn't look like a logging problem: the number of checks equals the number of records in the DB.
I assume you are using FULL mode for primary and backup logging?
"so it really seems that this test (and some other tests in the same situation) is not executed on some days. Is this possible?"
It's possible if the test has been disabled or paused, or if all monitoring has been stopped or paused, etc.
Could you try to set up a FULL mode private log for this test?
-
- Posts: 23
- Joined: Wed Oct 08, 2008 2:13 am
Yes, we're using FULL mode for primary and backup logging.
The test has not been disabled, paused or anything else (otherwise we would see this information in the Quick Log window). As I said, monitoring was running correctly, since other tests were logged in the database at the same time (before, after and also in the exact same second).
I have set up a FULL private log for this test (in addition to the common log). I'll let you know if the problem appears again and whether an entry is written to this new log file.
-
- Posts: 23
- Joined: Wed Oct 08, 2008 2:13 am
I don't think the test returns too long a reply string, since it's configured to return just "XXX ms". Also, we can't see any error in the database logs.
As already said, there is no error in the HostMonitor system log file, syslog.htm.
In Auditing Tools there is no error either. However, we can see that HM is running 4.1 tests/s. Could some tests be skipped because HM is overloaded (even though it says "Conclusion: system is able to perform given tests without significant load")?
Anyway, the problem did not appear this morning. I'll let you know the next time it appears, and I'll check:
- whether there is a corresponding entry in the private log file.
- whether the Last test time field in the Test Info window is OK.
"In Auditing Tools there is no error either. However, we can see that HM is running 4.1 tests/s. Could some tests be skipped because HM is overloaded?"
4 tests per second? Overloaded? No way.
You can easily check whether the test was performed: just look at the Last test time field.
You can also set up a private log file for this test. If you see all records in the file log, some missing records in the database, and no errors in the system log, then we can assume the ODBC driver or the database does not work correctly. Which ODBC driver exactly do you use?
Regards
Alex
-
- Posts: 23
- Joined: Wed Oct 08, 2008 2:13 am
We use the "Oracle dans OraClient11g_home1_32bit" driver, version 11.02.00.03.
The problem appeared again. Here is the private log file:
[08/23/2013 8:02:47] GENTRAN temps de reponse Ok 625 ms Shell Script 64573
[08/24/2013 8:02:58] GENTRAN temps de reponse Ok 391 ms Shell Script 64573
[08/27/2013 8:01:11] GENTRAN temps de reponse Ok 375 ms Shell Script 64573
As we can see, the tests for 08/25/2013 and 08/26/2013 are missing.
They are scheduled to run every day.
We have the same records in our database.
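Gaps like this can be found automatically rather than by eye. The following is only a minimal sketch in Python, assuming every record in the private log starts with a "[MM/DD/YYYY H:MM:SS]" stamp as in the lines posted above. The names LOG_LINES, logged_dates and missing_dates are made up for this illustration and are not part of HostMonitor.

```python
from datetime import datetime, timedelta
import re

# Sample records copied from the private log posted above.
LOG_LINES = [
    "[08/23/2013 8:02:47] GENTRAN temps de reponse Ok 625 ms Shell Script 64573",
    "[08/24/2013 8:02:58] GENTRAN temps de reponse Ok 391 ms Shell Script 64573",
    "[08/27/2013 8:01:11] GENTRAN temps de reponse Ok 375 ms Shell Script 64573",
]

# Assumption: each record begins with "[MM/DD/YYYY H:MM:SS]".
STAMP = re.compile(r"^\[(\d{2}/\d{2}/\d{4}) \d{1,2}:\d{2}:\d{2}\]")


def logged_dates(lines):
    """Return the set of calendar dates that have at least one record."""
    dates = set()
    for line in lines:
        match = STAMP.match(line)
        if match:
            dates.add(datetime.strptime(match.group(1), "%m/%d/%Y").date())
    return dates


def missing_dates(lines):
    """Return the dates between the first and last record that have no record."""
    dates = logged_dates(lines)
    if not dates:
        return []
    day, last = min(dates), max(dates)
    missing = []
    while day <= last:
        if day not in dates:
            missing.append(day)
        day += timedelta(days=1)
    return missing


if __name__ == "__main__":
    for day in missing_dates(LOG_LINES):
        print(day.strftime("%m/%d/%Y"))
```

Run against the three sample records, this reports the two days with no entry. The same function could be pointed at the lines of the real private log file to list every skipped day.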
You have scheduled 262 test items to be performed at 08:01,
and you have set a limit of 10 tests per second.
There are also some limitations on simultaneous test execution by agents...
If some scripts take a long time, HostMonitor may not be able to perform all tests within the specified time frame and will not start them. Maybe this is the cause of the problem.
Try to schedule a different time for some items.
If you need to start a test once a day but the execution time does not have to be limited to a window of several minutes, there is a better solution: use a regular schedule for the test items. Just set a long test interval (e.g. 2 hours) and assign a schedule with a shorter window (e.g. from 8:00 till 9:00). In this case HostMonitor will perform each test once a day, within the hour between 8:00 and 9:00.
Regards
Alex
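To get a feel for the numbers in this reply, here is a rough back-of-the-envelope sketch in Python. The 262 items and the limit of 10 tests per second come from the post above, and the 15 second timeout comes from the test definition in the first post. The figure of 4 simultaneous scripts per agent is purely an assumption for illustration, since the thread does not state the real limit, and in practice not all 262 items necessarily run on the same agent or as scripts.

```python
import math


def min_dispatch_seconds(n_tests, max_tests_per_second):
    """Shortest possible time needed just to start n_tests under a rate limit."""
    return n_tests / max_tests_per_second


def worst_case_agent_seconds(n_tests, concurrent_slots, seconds_per_test):
    """Upper bound if every test runs for seconds_per_test on one agent
    that executes at most concurrent_slots tests at a time."""
    batches = math.ceil(n_tests / concurrent_slots)
    return batches * seconds_per_test


if __name__ == "__main__":
    # Values from the thread: 262 items, limit of 10 tests per second.
    print(min_dispatch_seconds(262, 10))
    # Hypothetical: 4 concurrent scripts per agent, each running to the 15 s timeout.
    print(worst_case_agent_seconds(262, 4, 15))
```

Under these assumptions, starting all items takes at least 26.2 seconds, and the worst case on a single agent is 990 seconds (16.5 minutes). If the time frame for a once-a-day item is shorter than that, some items would not be started, which is consistent with the explanation above.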
-
- Posts: 23
- Joined: Wed Oct 08, 2008 2:13 am