I am testing an SMTP server on a remote system (SMTP test) via a VPN with an agent. Everything seems in order:
- The interval is set to 2 minutes.
- The timeout is 60 seconds.
- The test has a master "ping" test.
- The "ping" test in turn has a master test on the VPN connection (a ping to the receiving interface of the remote router).
My problem is that the test regularly becomes Unknown. The longest Alive stretch is around 30 to 45 minutes, sometimes better, sometimes worse, followed by a few bad results for which HM displays "RMA: cannot read data", so I only get an Alive ratio of about 87%. The ping test is usually very good (<50 ms) during that time.
I have changed the e-mail alert to fire after 5 bad results (it was 3 before) to reduce the false alert mails. I also removed "Treat Unknown status as Bad" because the test seems to lose credibility, as you will understand.
Could it be that HM has problems reading data from the Agent? If not, is my interval too tight? Or is the Agent overloaded? Do I have to set up the test differently?
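One way to tell an SMTP problem apart from an agent problem would be to run a small independent probe on the remote machine and compare its results with the times HM reports "RMA: cannot read data". The following is only a rough Python sketch of that idea, not something HM or RMA provides: the function name `probe_smtp_banner` and the host name in the usage part are made up for illustration, and it only checks that the server sends its 220 greeting within the timeout.

```python
import socket
import time


def probe_smtp_banner(host, port=25, timeout=60.0):
    """Connect to an SMTP server and time how long its greeting takes.

    Returns (ok, elapsed_seconds, detail). ok is True only when the
    server answers with a 220 greeting before the timeout expires.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            banner = sock.recv(1024).decode("ascii", errors="replace").strip()
    except OSError as exc:  # refused connections, timeouts, unreachable host
        return False, time.monotonic() - start, str(exc)
    elapsed = time.monotonic() - start
    return banner.startswith("220"), elapsed, banner


if __name__ == "__main__":
    # Hypothetical host name; replace with the real SMTP server.
    ok, elapsed, detail = probe_smtp_banner("mail.example.internal", 25, timeout=60.0)
    print(time.strftime("%Y-%m-%d %H:%M:%S"), ok, round(elapsed, 3), detail)
```

If such a probe, scheduled every 2 minutes, keeps succeeding while HM shows Unknown, that would point at the HM-to-Agent link rather than at the SMTP server itself.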
Would it be possible for the agent to keep, let's say, 1 hour of results if it does not get any requests from HM? Then a reboot of the HM server would be practically seamless for the continuity of the tests, and the same would go for a loss of the RMA connection.
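To make the request more concrete, here is a rough Python sketch of the kind of buffer I have in mind. It is purely illustrative: the class `ResultBuffer` and its methods are made-up names, it says nothing about how RMA works internally, and it assumes the agent could keep running its tests on its own schedule while HM is unreachable.

```python
import time
from collections import deque


class ResultBuffer:
    """Keeps recent test results while the HM server is unreachable."""

    def __init__(self, max_age_seconds=3600.0):
        self.max_age_seconds = max_age_seconds
        self._results = deque()  # (timestamp, test_name, status), oldest first

    def add(self, test_name, status, now=None):
        """Store one result and drop anything older than the window."""
        now = time.time() if now is None else now
        self._results.append((now, test_name, status))
        self._prune(now)

    def _prune(self, now):
        cutoff = now - self.max_age_seconds
        while self._results and self._results[0][0] < cutoff:
            self._results.popleft()

    def drain(self, now=None):
        """Return every result still inside the window and empty the buffer."""
        now = time.time() if now is None else now
        self._prune(now)
        pending = list(self._results)
        self._results.clear()
        return pending
```

When HM reconnects, it would call something like `drain()` once and fill in the gap in the test history from the returned results.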
I could send you logs if it helps (and if you have time to read these!)
While waiting for your answer, we will update and reboot the server during the maintenance window tonight after midnight, maybe it will help.
Again, thank you for your time,