monit-general

Re: monit not catching failed ping test


From: Fant, Andrew (NIH/NIDA) [E]
Subject: Re: monit not catching failed ping test
Date: Fri, 8 Mar 2019 21:00:44 +0000

In the monitrc file, I have:

 

set daemon   120

 

As for the monit -vI output, it covers 22 remote host checks in total. A shortened, anonymized copy follows:

 

Adding 'allow localhost' -- host resolved to [::ffff:127.0.0.1]

Adding credentials for user 'admin'

Runtime constants:

 Control file       = /etc/monitrc

 Log file           = syslog

 Pid file           = /etc/monit/monit.pid

 Id file            = /etc/monit/monit.id

 State file         = /etc/monit/monit.state

 Debug              = True

 Log                = True

 Use syslog         = True

 Is Daemon          = True

 Use process engine = True

 Limits             = {

                    =   programOutput:     512 B

                    =   sendExpectBuffer:  256 B

                    =   fileContentBuffer: 512 B

                    =   httpContentBuffer: 1 MB

                    =   networkTimeout:    5 s

                    =   programTimeout:    5 m

                    =   stopTimeout:       30 s

                    =   startTimeout:      30 s

                    =   restartTimeout:    30 s

                    = }

 On reboot          = start

 Poll time          = 120 seconds with start delay 0 seconds

 Event queue        = base directory /var/monitor with 1000 slots

 M/Monit(s)         = http://[host1.local]:8080/collector with timeout 5 s with credentials

 Start monit httpd  = True

 httpd bind address = localhost

 httpd portnumber   = 2812

 httpd signature    = Enabled

 httpd auth. style  = Basic Authentication and Host/Net allow list

 

The service list contains the following entries:

 

System Name           = host1

 Monitoring mode      = active

 On reboot            = start

 

Remote Host Name      = host2_ping

 Address              = 192.168.1.2

 Monitoring mode      = active

 On reboot            = start

 Ping                 = if failed [count 3 size 64 with timeout 5 s] then alert

 

-------------------------------------------------------------------------------

 

Hopefully this will be of some use.

 

 

--                                        

Andrew Fant                      |            Systems Administrator

address@hidden       |      Lei Shi Lab , NIH/NIDA/IRP

(443)740-2849                   |

 

From: "address@hidden" <address@hidden>
Reply-To: This is the general mailing list for monit <address@hidden>
Date: Friday, March 8, 2019 at 3:26 PM
To: This is the general mailing list for monit <address@hidden>
Subject: Re: monit not catching failed ping test

 

Hello,

 

Monit checks each service at the interval given by the "set daemon <x>" setting. If that interval is long, or a check is blocked by a service timeout or action, the effective time between checks can be even longer.

 

Could you please check the "set daemon" setting and run monit in debug mode:

 

1.) stop monit

2.) monit -vI

 

Best regards,

Martin

 



On 8 Mar 2019, at 16:49, Fant, Andrew (NIH/NIDA) [E] <address@hidden> wrote:

 

Good morning.

     I have a small monitoring setup with M/Monit 3.7.2, using monit 5.25.2 as the agent. There are a couple of systems on which I cannot install monit but whose downtime I still need to know about, so I have added them as ping checks in the monitrc on the host where I installed M/Monit. Yesterday, one of those remote systems went down, but monit and M/Monit did not raise an alert for it and still show its status as OK. Using anonymized information, the entry in the monitrc on host1 is:

 

CHECK HOST host2_ping with ADDRESS 192.168.1.2

        IF FAILED ping THEN ALERT
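
For comparison, the monit -vI output earlier in this thread shows this rule being expanded to "count 3 size 64 with timeout 5 s". A minimal sketch of the same check with those values written out, assuming the ping statement syntax documented for monit 5.25 (the numbers are simply the defaults monit reported, not a recommended tuning):

```
CHECK HOST host2_ping with ADDRESS 192.168.1.2
        # Same test as above, with the defaults reported by monit made explicit
        IF FAILED ping count 3 size 64 with timeout 5 seconds THEN ALERT
```

Spelling the parameters out should not change behaviour; it only makes the configured count and timeout visible when comparing monit's result against the OS ping utility.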

 

And from the command line on host1:

 

host1% monit status host2_ping

Monit 5.25.2 uptime: 48d 19h 8m

 

Remote Host 'host2_ping'

  status                       OK

  monitoring status            Monitored

  monitoring mode              active

  on reboot                    start

  ping response time           -

  data collected               Fri, 08 Mar 2019 10:41:33

 

But:

 

host1% ping host2

PING host2.example.org (192.168.1.2) 56(84) bytes of data.

From host1.example.org (192.168.1.1) icmp_seq=1 Destination Host Unreachable

From host1.example.org (192.168.1.1) icmp_seq=2 Destination Host Unreachable

From host1.example.org (192.168.1.1) icmp_seq=3 Destination Host Unreachable

 

Clearly there is a disconnect between what the OS-provided ping utility reports and what monit is seeing. It is probably a simple configuration error, but I do not see what I did wrong. Can someone please set me on the correct path?

 

Thank you

 

--                                        

Andrew Fant                      |            Systems Administrator

address@hidden       |      Lei Shi Lab , NIH/NIDA/IRP

(443)740-2849                   |
