monit-general

Re: [monit] Alerts not being triggered


From: Bruce Reed
Subject: Re: [monit] Alerts not being triggered
Date: Tue, 20 Jan 2009 10:30:16 -0800
User-agent: Microsoft-Entourage/12.11.0.080522

The strace uncovered my problem, which was with a mail alias, so thanks for
the tip!

It would be nice to have more verbose logging by monit to log email events.
Had I seen those in the log I would have known it at least sent the message
and the problem was with the address I had used. On the other hand, I should
have combed my mailserver logs to see if a message had been received for the
address I specified.
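
For anyone hitting the same symptom, here is a rough sketch of how the trace
can be checked for an SMTP attempt. It assumes a monit.trace file produced by
the strace command quoted below, a mail server listening on port 25, and
strace's usual sin_port=htons(N) rendering of connect() calls. The function
name smtp_connects is made up for illustration and is not part of monit.

```shell
# Hypothetical helper (not part of monit): print every connect() call to a
# given TCP port found in an strace output file.
# Assumes strace's usual rendering of the call, for example:
#   2398  connect(5, {sa_family=AF_INET, sin_port=htons(25), ...}, 16) = 0
smtp_connects() {
    trace_file=$1
    port=${2:-25}    # assumption: SMTP on the default port 25
    grep -E "connect\(.*sin_port=htons\(${port}\)" "$trace_file"
}
```

Running "smtp_connects monit.trace" and getting no output would suggest monit
never tried to reach the mail server. If a connect() line does show up, the
next place to look is probably the mail server's own log (/var/log/maillog on
the setup described below) for the recipient address.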

Bruce


On 1/16/09 12:14 PM, "Martin Pala" <address@hidden> wrote:

> Looks strange - I don't remember a problem like this, and even the changelog
> doesn't mention such an issue.
>
> It might help to trace monit to see what happened:
> 
> strace -f -o monit.trace monit -vI
> 
> 
> The monit.trace file will contain system call traces so we can see
> whether it tried to connect to the SMTP server and what happened.
> 
> 
> 
> 
> On Jan 16, 2009, at 9:05 PM, Bruce Reed wrote:
> 
>> 4.9 rpm from rpmforge
>> 
>> 
>> On 1/16/09 11:55 AM, "Martin Pala" <address@hidden> wrote:
>> 
>>> The configuration looks OK.
>>> 
>>> What monit version is it?
>>> 
>>> Thanks,
>>> Martin
>>> 
>>> 
>>> On Jan 16, 2009, at 8:22 PM, Bruce Reed wrote:
>>> 
>>>> Here is the verbose output. Looks like verbose output begins and ends at
>>>> process start up (host/domain names changed):
>>>> 
>>>> Starting Process Monitor (monit): monit: Debug: Adding host allow 'localhost'
>>>> monit: Debug: Skipping redundant host 'localhost'
>>>> monit: Debug: Skipping redundant host 'localhost'
>>>> monit: Debug: Adding credentials for user 'admin'.
>>>> Runtime constants:
>>>> Control file       = /etc/monit.conf
>>>> Log file           = syslog
>>>> Pid file           = /var/run/monit.pid
>>>> Debug              = True
>>>> Log                = True
>>>> Use syslog         = True
>>>> Is Daemon          = True
>>>> Use process engine = True
>>>> Poll time          = 60 seconds
>>>> Mail server(s)     = prodsmtp.mydomain.net
>>>> Mail from          = address@hidden
>>>> Mail subject       = monit alert --  $EVENT $SERVICE
>>>> Mail message       = $EVENT Service $SERV..(truncated)
>>>> Start monit httpd  = True
>>>> httpd bind address = localhost
>>>> httpd portnumber   = 2812
>>>> httpd signature    = True
>>>> Use ssl encryption = False
>>>> httpd auth. style  = Basic Authentication and Host/Net allow list
>>>> Alert mail to      = address@hidden
>>>>  Alert on         = All events
>>>> 
>>>> The service list contains the following entries:
>>>> 
>>>> Process Name          = ntpd
>>>> Pid file             = /var/run/ntpd.pid
>>>> Monitoring mode      = active
>>>> Start program        = '/etc/init.d/ntpd start' timeout 1 cycle(s)
>>>> Stop program         = '/etc/init.d/ntpd stop' timeout 1 cycle(s)
>>>> Pid                  = if changed 1 times within 1 cycle(s) then alert
>>>> Ppid                 = if changed 1 times within 1 cycle(s) then alert
>>>> Timeout              = If 3 restart within 3 cycles then unmonitor else if passed then alert
>>>> 
>>>> System Name           = test-prod.mydomain.net
>>>> Monitoring mode      = active
>>>> 
>>>> 
>>>> -------------------------------------------------------------------------------
>>>> monit: pidfile '/var/run/monit.pid' does not exist
>>>> Starting monit daemon with http interface at [localhost:2812]
>>>> 
>>>> 
>>>> Then when ntp is killed I see the following in /var/log/messages:
>>>> 
>>>> Jan 16 19:10:50 test-prod ntpd[13505]: ntpd exiting on signal 15
>>>> Jan 16 19:11:32 test-prod monit[2398]: 'ntpd' process is not running
>>>> Jan 16 19:11:32 test-prod monit[2398]: 'ntpd' trying to restart
>>>> Jan 16 19:11:32 test-prod monit[2398]: 'ntpd' start: /etc/init.d/ntpd
>>>> Jan 16 19:11:32 test-prod ntpd[2541]: ntpd address@hidden Tue Jun 10 00:07:18 UTC 2008 (1)
>>>> Jan 16 19:11:32 test-prod ntpd[2542]: precision = 2.000 usec
>>>> .
>>>> .
>>>> 
>>>> There is no additional output from monit and no attempt to send mail
>>>> according to maillog.
>>>> 
>>>> On 1/16/09 3:52 AM, "Jan-Henrik Haukeland" <address@hidden> wrote:
>>>> 
>>>>> Have you tried to specify which mail server Monit should use for alerts?
>>>>>
>>>>> See
>>>>> http://mmonit.com/monit/documentation/monit.html#setting_a_mail_server_for_alert_messages
>>>>> 
>>>>> 
>>>>> 
>>>>> On 16. jan.. 2009, at 08.00, Bruce Reed wrote:
>>>>> 
>>>>>> I've just begun using monit and I am having difficulties getting
>>>>>> monit to send mail. I'm testing using ntpd and it is restarting the
>>>>>> process, but not sending mail on service restart events or timeout.
>>>>>> In monit.conf I have:
>>>>>> 
>>>>>> set alert address@hidden
>>>>>> 
>>>>>> I then had a check statement like this:
>>>>>> 
>>>>>> check process ntpd with pidfile /var/run/ntpd.pid
>>>>>>   start program = "/etc/init.d/ntpd start"
>>>>>>   stop program  = "/etc/init.d/ntpd stop"
>>>>>>   if 3 restarts within 3 cycles then timeout
>>>>>>   alert address@hidden only on { timeout }
>>>>>> 
>>>>>> After 3 successive kills of ntpd and restarts by monit, a timeout
>>>>>> message was logged, but no mail was sent. I tried removing the alert
>>>>>> statement to see if mail would be sent on any event, but I only see
>>>>>> information logged and no mail is sent. Nothing in /var/log/maillog
>>>>>> either.
>>>>>> 
>>>>>> Funny thing is, when I first set this up monit attempted to send
>>>>>> mail, but an ACL on my Postfix server prevented it from getting
>>>>>> through. I fixed that and retried my test, but from that point on no
>>>>>> mail was sent. I thought perhaps this was a state caching issue, but
>>>>>> there was no change across a monit restart, and when I installed monit
>>>>>> on another server using the same conf files I got the same results there.
>>>>>> 
>>>>>> Thanks,
>>>>>> Bruce
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> To unsubscribe:
>>>>> http://lists.nongnu.org/mailman/listinfo/monit-general




