Martin,
You have it right. I had assumed the email address existed on our downstream server, when in fact it was only an alias in our data center relay. All I had to do was adjust the alert address (address@hidden) so that my relay forwards it appropriately.
This was purely my mistake. Still, it would have been nice if Monit had a more verbose (debug-like) logging mechanism that showed the mail events. That would have told me Monit was doing its job, and at that point I would have checked my mail relay. The lack of log information, together with the non-delivery of the messages, led me to erroneously assume Monit simply wasn't sending the alert emails.
All is good now, though, and I can already see that Monit will be an indispensable tool for service alerts and restarts, complementing my Nagios environment well!
Thanks again,
Bruce
On 1/20/09 12:34 PM, "Martin Pala" <address@hidden> wrote:
Thanks for the info.
Monit logs an error when the mail server fails or returns an error class >= 400. What exactly was the problem in your case? (We can improve the error reporting.) Since no error was logged by Monit, it seems the message was accepted by the mail server and the MTA dropped it later?
Thanks,
Martin
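As an illustration of the reply-class rule Martin mentions: SMTP replies in the 2xx and 3xx range mean the server accepted the command or wants more input, while 4xx and 5xx are transient and permanent failures. The Python sketch below is not Monit's code; the function name and the sample reply lines are made up for illustration.

```python
def smtp_reply_is_error(reply_line: str) -> bool:
    """Return True when an SMTP reply line carries a 4xx or 5xx code.

    Hypothetical helper: it reads the three-digit status at the start
    of the line and raises ValueError if the line does not begin with
    a number.
    """
    code = int(reply_line[:3])
    return code >= 400


# Illustrative replies (not taken from this thread).
print(smtp_reply_is_error("250 2.0.0 Ok: queued"))      # False
print(smtp_reply_is_error("550 5.1.1 User unknown"))    # True
```

A relay that answers 250 and only later fails to resolve an alias gives the sending client nothing to log, which would be consistent with the silence seen here.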
On Jan 20, 2009, at 7:30 PM, Bruce Reed wrote:
The strace uncovered my problem, and it was with a mail alias, so thanks for the tip!
It would be nice to have more verbose logging in Monit for email events. Had I seen those in the log, I would have known it had at least sent the message and that the problem was with the address I had used. On the other hand, I should have combed my mail server logs to see whether a message had been received for the address I specified.
Bruce
On 1/16/09 12:14 PM, "Martin Pala" <address@hidden> wrote:
This looks strange. I don't remember a problem like this, and even the changelog doesn't mention such an issue.
It would be good to trace Monit to see what happened:
strace -f -o monit.trace monit -vI
The monit.trace file will contain the system call traces, so we can see whether it tried to connect to the SMTP server and what happened.
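One way to skim such a trace for SMTP connection attempts is sketched below in Python. It assumes the common strace rendering of IPv4 connect() calls; the exact formatting can differ between strace versions. The function name, the regular expression, and the sample lines are illustrative and do not come from the actual monit.trace.

```python
import re

# Matches lines such as:
#   4242  connect(5, {sa_family=AF_INET, sin_port=htons(25),
#         sin_addr=inet_addr("192.0.2.10")}, 16) = 0
CONNECT_RE = re.compile(
    r'connect\(\d+, \{sa_family=AF_INET, sin_port=htons\((\d+)\), '
    r'sin_addr=inet_addr\("([\d.]+)"\)\}, \d+\)\s+= (-?\d+)(?: (\w+))?'
)


def smtp_connects(trace_lines, port=25):
    """Return (address, return_value, errno_name) for each connect() to port."""
    results = []
    for line in trace_lines:
        match = CONNECT_RE.search(line)
        if match and int(match.group(1)) == port:
            results.append((match.group(2), int(match.group(3)), match.group(4)))
    return results


# Made-up sample lines in the format strace typically writes with -f -o.
sample = [
    '4242  connect(6, {sa_family=AF_INET, sin_port=htons(53), '
    'sin_addr=inet_addr("192.0.2.1")}, 16) = 0',
    '4242  connect(5, {sa_family=AF_INET, sin_port=htons(25), '
    'sin_addr=inet_addr("192.0.2.10")}, 16) = 0',
]
print(smtp_connects(sample))  # [('192.0.2.10', 0, None)]
```

To run it against a real file, pass an open file object, for example `smtp_connects(open("monit.trace"))`. An empty result would suggest no connection to port 25 was attempted; a return value of 0 means the TCP connection itself succeeded.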
On Jan 16, 2009, at 9:05 PM, Bruce Reed wrote:
4.9 rpm from rpmforge
On 1/16/09 11:55 AM, "Martin Pala" <address@hidden> wrote:
The configuration looks OK.
What Monit version is it?
Thanks,
Martin
On Jan 16, 2009, at 8:22 PM, Bruce Reed wrote:
Here is the verbose output. It looks like the verbose output begins and ends at process startup (host/domain names changed):
Starting Process Monitor (monit): monit: Debug: Adding host allow 'localhost'
monit: Debug: Skipping redundant host 'localhost'
monit: Debug: Skipping redundant host 'localhost'
monit: Debug: Adding credentials for user 'admin'.
Runtime constants:
Control file = /etc/monit.conf
Log file = syslog
Pid file = /var/run/monit.pid
Debug = True
Log = True
Use syslog = True
Is Daemon = True
Use process engine = True
Poll time = 60 seconds
Mail server(s) = prodsmtp.mydomain.net
Mail from = address@hidden
Mail subject = monit alert -- $EVENT $SERVICE
Mail message = $EVENT Service $SERV..(truncated)
Start monit httpd = True
httpd bind address = localhost
httpd portnumber = 2812
httpd signature = True
Use ssl encryption = False
httpd auth. style = Basic Authentication and Host/Net allow list
Alert mail to = address@hidden
Alert on = All events
The service list contains the following entries:
Process Name = ntpd
Pid file = /var/run/ntpd.pid
Monitoring mode = active
Start program = '/etc/init.d/ntpd start' timeout 1 cycle(s)
Stop program = '/etc/init.d/ntpd stop' timeout 1 cycle(s)
Pid = if changed 1 times within 1 cycle(s) then alert
Ppid = if changed 1 times within 1 cycle(s) then alert
Timeout = If 3 restart within 3 cycles then unmonitor else if passed then alert
System Name = test-prod.mydomain.net
Monitoring mode = active
---------------------------------------------------------------------------
monit: pidfile '/var/run/monit.pid' does not exist
Starting monit daemon with http interface at [localhost:2812]
Then, when ntpd is killed, I see the following in /var/log/messages:
Jan 16 19:10:50 test-prod ntpd[13505]: ntpd exiting on signal 15
Jan 16 19:11:32 test-prod monit[2398]: 'ntpd' process is not running
Jan 16 19:11:32 test-prod monit[2398]: 'ntpd' trying to restart
Jan 16 19:11:32 test-prod monit[2398]: 'ntpd' start: /etc/init.d/ntpd
Jan 16 19:11:32 test-prod ntpd[2541]: ntpd address@hidden Tue Jun 10 00:07:18 UTC 2008 (1)
Jan 16 19:11:32 test-prod ntpd[2542]: precision = 2.000 usec
.
.
There is no additional output from monit and no attempt to send mail according to maillog.
On 1/16/09 3:52 AM, "Jan-Henrik Haukeland" <address@hidden> wrote:
Have you tried to specify which mail server Monit should use for alerts?
See http://mmonit.com/monit/documentation/monit.html#setting_a_mail_server_for_alert_messages
On 16 Jan 2009, at 08:00, Bruce Reed wrote:
I’ve just begun using monit and I am having difficulty getting it to send mail. I’m testing with ntpd: monit restarts the process, but it does not send mail on service restart events or on timeout.
In monit.conf I have:
set alert address@hidden
I then had a check statement like this:
check process ntpd with pidfile /var/run/ntpd.pid
  start program = "/etc/init.d/ntpd start"
  stop program = "/etc/init.d/ntpd stop"
  if 3 restarts within 3 cycles then timeout
  alert address@hidden only on { timeout }
After 3 successive kills of ntpd and restarts by monit, a timeout message was logged, but no mail was sent. I tried removing the alert statement to see if mail would be sent on any event, but I only see information logged and no mail is sent. There is nothing in /var/log/maillog either.
Funny thing is, when I first set this up, monit attempted to send mail, but an ACL on my postfix server prevented it from getting through. I fixed that and retried my test, but from that point on no mail was sent. I thought perhaps this was a state caching issue, but nothing changed across a monit restart. I also installed monit on another server using the same conf files, and I get the same results there.
Thanks,
Bruce
--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general