monit-general

RE: Bug or feature ?


From: Jan-Henrik Haukeland
Subject: RE: Bug or feature ?
Date: Tue, 24 Sep 2002 03:52:15 +0200

> > As you can see from the log, the start program is executed but fails,
> > > [EEST Sep 23 13:05:01] Stop: (game) /path/game/kill
> > > [EEST Sep 23 13:05:06] Sendmail: error receiving data from the mailserver 'smtp.xxxx.xxx' -- Resource temporarily unavailable
> > > [EEST Sep 23 13:05:06] Could not execute /path/game/kill
>
> I interpreted this to mean that running the kill script is part of the stop
> process together with the alert, and since the alert fails, the stop script
> won't be executed either.

The stop script should be executed even if the alert fails. When monit executes a
start or stop program it does so in two steps: 1) it prints e.g. "Stop:
/the/program" and raises an alert message, 2) it then forks off a new process
that does the exec of the program. Steps 1 and 2 are not directly related
(because of the fork). The log message "Could not execute /path/game/kill" comes
from the child process that should execute the start/stop program, and it simply
means that the exec failed for some reason.
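
To make the two steps concrete, here is a rough sketch in Python. It is not monit's actual code (monit is written in C); the names send_alert and run_program are made up for illustration, and send_alert is hard-wired to fail so you can see that step 2 still runs.

```python
import os
import sys


def send_alert(message):
    """Hypothetical stand-in for the alert step; here it always fails."""
    raise OSError("Resource temporarily unavailable")


def run_program(argv, log):
    """Log and alert (step 1), then fork and exec argv (step 2).

    Returns the child's exit status, or -1 if it did not exit normally.
    """
    # Step 1: log the action and try to raise an alert.
    log.append("Stop: " + argv[0])
    try:
        send_alert("Stop: " + argv[0])
    except OSError as err:
        # An alert failure is logged but does not prevent step 2.
        log.append("Sendmail: " + str(err))

    # Step 2: fork a child that execs the program.
    pid = os.fork()
    if pid == 0:
        try:
            os.execv(argv[0], argv)
        except OSError:
            # Only reached when the exec itself fails.
            sys.stderr.write("Could not execute " + argv[0] + "\n")
        os._exit(127)

    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

With a path that cannot be executed, the child prints "Could not execute ..." and exits non-zero, regardless of what happened to the alert in the parent.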

> Because it seems that this is not an isolated problem: when this
> happens, the mail alert also fails. What led me to this is that the kill
> script actually works most of the time.
>
> I'm going to do some stracing next (this week) to see what really happens.

So the kill script normally works from within monit? (For instance, can you stop
the server from the monit web interface?) I also noticed that the process uses
a lot of CPU, so maybe the kill script fails because of high load; this may also
be the reason the alert fails. Just a thought: if the Java server in question
manages lots of socket connections, the file descriptor table can get full. This
could be why the alert fails, since it needs to open a socket descriptor to the
SMTP server. Likewise, if the stop program uses sockets or other types of
descriptors, that could be why it fails as well. I seem to recall that you can
use the ulimit command to increase the number of descriptors available to a
process, and if possible you should reduce the number of connections the Java
server can accept simultaneously, for instance to 254.
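
If you want to check the descriptor limit from inside a process rather than with ulimit in the shell, something along these lines should work on a Unix system. It is only a sketch using Python's standard resource module; the helper names descriptor_limits and raise_soft_limit are made up here, and the soft limit can never be raised past the hard limit without extra privileges.

```python
import resource


def descriptor_limits():
    """Return the (soft, hard) limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Raise the soft limit toward target, capped at the hard limit.

    Returns the soft limit in effect afterwards. The limit is never lowered.
    """
    soft, hard = descriptor_limits()
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or target <= soft:
        # Nothing to raise.
        return soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target
```

Calling descriptor_limits() in the environment that starts the Java server would show whether the limit is low enough to be a plausible cause.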

Just my 2 cents :-)

Jan-Henrik




