The problem is most likely caused by the "echo $$ > ${PIDFILE}" in the run script. It is probably executed before the controlling shell forks the process that runs the program, so instead of the PID of the started service, the pidfile contains the PID of the process that was driving the execution. Monit takes the PID to check from the pidfile, so if two pidfiles contain the same PID, it will monitor the same process twice (the service name in the Monit configuration file is just a descriptive name).
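To illustrate the difference, here is a minimal sketch of a run script that records the PID of the started program rather than the PID of the script itself. The pidfile path and the use of "sleep 30" as a stand-in for the real service are assumptions made for the example only; they are not taken from the original run script.

```shell
#!/bin/sh
# Hypothetical pidfile location, used only for this demonstration.
PIDFILE="${TMPDIR:-/tmp}/demo-service.$$.pid"

# Stand-in for the real service: start it in the background.
# $! is the PID of that background child, whereas $$ is the PID
# of the shell running this script.
sleep 30 &
SERVICE_PID=$!

# Write the PID of the service, not the PID of the script.
echo "$SERVICE_PID" > "$PIDFILE"

echo "script shell PID (\$\$): $$"
echo "service PID (\$!): $SERVICE_PID"
echo "pidfile contains: $(cat "$PIDFILE")"
```

If the script starts the program in the foreground instead, writing $$ to the pidfile and then launching the program with "exec" should also work, because exec replaces the shell with the program and keeps the same PID. Neither approach helps if the program daemonizes itself by forking again; in that case the program has to write its own pidfile.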
If you have services which do not create a pidfile themselves, and/or you cannot make the script create one, you can use a pattern-based process check like this (no pidfile needed):
check process foobar matching "foobar"
Regards, Martin
On Oct 21, 2011, at 6:25 PM, Christopher Johnston wrote:

Running into a weird problem on a number of my production systems: we end up having two different applications started up, and monit records them as having the same PID, so in turn it reports that both apps are up. Obviously only one app can have that PID, but the problem is that monit reports the application as up when it is not. The only ways I have been able to resolve it are to echo some junk chars into the pid file so the app starts, or to stop/start the app. Is this a known bug? I am running 5.2.5 at the moment and am tempted to upgrade.
The way I write out my pidfile is pretty straightforward, and this happens directly in the run script:

# drop pid into pidfile
echo $$ > ${PIDFILE}
Process 'F17'
  status                            running
  monitoring status                 monitored
  pid                               32317
  parent pid                        32315
  uptime                            15h 35m
  children                          0
  memory kilobytes                  871828
  memory kilobytes total            871828
  memory percent                    3.5%
  memory percent total              3.5%
  cpu percent                       0.0%
  cpu percent total                 0.0%
  data collected                    Fri Oct 21 09:11:36 2011

Process 'F15'
  status                            running
  monitoring status                 monitored
  pid                               32317
  parent pid                        32315
  uptime                            15h 35m
  children                          0
  memory kilobytes                  871828
  memory kilobytes total            871828
  memory percent                    3.5%
  memory percent total              3.5%
  cpu percent                       0.0%
  cpu percent total                 0.0%
  data collected                    Fri Oct 21 09:11:36 2011