monit-general

Re: Process fails to restart on newer versions of monit


From: Martin Pala
Subject: Re: Process fails to restart on newer versions of monit
Date: Wed, 13 May 2015 09:07:03 +0200

You will probably need to debug the opsworks-agent start script and/or the opsworks-agent logs themselves. Monit started the script, but it's not possible to say from the monit side why it stopped midway.

Monit 5.9 and newer log the program's error output if it fails, so it seems the script didn't return any error.

You can also try wrapping the script like this to log any output:

 start program = "/bin/bash -c '/usr/sbin/service opsworks-agent start >/tmp/opsworks-agent.log 2>&1'"
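
If the inline form gets unwieldy, the same idea can live in a small wrapper script that monit calls instead. The sketch below is only an illustration: the function name run_logged and the log path in the usage comment are made up for this example, and it assumes bash is available on the instance.

```shell
#!/bin/bash
# Hypothetical helper (not part of monit or opsworks): run a command,
# append its stdout/stderr and exit status to a log file, and return
# the command's own exit status to the caller.
run_logged() {
    local log_file=$1
    shift
    local status=0
    {
        echo "$(date '+%Y-%m-%d %H:%M:%S') running: $*"
        "$@" || status=$?
        echo "exit status: $status"
    } >>"$log_file" 2>&1
    return "$status"
}

# Example call (the log path is an assumption):
#   run_logged /tmp/opsworks-agent.log /usr/sbin/service opsworks-agent start
```

Saved as a script that ends with that call, it could be referenced from the start program line in place of the bash -c command. The "exit status" line in the log then shows whether the init script itself reported a failure.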


Regards,
Martin


On 13 May 2015, at 08:43, Shrinath M <address@hidden> wrote:

Ah! That change in original snippet is due to me trying random things to fix it. Shots in the dark :(

I tried changing the start command to use /usr/sbin/service. With that, the opsworks-agent master process starts, but is killed within 2 seconds.
So my first thought was that the agent itself is buggy, but if I start it manually with "service opsworks-agent restart", it runs just fine.
Also, there is no such problem with monit version 5.3.2.
Weird. 

Just more observations/info - 
When run manually, the opsworks-agent prints some output to the screen: "started child 1/3", "started child 2/3", "started child 3/3" and "started master <pid> with 3 children".
This also goes to their logs.
But when it is started through monit, only the first 3 statements appear in the log. The last one, "started master <pid> with 3 children", is missing.
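
One way to narrow down a difference like this is to start the agent by hand in an environment closer to the one monit provides. The monit manual describes start programs as running in a sparse environment with a limited PATH. The PATH value below is an assumption based on that description and should be checked against the manual for the installed version, and the function name run_sparse is made up for this sketch.

```shell
#!/bin/bash
# Hypothetical helper: run a command with an emptied environment and a
# minimal PATH, to approximate how a supervisor such as monit starts it.
# The PATH value is an assumption; adjust it to match your monit version.
run_sparse() {
    env -i PATH=/bin:/usr/bin:/sbin:/usr/sbin "$@"
}

# Example call:
#   run_sparse /usr/sbin/service opsworks-agent start
```

If the agent also dies when started this way, a missing environment variable or PATH entry in the init script is a likely suspect. If it runs fine, the cause is probably elsewhere.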


On Wed, May 13, 2015 at 10:45 AM Shrinath M <address@hidden> wrote:
I am using AWS Opsworks and AWS uses an old version of monit (5.3.2) to monitor their agent. Obviously, when their opsworks-agent dies, monit restarts it. 
Recently, I wanted to monitor a few processes of my own and needed a newer version of monit for its explicit "restart" command support. I upgraded monit to 5.13.
Now, monit does not restart opsworks agent if it dies!

I tried looking for a monit changelog to see if something changed between versions, but could not find entries for all versions beyond 5.7.
Can someone please take a look at opsworks config below and see what might be breaking? 

opsworks-config follows - 
check process opsworks-agent with pidfile "/var/lib/aws/opsworks/pid/opsworks-agent.pid"
  start program = "/etc/init.d/opsworks-agent start"
  stop program = "/etc/init.d/opsworks-agent stop"
  depends on opsworks-agent-master-running
  depends on opsworks-agent-statistic-daemons-log
  depends on opsworks-agent-process-command-daemons-log
  depends on opsworks-agent-keep-alive-daemons-log
  group opsworks

check process opsworks-agent-master-running matching "opsworks-agent:\smaster"
  if not exist for 2 cycles then restart
  group opsworks

# check run of statistic daemon
check file opsworks-agent-statistic-daemons-log with path "/var/log/aws/opsworks/opsworks-agent.statistics.log"
  if timestamp > 2 minutes for 3 cycles then restart
  if does not exist for 3 cycles then restart
  group opsworks

# check run of process command daemon
check file opsworks-agent-process-command-daemons-log with path "/var/log/aws/opsworks/opsworks-agent.process_command.log"
  if timestamp > 2 minutes for 3 cycles then restart
  if does not exist for 3 cycles then restart
  group opsworks

# check run of keep alive daemon
check file opsworks-agent-keep-alive-daemons-log with path "/var/log/aws/opsworks/opsworks-agent.keep_alive.log"
  if timestamp > 2 minutes for 3 cycles then restart
  if does not exist for 3 cycles then restart
  group opsworks

- end of file

Monit logs say restart done, but opsworks doesn't run. If I downgrade to 5.3.2, it does magically run!