monit-general

Re: Issue with: monit restart [service-name]


From: Martin Pala
Subject: Re: Issue with: monit restart [service-name]
Date: Wed, 11 Jul 2012 22:55:09 +0200

Hello,

When the restart action is called, monit performs a stop followed by a start,
as you noted. The stop action waits for the process to stop. If the check is
pidfile based, monit waits until either the process described by the pidfile
has stopped or the pidfile has been removed. A pattern-based check doesn't
depend on the pidfile at all. The start action is called only once the stop
has finished. In your case it seems that the pidfile is removed at the
beginning of your stop script, which breaks the stop wait check: monit treats
the stop as finished while the process is still shutting down. You can either
fix the stop script so that it removes the pidfile only after the process has
exited, or use the pattern-based check, which is independent of the pidfile.
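
As a rough sketch of the pattern-based variant, the service entry from the
quoted message could be rewritten along these lines. The pattern
"WowzaMediaServer" is only an assumption; it has to match something in the
command line of the running Wowza process, which may well be different on
your system. Everything else is copied from the quoted config.

```
# Sketch only: the pattern below is hypothetical, adjust it to the
# command line of the actual Wowza process.
check process wowza matching "WowzaMediaServer"
   start program = "/etc/init.d/WowzaMediaServer start"
   stop program = "/etc/init.d/WowzaMediaServer stop"
   if failed port 443 then restart
   if failed host localhost port 443 type TCPSSL
       protocol HTTP request "/crossdomain.xml" then restart
```

Note that a pattern this broad could also match the init script itself while
it is running, so a more specific pattern is probably safer. If your monit
version provides the procmatch command, "monit procmatch <pattern>" can be
used to see which processes a pattern matches before relying on it.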


Regards,
Martin



On Jul 10, 2012, at 8:54 PM, Harlan Barnes wrote:

> Hello,
> 
> I am trying to set up monit (5.4) to watch my Wowza Media Server.
> 
> I can execute the following commands with no problem:
> 
> monit start wowza
> monit stop wowza
> 
> But when I try this:
> 
> monit restart wowza
> 
> it appears that monit executes the stop script ... and then without
> waiting for it to return from the stop command, it executes the start
> command. This produces a race condition where the stop action removes
> the pid file that the start action just put down. As such, monit
> thinks the start action failed and does the start again. (Which causes
> other problems.)
> 
> Is that the way it is supposed to work? Is there anything I can do to
> force monit to wait for the stop action to return before executing the
> start action when I tell it to restart the service?
> 
> Thanks,
> 
> Harlan
> 
> ---
> My main config looks like this:
> 
> # set daemon mode timeout to 1 minute
> set daemon 60
> 
> # set the state file to
> set statefile /var/lib/monit/monit.state
> 
> # http support
> set httpd port 2812 and use the address localhost
>    allow localhost
> 
> # Include all files from /etc/monit.d/
> include /etc/monit.d/*
> 
> My wowza config looks like this:
> 
> check process wowza with pidfile /var/run/WowzaMediaServer.pid
>    start program = "/etc/init.d/WowzaMediaServer start"
>    stop program = "/etc/init.d/WowzaMediaServer stop"
>    if failed port 443 then restart
>    if failed host localhost port 443 type TCPSSL
>        protocol HTTP request "/crossdomain.xml" then restart
> 
> Monit says this on a verbose startup:
> 
> Runtime constants:
> Control file       = /etc/monitrc
> Log file           = /var/log/monit
> Pid file           = /var/run/monit.pid
> Id file            = /root/.monit.id
> Debug              = True
> Log                = True
> Use syslog         = False
> Is Daemon          = True
> Use process engine = True
> Poll time          = 60 seconds with start delay 0 seconds
> Expect buffer      = 256 bytes
> Mail from          = (not defined)
> Mail subject       = (not defined)
> Mail message       = (not defined)
> Start monit httpd  = True
> httpd bind address = localhost
> httpd portnumber   = 2812
> httpd signature    = True
> Use ssl encryption = False
> httpd auth. style  = Host/Net allow list
> 
> The service list contains the following entries:
> 
> Process Name          = wowza
> Pid file             = /var/run/WowzaMediaServer.pid
> Monitoring mode      = active
> Start program        = '/etc/init.d/WowzaMediaServer start' timeout 30 second(s)
> Stop program         = '/etc/init.d/WowzaMediaServer stop' timeout 30 second(s)
> Existence            = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
> Pid                  = if changed 1 times within 1 cycle(s) then alert
> Ppid                 = if changed 1 times within 1 cycle(s) then alert
> Port                 = if failed [localhost:443/crossdomain.xml [HTTP via TCPSSL] with timeout 5 seconds and retry 0 time(s)] 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
> Port                 = if failed [localhost:443 [DEFAULT via TCP] with timeout 5 seconds and retry 0 time(s)] 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
> 
> System Name           = system_webcam_v5a
> Monitoring mode      = active
> 



