monit-general

Re: Restarting based on load average dangerous?


From: Micah Anderson
Subject: Re: Restarting based on load average dangerous?
Date: Fri, 12 May 2006 13:11:59 -0400
User-agent: Thunderbird 1.5.0.2 (X11/20060501)

Martin Pala wrote:
> Yes, see monit manual ... for example:
> 
> if loadavg(1min) > 25 for 8 times within 10 cycles
>   then exec "/usr/bin/monit apachectl stop"
> else if passed for 20 cycles
>   then exec "/usr/bin/monit apachectl start"

I think you do not mean to put /usr/bin/monit in the exec line, right?

I tried this, and it didn't work as I expected. I set it like this:

  if loadavg (1min) > 2 for 2 times within 4 cycles
    then exec "/usr/local/bin/over"
  else if passed for 10 cycles
    then exec "/usr/local/bin/under"


(my cycles are 30 seconds).

For the first two cycles in which the load was over 2, it did what I
expected: it exec'd after the second one. However, it *continued* to
exec every 30 seconds, which is exactly what I do not want. I want
monit to notice that the load has been above a threshold for 'n' of the
last 'm' cycles and, if so, stop the service. Once it has issued the
stop, I want it to keep monitoring the load and, once the load has
dropped back below the threshold, start the service again (but only if
monit was what stopped it). Here is the log:

May 12 13:02:27 black monit[16951]: 'localhost' loadavg(1min) of 14.5
matches resource limit [loadavg(1min)>2.0]
May 12 13:02:51 black monit[16951]: Monit has not changed
May 12 13:02:51 black monit[16951]: 'localhost' loadavg(1min) of 19.8
matches resource limit [loadavg(1min)>2.0]
(here exec was run)
May 12 13:03:21 black monit[16951]: 'localhost' loadavg(1min) of 16.7
matches resource limit [loadavg(1min)>2.0]
(here exec was run)
May 12 13:03:51 black monit[16951]: 'localhost' loadavg(1min) of 12.7
matches resource limit [loadavg(1min)>2.0]
(here exec was run)
May 12 13:04:21 black monit[16951]: 'localhost' loadavg(1min) of 12.4
matches resource limit [loadavg(1min)>2.0]
(here exec was run)
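
The behaviour I am asking for amounts to a latch with hysteresis: act once when the load crosses the upper threshold, then stay quiet until it falls below the lower one. The following is only a minimal Python sketch of that logic, not monit code or monit syntax. The function name `hysteresis_actions` and its parameters are made up for illustration, the "n times within m cycles" debouncing is left out, and the last three load samples are invented to show the load falling.

```python
def hysteresis_actions(loads, high, low):
    """Return one action per load sample: "stop", "start", or None.

    Hypothetical helper, not part of monit. "stop" is issued once when
    the load rises above `high`; "start" is issued once when the load
    later falls below `low`, and only if this function issued the stop.
    """
    stopped_by_us = False
    actions = []
    for load in loads:
        if not stopped_by_us and load > high:
            stopped_by_us = True
            actions.append("stop")
        elif stopped_by_us and load < low:
            stopped_by_us = False
            actions.append("start")
        else:
            # Either still overloaded after our stop, or nothing to do.
            actions.append(None)
    return actions


# First five values are from the log above; the rest are invented samples.
samples = [14.5, 19.8, 16.7, 12.7, 12.4, 6.0, 1.5, 1.2]
actions = hysteresis_actions(samples, high=2.0, low=2.0)
print(actions)
```

With these samples the sketch issues a single "stop" on the first reading and a single "start" once the load falls below 2, instead of one exec per cycle as in the log.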

> 
> 
> Martin
> 
> 
> Micah Anderson wrote:
>> If the load on your system goes above a threshold and you know that
>> this is the result of a runaway process that needs to be restarted,
>> will a rule like the following restart the process over and over,
>> because the 1-minute load average does not drop fast enough to get
>> back below the threshold?
>>
>> check system localhost
>>   if loadavg (1min) > 25 then exec "/usr/bin/monit apachectl restart"
>>
>> I'm afraid that the load will climb to 35, monit will see this and
>> restart apache; on the next cycle monit will see that the load
>> average is 30 (because it is going down) and issue a restart
>> *again*; the load will continue to drop, monit will see it's now 26
>> and restart apache a third time, when really there is no longer a
>> load problem because the load is already dropping.
>>
>> Is there a way to make a load dependency that says, "If load gets
>> above 25, stop this process, once the load drops back down below 10
>> things are probably back to normal, so start the process again."?
>>
>> Thanks,
>> Micah
>>
>>
>>
>>
>>
>> -- 
>> To unsubscribe:
>> http://lists.nongnu.org/mailman/listinfo/monit-general