Re: Restarting based on load average dangerous?
From: Micah Anderson
Subject: Re: Restarting based on load average dangerous?
Date: Fri, 12 May 2006 13:11:59 -0400
User-agent: Thunderbird 1.5.0.2 (X11/20060501)
Martin Pala wrote:
> Yes, see monit manual ... for example:
>
> if loadavg(1min) > 25 for 8 times within 10 cycles
> then exec "/usr/bin/monit apachectl stop"
> else if passed for 20 cycles
> then exec "/usr/bin/monit apachectl start"
I think you do not mean to put /usr/bin/monit in the exec line, right?
I tried this, and it didn't work as I expected. I set it like this:
if loadavg (1min) > 2 for 2 times within 4 cycles
then exec "/usr/local/bin/over"
else if passed for 10 cycles
then exec "/usr/local/bin/under"
(my cycles are 30 seconds).
For the first two cycles that the load was over 2, it did what I expected:
it exec'd after the second one. However, it then *continued* to exec every
30 seconds, which is exactly what I do not want. I want monit to see that
the load has been above a threshold for some number of cycles and, if so,
stop the service. Once it has issued the stop, I want it to keep monitoring
the load and, once the load has dropped back below a threshold, start the
service again (but only if monit was what previously stopped it). Here is
what the log shows:
May 12 13:02:27 black monit[16951]: 'localhost' loadavg(1min) of 14.5
matches resource limit [loadavg(1min)>2.0]
May 12 13:02:51 black monit[16951]: Monit has not changed
May 12 13:02:51 black monit[16951]: 'localhost' loadavg(1min) of 19.8
matches resource limit [loadavg(1min)>2.0]
(here exec was run)
May 12 13:03:21 black monit[16951]: 'localhost' loadavg(1min) of 16.7
matches resource limit [loadavg(1min)>2.0]
(here exec was run)
May 12 13:03:51 black monit[16951]: 'localhost' loadavg(1min) of 12.7
matches resource limit [loadavg(1min)>2.0]
(here exec was run)
May 12 13:04:21 black monit[16951]: 'localhost' loadavg(1min) of 12.4
matches resource limit [loadavg(1min)>2.0]
(here exec was run)
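
To make the behaviour I am after concrete, here is a minimal sketch of the stop/start logic in Python. It is not monit syntax and not how monit works internally; the LoadGuard class and its field names are hypothetical, and the thresholds 25 and 10 are just the numbers from my earlier message. The point is that the action fires once on the transition, not on every cycle where the condition still matches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadGuard:
    """Hysteresis around a load threshold (hypothetical helper, not part of monit)."""
    high: float = 25.0           # stop the service when load rises above this
    low: float = 10.0            # start it again once load falls below this
    stopped_by_us: bool = False  # remembers whether we issued the stop

    def step(self, load: float) -> Optional[str]:
        """Return "stop", "start", or None for one monitoring cycle."""
        if not self.stopped_by_us and load > self.high:
            self.stopped_by_us = True
            return "stop"
        if self.stopped_by_us and load < self.low:
            # Only start the service if we were the ones who stopped it.
            self.stopped_by_us = False
            return "start"
        return None  # no transition, so no action this cycle


guard = LoadGuard()
samples = [35, 30, 26, 12, 9, 8, 27]
actions = [guard.step(s) for s in samples]
print(actions)
```

With these samples the stop is issued once at 35, nothing happens while the load drifts down through 30, 26 and 12, and the start is issued once at 9.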
>
>
> Martin
>
>
> Micah Anderson wrote:
>> If the load on your system goes above a threshold and you know that
>> this is a result of a runaway process that needs to be restarted, will
>> this cause the process to be restarted over and over because the 1
>> minute load average will not drop fast enough to get below the threshold:
>>
>> check system localhost
>> if loadavg (1min) > 25 then exec "/usr/bin/monit apachectl restart"
>>
>> I'm afraid that the load will climb to 35, monit will see this and
>> apache will be restarted, next cycle monit will see that the load
>> average is 30 (because it is going down), and it will issue a restart
>> *again*, the load will continue to drop, monit will see it's now 26 and
>> restart apache a third time, when really there is no load problem as
>> the load delta is dropping.
>>
>> Is there a way to make a load dependency that says, "If load gets
>> above 25, stop this process, once the load drops back down below 10
>> things are probably back to normal, so start the process again."?
>>
>> Thanks,
>> Micah
>>
>>
>>
>>
>>
>> --
>> To unsubscribe:
>> http://lists.nongnu.org/mailman/listinfo/monit-general