monit-general

Re: "for X cycles" has no effect


From: Martin Pala
Subject: Re: "for X cycles" has no effect
Date: Wed, 19 Oct 2016 08:32:56 +0200

You can disable logging (remove the "set logfile" statement) and use the "exec" action instead to write the error message when the limit is exceeded.
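
A minimal sketch of that suggestion, reusing the system check from the configuration quoted further down the thread. The logger path, the syslog tag, and the message text are assumptions for illustration, not something given in the thread; adjust them to your system.

```
# No "set logfile" statement, as suggested above.
set daemon 5

check system $HOST
    # Assumed: logger(1) is installed at /usr/bin/logger.
    # The tag "monit" and the message wording are placeholders.
    if cpu usage (user) > 75% for 5 cycles then exec "/usr/bin/logger -t monit cpu user usage above 75% for 5 cycles"
```

With a rule like this, the only message written should be the one produced by the exec command itself, once the condition has held for 5 consecutive cycles.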



On 18 Oct 2016, at 23:21, address@hidden wrote:

I was confused about the difference between logging the limit exceeded and alerts.

I want only the alerts to be logged.  I do not want the alerts emailed.

Is there a way to configure Monit to log alerts but not log when the limits are exceeded?



On Oct 17, 2016, at 11:44 AM, Martin Pala <address@hidden> wrote:

The error message is logged each cycle the limit is exceeded - that is normal.

The cycles limit is related to the action - in your case it's the alert action, but there is no alert target in your configuration file, so no action is performed after 5 cycles (the service will just enter the error state).

You can check that the cpu usage error is not set during the first four cycles using, for example, "monit status". If you want to get a notification after 5 cycles (and see the alert action logged too), you need to add the "set alert <address>" and "set mailserver" statements.
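
As a rough illustration of the two statements mentioned above, the fragment below adds an alert target to the system check. The mail server host and the recipient address are placeholders, not values from this thread.

```
set daemon 5

# Assumed: an SMTP server is reachable on this machine.
set mailserver localhost
# Placeholder recipient; replace with a real address.
set alert admin@example.com

check system $HOST
    if cpu > 0% for 5 cycles then alert
```

With "set daemon 5", five cycles correspond to roughly 25 seconds, so the alert action would be expected only after the condition has matched for that long.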



On 17 Oct 2016, at 19:40, address@hidden wrote:

I changed monitrc so that all checks are after 5 cycles and see the same behavior, i.e., there is a cpu usage alert after 5 seconds and every 5 seconds thereafter.

set daemon  5
set logfile syslog
set pidfile /home/jones/monit_code/.monit.pid
set idfile  /home/jones/monit_code/.monit.id
set statefile /home/jones/monit_code/.monit.state
check device var with path /var
    if space usage > 95% for 5 cycles then alert
check device etc with path /etc
    if space usage > 95% for 5 cycles then alert
check system $HOST
    if memory usage > 85% for 5 cycles then alert
    if cpu usage (user) > 75% for 5 cycles then alert
    if cpu usage (system) > 65% for 5 cycles then alert
    if cpu > 0% for 5 cycles then alert


On Oct 17, 2016, at 3:21 AM, Martin Pala <address@hidden> wrote:

Thanks for the data.

I think the cpu usage alert was triggered by one of the other cpu usage tests that you have in your monit configuration file:

    if cpu usage (user) > 75% for 1 cycles then alert
    if cpu usage (system) > 65% for 1 cycles then alert
    if cpu > 0% for 5 cycles then alert

There are three independent tests in the config:

1.) "cpu usage (user)"
2.) "cpu usage (system)"
3.) "cpu"

Test #3 (with no options) checks total CPU usage (usr% + sys% + wait%). Test #1 checks usr% only, and test #2 checks sys% only.

Only check #3 has the 5-cycle constraint; the other two will alert during the first cycle.

Best regards,
Martin


On 13 Oct 2016, at 21:16, address@hidden wrote:

Using the precompiled binary, monit-5.19.0-linux-x86.tar.gz, I see the same behavior.


On Oct 10, 2016, at 10:41 AM, address@hidden wrote:

monitrc is attached.

$ bin/monit -vI -c monitrc
Runtime constants:
 Control file       = /home/jones/monit519/monitrc
 Log file           = syslog
 Pid file           = /home/jones/monit_code/.monit.pid
 Id file            = /home/jones/monit_code/.monit.id
 State file         = /home/jones/monit_code/.monit.state
 Debug              = True
 Log                = True
 Use syslog         = True
 Is Daemon          = True
 Use process engine = True
 Limits             = {
                    =   programOutput:     512 B
                    =   sendExpectBuffer:  256 B
                    =   fileContentBuffer: 512 B
                    =   httpContentBuffer: 1024 kB
                    =   networkTimeout:    5 s
                    = }
 Poll time          = 5 seconds with start delay 0 seconds
 Start monit httpd  = False

The service list contains the following entries:

Filesystem Name       = var
 Path                 = /var
 Monitoring mode      = active
 On reboot            = start
 Filesystem flags     = if changed then alert
 Space usage limit    = if greater than 95.0% then alert

Filesystem Name       = etc
 Path                 = /etc
 Monitoring mode      = active
 On reboot            = start
 Filesystem flags     = if changed then alert
 Space usage limit    = if greater than 95.0% then alert

System Name           = localhost.localdomain
 Monitoring mode      = active
 On reboot            = start
 CPU usage limit      = if greater than 0.0% for 5 cycles then alert
 CPU system limit     = if greater than 65.0% then alert
 CPU user limit       = if greater than 75.0% then alert
 Memory usage limit   = if greater than 85.0% then alert

-------------------------------------------------------------------------------
pidfile '/home/jones/monit_code/.monit.pid' does not exist
Starting Monit 5.19.0 daemon
'localhost.localdomain' Monit 5.19.0 started
'var' succeeded getting filesystem statistics for '/var'
'var' filesystem flags has not changed
'var' space usage test succeeded [current space usage=32.2%]
'etc' succeeded getting filesystem statistics for '/etc'
'etc' filesystem flags has not changed
'etc' space usage test succeeded [current space usage=32.2%]
'localhost.localdomain' cpu usage check succeeded [current cpu usage=0.0%]
'localhost.localdomain' cpu system usage check skipped (initializing)
'localhost.localdomain' cpu user usage check skipped (initializing)
'localhost.localdomain' mem usage check succeeded [current mem usage=37.8%]
'var' succeeded getting filesystem statistics for '/var'
'var' filesystem flags has not changed
'var' space usage test succeeded [current space usage=32.2%]
'etc' succeeded getting filesystem statistics for '/etc'
'etc' filesystem flags has not changed
'etc' space usage test succeeded [current space usage=32.2%]
'localhost.localdomain' cpu usage of 2.4% matches resource limit [cpu usage>0.0%]
'localhost.localdomain' cpu system usage check succeeded [current cpu system usage=0.8%]
'localhost.localdomain' cpu user usage check succeeded [current cpu user usage=1.4%]
'localhost.localdomain' mem usage check succeeded [current mem usage=37.8%]
'var' succeeded getting filesystem statistics for '/var'
'var' filesystem flags has not changed
'var' space usage test succeeded [current space usage=32.2%]
'etc' succeeded getting filesystem statistics for '/etc'
'etc' filesystem flags has not changed
'etc' space usage test succeeded [current space usage=32.2%]
'localhost.localdomain' cpu usage of 2.0% matches resource limit [cpu usage>0.0%]
'localhost.localdomain' cpu system usage check succeeded [current cpu system usage=0.6%]
'localhost.localdomain' cpu user usage check succeeded [current cpu user usage=1.1%]
'localhost.localdomain' mem usage check succeeded [current mem usage=37.8%]
'var' succeeded getting filesystem statistics for '/var'
'var' filesystem flags has not changed
'var' space usage test succeeded [current space usage=32.2%]
'etc' succeeded getting filesystem statistics for '/etc'
'etc' filesystem flags has not changed
'etc' space usage test succeeded [current space usage=32.2%]
'localhost.localdomain' cpu usage of 2.0% matches resource limit [cpu usage>0.0%]
'localhost.localdomain' cpu system usage check succeeded [current cpu system usage=0.6%]
'localhost.localdomain' cpu user usage check succeeded [current cpu user usage=0.9%]
'localhost.localdomain' mem usage check succeeded [current mem usage=37.8%]
'var' succeeded getting filesystem statistics for '/var'
'var' filesystem flags has not changed
'var' space usage test succeeded [current space usage=32.2%]
'etc' succeeded getting filesystem statistics for '/etc'
'etc' filesystem flags has not changed
'etc' space usage test succeeded [current space usage=32.2%]
'localhost.localdomain' cpu usage of 2.4% matches resource limit [cpu usage>0.0%]
'localhost.localdomain' cpu system usage check succeeded [current cpu system usage=0.7%]
'localhost.localdomain' cpu user usage check succeeded [current cpu user usage=1.0%]
'localhost.localdomain' mem usage check succeeded [current mem usage=37.8%]
'var' succeeded getting filesystem statistics for '/var'
'var' filesystem flags has not changed
'var' space usage test succeeded [current space usage=32.2%]
'etc' succeeded getting filesystem statistics for '/etc'
'etc' filesystem flags has not changed
'etc' space usage test succeeded [current space usage=32.2%]
'localhost.localdomain' cpu usage of 2.0% matches resource limit [cpu usage>0.0%]
'localhost.localdomain' cpu system usage check succeeded [current cpu system usage=0.6%]
'localhost.localdomain' cpu user usage check succeeded [current cpu user usage=1.0%]
'localhost.localdomain' mem usage check succeeded [current mem usage=37.8%]
'var' succeeded getting filesystem statistics for '/var'
'var' filesystem flags has not changed
'var' space usage test succeeded [current space usage=32.2%]
'etc' succeeded getting filesystem statistics for '/etc'
'etc' filesystem flags has not changed
'etc' space usage test succeeded [current space usage=32.2%]
'localhost.localdomain' cpu usage of 2.4% matches resource limit [cpu usage>0.0%]
'localhost.localdomain' cpu system usage check succeeded [current cpu system usage=0.6%]
'localhost.localdomain' cpu user usage check succeeded [current cpu user usage=0.9%]
'localhost.localdomain' mem usage check succeeded [current mem usage=37.8%]


Syslog output.

Oct 10 10:25:32 localhost monit[16217]: Starting Monit 5.19.0 daemon
Oct 10 10:25:32 localhost monit[16217]: 'localhost.localdomain' Monit 5.19.0 started
Oct 10 10:25:37 localhost monit[16217]: 'localhost.localdomain' cpu usage of 2.4% matches resource limit [cpu usage>0.0%]
Oct 10 10:25:42 localhost monit[16217]: 'localhost.localdomain' cpu usage of 2.0% matches resource limit [cpu usage>0.0%]
Oct 10 10:25:47 localhost monit[16217]: 'localhost.localdomain' cpu usage of 2.0% matches resource limit [cpu usage>0.0%]
Oct 10 10:25:52 localhost monit[16217]: 'localhost.localdomain' cpu usage of 2.4% matches resource limit [cpu usage>0.0%]
Oct 10 10:25:57 localhost monit[16217]: 'localhost.localdomain' cpu usage of 2.0% matches resource limit [cpu usage>0.0%]
Oct 10 10:26:03 localhost monit[16217]: 'localhost.localdomain' cpu usage of 2.4% matches resource limit [cpu usage>0.0%]




<monitrc>



