monit-general

Baffling status failure to alert


From: Paul Theodoropoulos
Subject: Baffling status failure to alert
Date: Fri, 11 Mar 2016 16:01:15 -0800

I'm stumped. I have an ugly little script to alert me if today's backup of a database is smaller than the one from yesterday (and the day before). The script works properly, and I have a simple Monit rule in place to alert me if it fails. When Monit runs the check, it reports a failure; that is pushed up to my M/Monit server, which also logs the failure. From there, all alerts go to PagerDuty. But I never get alerts from this check.

(Hopefully) all relevant output is below. Some strings have been obfuscated. Note that I have the rule modified to falsely report a failure, for testing.

address@hidden: /etc/monit/conf.d # cat /etc/debian_version
7.9

address@hidden: /etc/monit/conf.d # monit --version
This is Monit version 5.17
Built with ssl, without pam and with large files
Copyright (C) 2001-2016 Tildeslash Ltd. All Rights Reserved.

address@hidden: /etc/monit/conf.d # cat backups
check program backup_failure with path /usr/local/bin/check_backup with timeout 15 seconds
    not every "* 14 * * *"
    #if status != 0 then alert
    if status != 1 then alert

address@hidden: /etc/monit/conf.d # cat /usr/local/bin/check_backup
#!/bin/bash
# Compare the size of today's backup with those of the previous two days.
BACKUP_DIR=/var/backups
cd "${BACKUP_DIR}"
BUFILE=$(date +%Y_%m_%d)_"group".sql.gz
YDAY_BUFILE=$(date --date "1 days ago" +%Y_%m_%d)_"group".sql.gz
DAYBEFORE_YDAY_BUFILE=$(date --date "2 days ago" +%Y_%m_%d)_"group".sql.gz
if [ -e "${BUFILE}" ]; then
    # Sizes in du's default block units
    TDAYSIZE=$(du "${BUFILE}" | cut -f1)
    YDAYSIZE=$(du "${YDAY_BUFILE}" | cut -f1)
    DBDAYSIZE=$(du "${DAYBEFORE_YDAY_BUFILE}" | cut -f1)
    if [ "$YDAYSIZE" -gt "$DBDAYSIZE" ]; then
        if [ "$TDAYSIZE" -gt "$YDAYSIZE" ]; then
            exit 0
        fi
    else
        exit 1
    fi
fi
# No explicit exit on the remaining paths, so the script ends with status 0.
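
For comparison, here is a minimal sketch of the same size check with an explicit result on every path. It assumes one reading of the intent described above (fail when today's backup is smaller than either of the previous two, or when a file is missing); `size_kb` and `backup_ok` are hypothetical helper names, not part of the original script.

```shell
#!/bin/bash
# Sketch only: size_kb and backup_ok are hypothetical helpers.

# Print a file's disk usage in 1K blocks; fail if the file does not exist.
size_kb() {
    [ -e "$1" ] || return 1
    du -k -- "$1" | cut -f1
}

# Succeed (status 0) only if today's size is at least as large as both
# earlier sizes. A missing (empty) size counts as a failure (status 1).
backup_ok() {
    local today=$1 yday=$2 daybefore=$3
    [ -n "$today" ] && [ -n "$yday" ] && [ -n "$daybefore" ] || return 1
    [ "$today" -ge "$yday" ] && [ "$today" -ge "$daybefore" ]
}

# Possible use inside the check script (variable names from the script above):
#   backup_ok "$(size_kb "$BUFILE")" "$(size_kb "$YDAY_BUFILE")" \
#             "$(size_kb "$DAYBEFORE_YDAY_BUFILE")"
#   exit $?
```

With this shape the script's exit status is always decided by the last comparison, so a rule such as `if status != 0 then alert` sees a non-zero value whenever the check does not pass.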

address@hidden:/etc/monit/conf.d #  tail -1 /var/log/daemon.log
Mar 11 15:25:04 localhost monit[10562]: 'backup_failure' '/usr/local/bin/check_backup' failed with exit status (0) -- no output

address@hidden: ~ # monit status|tail -7
Program 'backup_failure'
  status                            Status failed
  monitoring status                 Monitored
  last started                      Fri, 11 Mar 2016 15:42:36
  last exit value                   0
  data collected                    Fri, 11 Mar 2016 15:42:36
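
One way to cross-check the exit value Monit records is to run the program by hand and print its status. A small sketch, where `report_status` is a hypothetical helper name:

```shell
# report_status is a hypothetical helper: run a command, then print its exit status.
report_status() {
    "$@"
    printf 'exit status: %s\n' "$?"
}

# Example: report_status /usr/local/bin/check_backup
```

The environment Monit runs the program in may differ from an interactive shell, so this is only a rough comparison.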

What am I missing?
-- 
Paul Theodoropoulos
www.anastrophe.com
