Re: [monit] can't monitor one of my filesystems
From: zachlac
Subject: Re: [monit] can't monitor one of my filesystems
Date: Wed, 12 May 2010 06:42:59 -0700 (PDT)
You are right, it is not mounted. This is the difficulty in building a
monitoring system on machines which I'm not familiar with :)
Thank you.
Martin Pala wrote:
>
> Thanks for the output.
>
> It seems the reason could be that the device is not mounted: it was not
> found in /etc/mtab. The statvfs() interface, which is used to get
> filesystem usage, needs a path to an object on the filesystem being
> tested, so when a device name is used, monit translates it to a mount point
> using /etc/mtab. There is currently no fine-grained error message for this
> state; it is caught by the test itself, which logs the general "unable to
> read filesystem /dev/sda2 state".
>
> Please can you check that /dev/sda2 is mounted and that it can be found in
> /etc/mtab?
>
>
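The lookup Martin describes above can be sketched roughly as follows. This is only an illustration in Python, not Monit's actual code: the function names, the sample mtab text, and the usage formula are my own assumptions. It finds the mount point recorded for a device in mtab-style text and then asks statvfs() about that mount point.

```python
import os

# Hypothetical mtab contents for illustration only; not taken from the real machine.
SAMPLE_MTAB = """\
/dev/mapper/VolGroup00-LogVol00 / ext3 rw 0 0
/dev/sda1 /boot ext3 rw 0 0
/dev/sdb1 /data ext3 rw 0 0
"""


def find_mount_point(device, mtab_text):
    """Return the mount point listed for `device` in mtab-style text, or None."""
    for line in mtab_text.splitlines():
        fields = line.split()
        # Each entry reads: device mountpoint fstype options dump pass
        if len(fields) >= 2 and fields[0] == device:
            return fields[1]
    return None


def space_usage_percent(path):
    """Percentage of blocks in use on the filesystem that contains `path`.

    One common way to derive usage from statvfs(); Monit may compute it differently.
    """
    st = os.statvfs(path)
    if st.f_blocks == 0:
        return 0.0
    return 100.0 * (st.f_blocks - st.f_bfree) / st.f_blocks


def device_space_usage(device, mtab_text):
    """Return usage for `device`, or None when it is not listed (not mounted)."""
    mount_point = find_mount_point(device, mtab_text)
    if mount_point is None:
        return None  # the case that surfaces as "unable to read filesystem ... state"
    return space_usage_percent(mount_point)
```

With the sample text, find_mount_point("/dev/sda2", SAMPLE_MTAB) returns None because there is no entry for that device, which corresponds to the unmounted case in this thread. On a real system the text would come from reading /etc/mtab.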
> On May 6, 2010, at 9:33 PM, zachlac wrote:
>
>>
>> Here's the output of monit -vI. I do not believe that it's a virtual
>> machine.
>>
>> [EDT May 6 15:28:57] debug : monit: pidfile '/var/run/monit.pid' does
>> not exist
>> [EDT May 6 15:28:57] info : Starting monit daemon with http
>> interface
>> at [www.***************.com:2812]
>> [EDT May 6 15:28:57] info : Starting monit HTTP server at
>> [www.***************.com:2812]
>> [EDT May 6 15:28:57] info : monit HTTP server started
>> [EDT May 6 15:28:57] info : 'www' Monit started
>> [EDT May 6 15:28:57] debug : Monit instance changed notification is
>> sent
>> to address@hidden
>> [EDT May 6 15:28:57] debug : cannot open file /proc/32077/stat -- No
>> such file or directory
>> [EDT May 6 15:28:57] debug : system statistic error -- cannot read
>> /proc/32077/stat
>> [EDT May 6 15:28:57] debug : 'www' cpu wait usage check succeeded
>> [current cpu wait usage=-1.0%]
>> [EDT May 6 15:28:57] debug : 'www' cpu system usage check succeeded
>> [current cpu system usage=-1.0%]
>> [EDT May 6 15:28:57] debug : 'www' cpu user usage check succeeded
>> [current cpu user usage=-1.0%]
>> [EDT May 6 15:28:57] debug : 'www' swap usage check succeeded
>> [current
>> swap usage=0.0%]
>> [EDT May 6 15:28:57] debug : 'www' mem usage check succeeded [current
>> mem usage=34.8%]
>> [EDT May 6 15:28:57] debug : 'www' loadavg(5min) check succeeded
>> [current loadavg(5min)=0.1]
>> [EDT May 6 15:28:57] debug : 'www' loadavg(1min) check succeeded
>> [current loadavg(1min)=0.0]
>> [EDT May 6 15:28:57] debug : 'apache_bin' file existence check
>> succeeded
>> [EDT May 6 15:28:57] debug : 'apache_bin' is a regular file
>> [EDT May 6 15:28:57] debug : 'apache_bin' has valid checksums
>> [EDT May 6 15:28:57] debug : 'apache_bin' permission check succeeded
>> [current permission=0755]
>> [EDT May 6 15:28:57] debug : 'apache_bin' uid check succeeded
>> [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'apache_bin' gid check succeeded
>> [current
>> gid=0]
>> [EDT May 6 15:28:57] debug : 'apache_rc' file existence check
>> succeeded
>> [EDT May 6 15:28:57] debug : 'apache_rc' is a regular file
>> [EDT May 6 15:28:57] debug : 'apache_rc' has valid checksums
>> [EDT May 6 15:28:57] debug : 'apache_rc' permission check succeeded
>> [current permission=0755]
>> [EDT May 6 15:28:57] debug : 'apache_rc' uid check succeeded [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'apache_rc' gid check succeeded [current
>> gid=0]
>> [EDT May 6 15:28:57] debug : 'sendmail_bin' file existence check
>> succeeded
>> [EDT May 6 15:28:57] debug : 'sendmail_bin' is a regular file
>> [EDT May 6 15:28:57] debug : 'sendmail_bin' has valid checksums
>> [EDT May 6 15:28:57] debug : 'sendmail_bin' permission check
>> succeeded
>> [current permission=6755]
>> [EDT May 6 15:28:57] debug : 'sendmail_bin' uid check succeeded
>> [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'sendmail_bin' gid check succeeded
>> [current
>> gid=51]
>> [EDT May 6 15:28:57] debug : 'sendmail_rc' file existence check
>> succeeded
>> [EDT May 6 15:28:57] debug : 'sendmail_rc' is a regular file
>> [EDT May 6 15:28:57] debug : 'sendmail_rc' has valid checksums
>> [EDT May 6 15:28:57] debug : 'sendmail_rc' permission check succeeded
>> [current permission=0755]
>> [EDT May 6 15:28:57] debug : 'sendmail_rc' uid check succeeded
>> [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'sendmail_rc' gid check succeeded
>> [current
>> gid=0]
>> [EDT May 6 15:28:57] debug : 'dovecot_bin' file existence check
>> succeeded
>> [EDT May 6 15:28:57] debug : 'dovecot_bin' is a regular file
>> [EDT May 6 15:28:57] debug : 'dovecot_bin' has valid checksums
>> [EDT May 6 15:28:57] debug : 'dovecot_bin' permission check succeeded
>> [current permission=0755]
>> [EDT May 6 15:28:57] debug : 'dovecot_bin' uid check succeeded
>> [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'dovecot_bin' gid check succeeded
>> [current
>> gid=0]
>> [EDT May 6 15:28:57] debug : 'dovecot_rc' file existence check
>> succeeded
>> [EDT May 6 15:28:57] debug : 'dovecot_rc' is a regular file
>> [EDT May 6 15:28:57] debug : 'dovecot_rc' has valid checksums
>> [EDT May 6 15:28:57] debug : 'dovecot_rc' permission check succeeded
>> [current permission=0755]
>> [EDT May 6 15:28:57] debug : 'dovecot_rc' uid check succeeded
>> [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'dovecot_rc' gid check succeeded
>> [current
>> gid=0]
>> [EDT May 6 15:28:57] debug : 'ntpd_bin' file existence check
>> succeeded
>> [EDT May 6 15:28:57] debug : 'ntpd_bin' is a regular file
>> [EDT May 6 15:28:57] debug : 'ntpd_bin' has valid checksums
>> [EDT May 6 15:28:57] debug : 'ntpd_bin' permission check succeeded
>> [current permission=0755]
>> [EDT May 6 15:28:57] debug : 'ntpd_bin' uid check succeeded [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'ntpd_bin' gid check succeeded [current
>> gid=0]
>> [EDT May 6 15:28:57] debug : 'ntpd_rc' file existence check succeeded
>> [EDT May 6 15:28:57] debug : 'ntpd_rc' is a regular file
>> [EDT May 6 15:28:57] debug : 'ntpd_rc' has valid checksums
>> [EDT May 6 15:28:57] debug : 'ntpd_rc' permission check succeeded
>> [current permission=0755]
>> [EDT May 6 15:28:57] debug : 'ntpd_rc' uid check succeeded [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'ntpd_rc' gid check succeeded [current
>> gid=0]
>> [EDT May 6 15:28:57] debug : 'sshd_bin' file existence check
>> succeeded
>> [EDT May 6 15:28:57] debug : 'sshd_bin' is a regular file
>> [EDT May 6 15:28:57] debug : 'sshd_bin' has valid checksums
>> [EDT May 6 15:28:57] debug : 'sshd_bin' permission check succeeded
>> [current permission=0755]
>> [EDT May 6 15:28:57] debug : 'sshd_bin' uid check succeeded [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'sshd_bin' gid check succeeded [current
>> gid=0]
>> [EDT May 6 15:28:57] debug : 'sshd_rc' file existence check succeeded
>> [EDT May 6 15:28:57] debug : 'sshd_rc' is a regular file
>> [EDT May 6 15:28:57] debug : 'sshd_rc' has valid checksums
>> [EDT May 6 15:28:57] debug : 'sshd_rc' permission check succeeded
>> [current permission=0755]
>> [EDT May 6 15:28:57] debug : 'sshd_rc' uid check succeeded [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'sshd_rc' gid check succeeded [current
>> gid=0]
>> [EDT May 6 15:28:57] debug : 'apache' zombie check succeeded
>> [status_flag=0000]
>> [EDT May 6 15:28:57] debug : 'apache' loadavg(5min) check succeeded
>> [current loadavg(5min)=0.1]
>> [EDT May 6 15:28:57] debug : 'apache' children check succeeded
>> [current
>> children=13]
>> [EDT May 6 15:28:57] debug : 'apache' total mem amount check
>> succeeded
>> [current total mem amount=263024kB]
>> [EDT May 6 15:28:57] debug : 'apache' cpu usage check skipped
>> (initializing)
>> [EDT May 6 15:28:57] debug : [EDT May 6 15:28:57] debug :
>> 'apache'
>> succeeded connecting to INET[www.***************.com:80] via TCP
>> [EDT May 6 15:28:57] debug : 'apache' succeeded testing protocol
>> [HTTP]
>> at INET[www.***************.com:80] via TCP
>> [EDT May 6 15:28:57] debug : 'sendmail' zombie check succeeded
>> [status_flag=0000]
>> [EDT May 6 15:28:57] debug : 'sendmail' succeeded connecting to
>> INET[localhost:25] via TCP
>> [EDT May 6 15:28:57] debug : 'sendmail' succeeded testing protocol
>> [SMTP] at INET[localhost:25] via TCP
>> [EDT May 6 15:28:57] debug : 'dovecot' zombie check succeeded
>> [status_flag=0000]
>> [EDT May 6 15:28:57] debug : 'dovecot' succeeded connecting to
>> INET[localhost:993] via TCPSSL
>> [EDT May 6 15:28:57] debug : 'dovecot' succeeded testing protocol
>> [IMAP]
>> at INET[localhost:993] via TCPSSL
>> [EDT May 6 15:28:57] debug : 'ntp' zombie check succeeded
>> [status_flag=0000]
>> [EDT May 6 15:28:57] debug : 'ssh' zombie check succeeded
>> [status_flag=0000]
>> [EDT May 6 15:28:57] debug : 'datafs_sdb1' permission check succeeded
>> [current permission=0640]
>> [EDT May 6 15:28:57] debug : 'datafs_sdb1' uid check succeeded
>> [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'datafs_sdb1' gid check succeeded
>> [current
>> gid=6]
>> [EDT May 6 15:28:57] debug : 'datafs_sdb1' inode usage check
>> succeeded
>> [current inode usage=1.5%]
>> [EDT May 6 15:28:57] debug : 'datafs_sdb1' inode usage check
>> succeeded
>> [current inode usage=1.5%]
>> [EDT May 6 15:28:57] debug : 'datafs_sdb1' space usage check
>> succeeded
>> [current space usage=69.6%]
>> [EDT May 6 15:28:57] debug : 'datafs_sdb1' space usage check
>> succeeded
>> [current space usage=69.6%]
>> [EDT May 6 15:28:57] debug : 'swap_sdb2' permission check succeeded
>> [current permission=0640]
>> [EDT May 6 15:28:57] debug : 'swap_sdb2' uid check succeeded [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'swap_sdb2' gid check succeeded [current
>> gid=6]
>> [EDT May 6 15:28:57] debug : 'swap_sdb2' inode usage check succeeded
>> [current inode usage=0.1%]
>> [EDT May 6 15:28:57] debug : 'swap_sdb2' inode usage check succeeded
>> [current inode usage=0.1%]
>> [EDT May 6 15:28:57] debug : 'swap_sdb2' space usage check succeeded
>> [current space usage=14.6%]
>> [EDT May 6 15:28:57] debug : 'swap_sdb2' space usage check succeeded
>> [current space usage=14.6%]
>> [EDT May 6 15:28:57] debug : 'boot_sda1' permission check succeeded
>> [current permission=0640]
>> [EDT May 6 15:28:57] debug : 'boot_sda1' uid check succeeded [current
>> uid=0]
>> [EDT May 6 15:28:57] debug : 'boot_sda1' gid check succeeded [current
>> gid=6]
>> [EDT May 6 15:28:57] debug : 'boot_sda1' inode usage check succeeded
>> [current inode usage=0.1%]
>> [EDT May 6 15:28:57] debug : 'boot_sda1' inode usage check succeeded
>> [current inode usage=0.1%]
>> [EDT May 6 15:28:57] debug : 'boot_sda1' space usage check succeeded
>> [current space usage=24.2%]
>> [EDT May 6 15:28:57] debug : 'boot_sda1' space usage check succeeded
>> [current space usage=24.2%]
>> [EDT May 6 15:28:57] error : 'datafs_sda2' unable to read filesystem
>> /dev/sda2 state
>> [EDT May 6 15:28:57] debug : Data access error notification is sent
>> to
>> address@hidden
>> [EDT May 6 15:28:58] debug : 'rootfs_logical' space usage check
>> succeeded [current space usage=35.6%]
>> [EDT May 6 15:28:58] debug : ICMP echo response 1/3 succeeded --
>> received id=38340 sequence=0 response_time=0.000171s
>> [EDT May 6 15:28:58] debug : 'shade' icmp ping succeeded [response
>> time
>> 0.000s]
>> [EDT May 6 15:28:58] debug : 'shade' succeeded connecting to
>> INET[xxx.xxx.xxx.xxx:22] via TCP
>> [EDT May 6 15:28:58] debug : 'shade' succeeded testing protocol [SSH]
>> at
>> INET[xxx.xxx.xxx.xxx:22] via TCP
>> [EDT May 6 15:29:07] debug : HttpRequest error: HTTP/1.0 401 You are
>> not
>> authorized to access monit. Either you supplied the wrong credentials
>> (e.g.
>> bad password), or your browser doesn't understand how to supply the
>> credentials required
>> [EDT May 6 15:29:09] debug : HttpRequest error: HTTP/1.0 404 There is
>> no
>> service by that name
>> [EDT May 6 15:29:13] debug : HttpRequest error: HTTP/1.0 404 There is
>> no
>> service by that name
>>
>>
>>
>> Martin Pala wrote:
>>>
>>> Is the system a virtual machine of some type (VPS, etc.) or a real/physical
>>> machine? If it is virtual, it is possible that access is rejected based
>>> on host OS restrictions. There can also be other access control
>>> restrictions, for example if you use SELinux ...
>>>
>>> The svn repository contains the development version of 5.2 in various
>>> development stages (some features may be incomplete), and the features
>>> may not have been tested yet; the exact codebase depends on when you updated
>>> the source code. The problems you have shouldn't be specific to
>>> 5.2-development anyway, as there were no changes that could cause
>>> behavior like this, but it would be good to verify the behavior with the
>>> official 5.1.1 version.
>>>
>>> Please can you also run monit with debug enabled and provide the full
>>> output?
>>>
>>> monit -vI
>>>
>>>
>>>
>>>
>>> On May 4, 2010, at 4:05 PM, zachlac wrote:
>>>
>>>>
>>>> sda2 cannot be monitored, while sda1 can:
>>>>
>>>> # ls -l /dev/sda2
>>>> brw-r----- 1 root disk 8, 2 Feb 19 14:22 /dev/sda2
>>>> # ls -l /dev/sda1
>>>> brw-r----- 1 root disk 8, 2 Feb 19 14:22 /dev/sda1
>>>>
>>>> I'm using the repository version of monit, which is 5.2.
>>>>
>>>> Thank you.
>>>>
>>>>
>>>> Martin Pala wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> LVM shouldn't be a problem. Please can you provide the output of "ls -l
>>>>> /dev/sda2"? Which monit version do you use? There was a problem in monit
>>>>> <= 4.10.1 when the device was a symlink; support for device symlinks
>>>>> was added in Monit 5.0 (the current version is Monit 5.1.1).
>>>>>
>>>>> Optionally you can use the mount point instead of the device.
>>>>>
>>>>> Regards,
>>>>> Martin
>>>>>
>>>>>
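To illustrate the symlink point above: LVM volumes are often reached through an alias that is a symlink to the real device node, so a device name may need to be resolved before it is compared with the names in /etc/mtab. A minimal sketch in Python, assuming plain symlink resolution is enough. The function name is hypothetical and this is not how Monit 5.0 necessarily implements it.

```python
import os


def canonical_device(path):
    """Follow any symlinks in `path` and return the resolved absolute path."""
    # os.path.realpath resolves every symlink component; a path that is not
    # a symlink comes back as its normalized absolute form.
    return os.path.realpath(path)
```

The resolved name, rather than the alias, would then be the one to look up in the mount table.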
>>>>> On May 3, 2010, at 6:43 PM, zachlac wrote:
>>>>>
>>>>>>
>>>>>> I have monit monitoring /dev/sdb1, /dev/sdb2, and /dev/sda1. However,
>>>>>> /dev/sda2 is a Linux LVM, and when I try to monitor it I get a "Data
>>>>>> access error". My output for fdisk is as follows:
>>>>>> ----------------------------------------------------------------------------------------------
>>>>>> Disk /dev/sda: 200.0 GB, 200049647616 bytes
>>>>>> 255 heads, 63 sectors/track, 24321 cylinders
>>>>>> Units = cylinders of 16065 * 512 = 8225280 bytes
>>>>>>
>>>>>>    Device Boot      Start         End      Blocks   Id  System
>>>>>> /dev/sda1   *           1          13      104391   83  Linux
>>>>>> /dev/sda2              14       24321   195254010   8e  Linux LVM
>>>>>>
>>>>>> Disk /dev/sdb: 200.0 GB, 200049647616 bytes
>>>>>> 255 heads, 63 sectors/track, 24321 cylinders
>>>>>> Units = cylinders of 16065 * 512 = 8225280 bytes
>>>>>>
>>>>>>    Device Boot      Start         End      Blocks   Id  System
>>>>>> /dev/sdb1   *           1       12160    97675168+  83  Linux
>>>>>> /dev/sdb2           12161       24321    97683232+  83  Linux
>>>>>> --------------------------------------------------------------------------------------------
>>>>>>
>>>>>> My monitrc contains the following important lines:
>>>>>> ---------------------------------------------------------------------------------------------
>>>>>> check filesystem boot_sda1 with path /dev/sda1
>>>>>>   start program = "/bin/mount /data"
>>>>>>   stop program = "/bin/umount /data"
>>>>>>   if failed permission 640 then unmonitor
>>>>>>   if failed uid root then unmonitor
>>>>>>   if failed gid disk then unmonitor
>>>>>>   if space usage > 80% for 5 times within 15 cycles then alert
>>>>>>   if space usage > 99% then stop
>>>>>>   # if inode usage > 30000 then alert
>>>>>>   # if inode usage > 250000 then alert
>>>>>>   if inode usage > 80% then alert
>>>>>>   if inode usage > 99% then stop
>>>>>>   group server
>>>>>>
>>>>>> check filesystem datafs_sda2 with path /dev/sda2
>>>>>>   start program = "/bin/mount /data"
>>>>>>   stop program = "/bin/umount /data"
>>>>>>   if failed permission 640 then unmonitor
>>>>>>   if failed uid root then unmonitor
>>>>>>   if failed gid disk then unmonitor
>>>>>>   if space usage > 80% for 5 times within 15 cycles then alert
>>>>>>   if space usage > 99% then stop
>>>>>>   # if inode usage > 30000 then alert
>>>>>>   # if inode usage > 250000 then alert
>>>>>>   if inode usage > 80% then alert
>>>>>>   if inode usage > 99% then stop
>>>>>>   group server
>>>>>> ---------------------------------------------------------------------------------------------
>>>>>>
>>>>>> Why can't I monitor the LVM?
>>>>>>
>>>>>> Thank you.
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://old.nabble.com/-monit--can%27t-monitor-one-of-my-filesystems-tp28437378p28437378.html
>>>>>> Sent from the monit-general mailing list archive at Nabble.com.
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> To unsubscribe:
>>>>>> http://lists.nongnu.org/mailman/listinfo/monit-general
>>>>
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/-monit--can%27t-monitor-one-of-my-filesystems-tp28437378p28447734.html
>>>> Sent from the monit-general mailing list archive at Nabble.com.
>>
>> --
>> View this message in context:
>> http://old.nabble.com/-monit--can%27t-monitor-one-of-my-filesystems-tp28437378p28478533.html
>> Sent from the monit-general mailing list archive at Nabble.com.
--
View this message in context:
http://old.nabble.com/-monit--can%27t-monitor-one-of-my-filesystems-tp28437378p28536140.html
Sent from the monit-general mailing list archive at Nabble.com.