monit-general

Question about monitoring fs


From: Bryan Harris
Subject: Question about monitoring fs
Date: Mon, 7 Nov 2016 14:25:38 -0500

Hi folks,

We've run into a weird situation, where our SAN does "something"
during the nightly backups causing our file systems to stay in
"quiesced" mode forever (the same thing as fsfreeze -f).  It happened
after an update of the SAN firmware, so right now the storage folks
are working with the vendor.

In the meantime we have had some random server freezes.  So far the
best I can manage is a whole bunch of Monit checks in which every
server touches a file on each of the other servers.  It seems like
overkill, but when a file system freezes I will still get an email.

check program <server>-sanity
  with path "/usr/bin/ssh <server> touch .sanity"
  as uid "account" as gid "account" timeout 1 second
  alert address@hidden not on { instance, action }
  with reminder on 30 cycles
  if status != 0 then alert

Can anybody recommend a simpler solution?  This seems like a lot of
machinery just to see whether a file system got stuck.  The problem
with a "self-check" is that postfix can't send an email while the
file system is stuck like this, so it appears that I need something
external.
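
For illustration, here is a rough Python sketch of the kind of write
probe that the "touch .sanity" in each check amounts to.  It is not
something we run today.  The function name probe_write and its
timeout_s parameter are made up for this example.  It assumes a write
to a frozen file system can block without responding to signals, so
the write happens in a child process and the parent gives up after a
deadline instead of waiting for the child to finish.

```python
import subprocess
import sys

# Hypothetical helper: the name and default timeout are assumptions.
def probe_write(path, timeout_s=1.0):
    """Return True if a small write to `path` completes within timeout_s."""
    # The child does the actual write.  If the file system is frozen the
    # child may hang, so the parent never waits on it without a deadline.
    child = subprocess.Popen(
        [sys.executable, "-c",
         "import os, sys\n"
         "fd = os.open(sys.argv[1], os.O_WRONLY | os.O_CREAT, 0o600)\n"
         "os.write(fd, b'x')\n"
         "os.fsync(fd)\n"
         "os.close(fd)\n",
         path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        # Exit status 0 means the write and fsync both succeeded in time.
        return child.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: a stuck child may ignore the signal, so we
        # deliberately do not wait for it again.
        child.kill()
        return False
```

A wrapper that exits 0 when probe_write() returns True and 1 otherwise
could be used as the "with path" program of a check.  It would still
have to be started from another host (for example over ssh, as in the
check above) for the alert to get out while the local file system is
stuck.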

Thanks for any advice.

V/r,
Bryan


