Re: Some feature notes for monit, volume III
Wed, 14 Jul 2004 20:02:47 +0200
>> ### Monit monitoring
>> Is there some way to detect that monit is successfully running? I
>> would like to ensure (from the cron) that the monit process does
>> not merely exist, but actually does the checks it should. That way
>> I would have a protection of cron-monit pair in both directions
>> (checking the cron's log timestamp from monit).
> This catch 22 situation has been discussed before and I think that
> the conclusion was to run monit from init (/etc/inittab). See also
> the FAQ for rationale. A corollary was recently discussed on this
> list, i.e. that monit may and should start rc-scripts. Although this
> should work without too much trouble now, it would help to have a way
> to specify the start order of daemons. We will add this to monit..
Although monit may be run from init (I'm aware of this), and a process
started that way is usually considered assured, there are unfortunately
several reasons why a monit that certainly started could still fail to
keep monitoring the system. For example, there can be a bug in monit
that is triggered by some system state, and AFAIK init will eventually
give up respawning a process. I'm going to spend hours and hours
configuring the monitoring of my servers, and it would be really sad to
realize after a crash that monit was unable to inform me because it was
dead...
From my point of view, it's always good to be able to monitor the
heartbeat of something that wakes up only when there are problems.
Touching a file every cycle is IMHO a good and cheap probe that monit
is still actively cycling. But I know there are only a few other admins
as fearful as I am, so I'm not prodding this idea further; I just want
to explain why I wrote this. There are certainly other (more complex)
ways to address my concern. :-)
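
To make the idea concrete, here is a minimal sketch of the cron-side
half of such a check. It assumes that monit (or something it runs every
cycle) touches a file; the path, the cycle length and the function
names are my own placeholders, not anything monit provides today.

```python
import os
import time

# Hypothetical values: adjust to wherever the heartbeat file would live
# and to the poll interval monit is actually started with.
HEARTBEAT_FILE = "/var/run/monit.heartbeat"
CYCLE_SECONDS = 120
MAX_MISSED_CYCLES = 3


def heartbeat_is_fresh(path, max_age_seconds, now=None):
    """Return True if `path` exists and was modified recently enough."""
    if now is None:
        now = time.time()
    try:
        mtime = os.stat(path).st_mtime
    except OSError:
        # A missing or unreadable file counts as a dead heartbeat.
        return False
    return (now - mtime) <= max_age_seconds


def check(path=HEARTBEAT_FILE):
    """Return a shell-style exit code: 0 if alive, 1 if stale."""
    max_age = CYCLE_SECONDS * MAX_MISSED_CYCLES
    return 0 if heartbeat_is_fresh(path, max_age) else 1
```

A cron job could call check() and send mail or restart monit on a
non-zero result. Allowing a few missed cycles avoids false alarms when
a single cycle runs long.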
>> ### Service textual comments
>> The monit web or e-mail output may sometimes be (temporarily) read
>> by nontechnical people. Attaching a comment to each service
>> explaining what this item is or exactly what to do in case of a
>> failure may be handy. What about something like this:
>> check file ... "/proc/mdstat" desc "Disk array check. In case of
>> any failure, please contact any admin ASAP! Telephone..."
> You may specify a special mail format per service and I think this
> should cover this request?
Of course. :-) I forgot to add a <description> tag to the output format
mock-ups below. It would IMHO fit into the reported structure, especially
if monit is someday able to put the information-rich XML/TEXT/HTML
output into the e-mail alerts. But this feature is just a "nice
add-on", of course. An inexperienced human reader should at least be
competent and educated enough to know what to do when he/she sees red.
I agree that all software, especially of this kind, should stay fully
functional, and that every feature proposal must be carefully considered
as to whether it would be used by most users or not. I don't feel gloomy
about rejected proposals. I write these to start a discussion that could
potentially lead to some better solution. But sadly, either my posts are
not worth a longer discussion or there are not many people wishing to
discuss at all. :-)
>> ### Statistics
>> There are special utilities that gather statistical data from
>> various parts of UNIX systems. MRTG, snmpd are examples. But when I
>> already run monit on my server and it is periodically collecting
>> most of the data I would like to have graphed, isn't it
>> unnecessary to run another daemon just to do a duplicate
>> get-this-valueset job?
> I admit that it's tempting to add stuff like this to monit. However I
> think this is best served by another program. This way monit stays
> orthogonal to its main purpose without too much bloat. We plan on
> extending m/monit to handle statistical data and presentations. I.e.
> monit only posts events to m/monit which will store the information in
> its database and handle the presentation.
Okay, that's your decision and the ultimate one, which I have to respect.
But still, the retrieving code is there (and there will be more of it),
the proper waking up is there, logging is there... as you already
said... tempting. :-)
>> I think it would be much better to unify the functions that read RAM
>> structures into one skeleton and call the tag printing routines
>> according to the format requested. This way all formats would
>> provide the same set of information and there is only one code to
> Excellent suggestion.
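
Just to illustrate what I mean by "one skeleton", here is a rough
sketch in Python rather than monit's own C. The class names, the
service dictionary layout and the tag names are all made up for this
example and do not reflect monit's real structures or output.

```python
from html import escape as html_escape
from xml.sax.saxutils import escape as xml_escape, quoteattr


class TextFormat:
    """Plain-text tag printers (hypothetical layout)."""
    def open_service(self, name):
        return "Service %s\n" % name

    def field(self, key, value):
        return "  %s: %s\n" % (key, value)

    def close_service(self):
        return ""


class XmlFormat:
    """XML tag printers; keys are assumed to be valid element names."""
    def open_service(self, name):
        return "<service name=%s>\n" % quoteattr(name)

    def field(self, key, value):
        return "  <%s>%s</%s>\n" % (key, xml_escape(str(value)), key)

    def close_service(self):
        return "</service>\n"


class HtmlFormat:
    """HTML tag printers."""
    def open_service(self, name):
        return "<h2>%s</h2>\n<table>\n" % html_escape(name)

    def field(self, key, value):
        return "<tr><th>%s</th><td>%s</td></tr>\n" % (
            html_escape(key), html_escape(str(value)))

    def close_service(self):
        return "</table>\n"


def render(services, fmt):
    """The single skeleton: walk the data once, delegate all printing."""
    out = []
    for svc in services:
        out.append(fmt.open_service(svc["name"]))
        for key, value in svc["fields"]:
            out.append(fmt.field(key, value))
        out.append(fmt.close_service())
    return "".join(out)
```

Because only render() walks the data, a newly added field (a
description, for instance) shows up in every format at once, and adding
a format means writing three small printers.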