Re: [monit] Question about multi-host testing


From: Pablo Iranzo Gómez
Subject: Re: [monit] Question about multi-host testing
Date: Tue, 30 Oct 2007 21:47:31 +0100 (CET)

        Will try tomorrow, and let you know if this works.
        Again, thank you very much for your analysis and explanation.

        Pablo



-- 
Pablo Iranzo Gómez
(http://Alufis35.uv.es/~iranzo/)
(PGPKey Available on http://www.uv.es/~iranzop/PGPKey.pgp)
                  --
Boling's Postulate on Murphy's Law:

If you are feeling good, don't worry. You'll get over it.

On Tue, 30 Oct 2007, Martin Pala wrote:

> The problem is that the dependency was designed primarily for critical
> actions (start/stop/restart/monitor/unmonitor), where the correct order is
> needed.
>
> An alert-only action doesn't trigger the dependency (action chain),
> since it may be purely informative.
>
> For example, if you are monitoring icmp, you can have a few error
> levels, such as:
>
> --8<--
> check host myrouter with address ...
>    if failed icmp type echo for 3 times within 5 cycles then alert
>    if failed icmp type echo for 5 cycles then exec "/script/to/power-cycle/router"
> --8<--
>
> In this case monit sends an alert when the network has problems but is not
> completely dead (some packets are lost) and can still recover by itself;
> that shouldn't disable the monitoring of remote hosts. When the error
> ratio is 100% for 5 cycles (the second icmp line), monit can exec, for
> example, a script to power-cycle the router (via a networked power
> switch ... point-to-point or on the same ethernet switch, so that it is
> reachable even if the router is not available).
>
> So, the final solution could be to extend the dependency with an option
> that makes the service dependency hard even on an alert action (to stop
> monitoring the other services).
>
> A workaround could be to define dummy start/stop methods for the monitored
> remote hosts and use the restart action instead of alert (it sends an alert
> as well). Something like:
>
> --8<--
> check host myswitch ...
>    start program = "/bin/true"
>    stop program = "/bin/true"
>    if failed icmp type echo for 5 cycles then restart
>
> check host myrouter ...
>    start program = "/bin/true"
>    stop program = "/bin/true"
>    if failed icmp type echo for 5 cycles then restart
>    depends on myswitch
> --8<--
>
> ... not tested, but it may work (although the restart action doesn't look
> logical, it can trigger the dependency in this case as well).
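>
> Applied to the two hosts from the configuration quoted below, the same idea
> might look like the following sketch. It is equally untested and assumes
> that the restart action with dummy /bin/true start/stop programs is
> enough to trigger the dependency; the host names, addresses and alert line
> are copied from that configuration, and only the start/stop lines and the
> restart action are new:
>
> --8<--
> check host ro5000-siNmG20876YFyCu20879 with address 10.39.16.1
>    # dummy methods, so that monit has something to run on restart
>    start program = "/bin/true"
>    stop program = "/bin/true"
>    if failed icmp type echo count 1 timeout 10 seconds then restart
>    alert address@hidden with reminder on 1 cycle
>
> check host pos10.5000-siNmG20876YFyCu20879 with address 10.39.16.10
>    start program = "/bin/true"
>    stop program = "/bin/true"
>    if failed icmp type echo count 1 timeout 10 seconds then restart
>    alert address@hidden with reminder on 1 cycle
>    # intended effect: a restart of the router entry also acts on this host
>    depends on ro5000-siNmG20876YFyCu20879
> --8<--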
>
>
> Martin
>
>
> Pablo Iranzo Gómez wrote:
> >     List, here is the output from monit running in interactive mode with
> > -vv:
> >
> > From log start:
> > -----------------------------------------------------------------------
> > Remote Host Name      = ro5000-siNmG20876YFyCu20879
> >  Monitoring mode      = active
> >  ICMP                 = if failed Echo Request count 1 with timeout 10
> > seconds 1 times within 1 cycle(s) then alert else if passed 1 times
> > within 1 cycle(s) then alert
> >  Alert mail to        = address@hidden
> >    Alert on           = All events
> >    Alert reminder     = 1 cycles
> >
> > Remote Host Name      = pos10.5000-siNmG20876YFyCu20879
> >  Monitoring mode      = active
> >  Depends on Service   = ro5000-siNmG20876YFyCu20879
> >  ICMP                 = if failed Echo Request count 1 with timeout 10
> > seconds 1 times within 1 cycle(s) then alert else if passed 1 times
> > within 1 cycle(s) then alert
> >  Alert mail to        = address@hidden
> >    Alert on           = All events
> >    Alert reminder     = 1 cycles
> >
> >
> > From Log checking:
> > -----------------------------------------------------------------------
> > 'ro5000-siNmG20876YFyCu20879' icmp ping failed
> > 'ro5000-siNmG20876YFyCu20879' failed ICMP test [Echo Request]
> > ICMP failed notification is sent to address@hidden
> > 'ro5000-siNmG20876YFyCu20879' icmp ping failed, skipping any port
> > connection tests
> > 'pos10.5000-siNmG20876YFyCu20879' icmp ping failed
> > 'pos10.5000-siNmG20876YFyCu20879' failed ICMP test [Echo Request]
> > ICMP failed notification is sent to address@hidden
> > 'pos10.5000-siNmG20876YFyCu20879' icmp ping failed, skipping any port
> > connection tests
> >
> >
> > Config files:
> > -----------------------------------------------------------------------
> > check host ro5000-siNmG20876YFyCu20879 with address 10.39.16.1
> >         if failed ICMP type ECHO count 1 timeout 10 seconds then alert
> >         alert address@hidden with reminder on 1 cycle
> >
> > check host pos10.5000-siNmG20876YFyCu20879 with address 10.39.16.10
> >         if failed ICMP type ECHO count 1 timeout 10 seconds then alert
> >         alert address@hidden with reminder on 1 cycle
> >         depends on ro5000-siNmG20876YFyCu20879
> >
> >
> >
> >     Any hint?
> >
> >     Thanks in advance,
> >     Pablo
> >
> >
> > On Mon, 29-10-2007 at 21:51 +0100, Pablo Iranzo Gómez wrote:
> >>    Martin,
> >>
> >> On Mon, 29 Oct 2007, Martin Pala wrote:
> >>> Can you run monit in verbose mode (-v option) and send the log? You'll
> >>> see in it what happened in more detail.
> >>    Sure, will do it tomorrow early in the morning :)
> >>
> >>>>  If I just put "if failed icmp then alert" monit complains about
> >>>> the configuration (I'm using monit-4.9-1), so either I'm doing something
> >>>> wrong or it's a problem with this version.
> >>> I'm sorry - this was a typo (I wrote the example just from memory, so the
> >>> "type echo" was missing).
> >>    Don't worry, I was just trying it in case I did something wrong
> >> :)
> >>
> >>    Thanks again
> >>    Pablo
> >>
> >>
> >> --
> >> To unsubscribe:
> >> http://lists.nongnu.org/mailman/listinfo/monit-general
> >>
>
>
>
