Xymon Dependencies configuration.
Hi,
I am trying to configure xymon dependencies so that if the core router is down my xymon server only pages me for the core router.
In reading the man page it says to do something like the following:
1.2.3.4 cg1.example.com # noconn https://cg1.example.com depends=(http:router.example.com/conn)
The above works for a single service, but that host, for example, has both http and sslcert. How can I tell xymon that if router.example.com is down, all of the other services for a host should go clear?
I tried setting the service to a *, but that does not work. I also tried listing services separated with either a comma or a pipe, but no joy.
Is this even possible? Even adding a 2nd depends statement does not work.
Also is there a way to test the syntax to know if I got it right short of waiting for 5 or 10 minutes to see if it worked?
Regards,
-- Tom me at tdiehl.org
On 03/06/2020 22:49, me at tdiehl.org wrote:
The above works for a single service but the above host for example has http and sslcert. How can I tell xymon that if router.example.com is down all of the other services for a host should go clear?
I tried setting the service to a *, but that does not work. I also tried listing services separated with either a comma or a pipe, but no joy.
"man hosts.cfg" suggests that the syntax you want is
depends=(testA:host1/test1,host2/test2),(testB:host3/test3)
so for your example,
depends=(http:router.example.com/conn),(sslcert:router.example.com/conn)
As the man page says, "depends" only applies to tests performed by xymonnet. Wildcards do not appear to be supported but protocols.cfg will show you most of the tests that xymonnet might perform.
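Since wildcards do not appear to be supported, one group per test has to be written out. A minimal sketch of generating that tag, assuming a hypothetical helper named build_depends (not part of Xymon) and using only the depends=(test:host/test),(test:host/test) form quoted above:

```shell
#!/bin/sh
# Hypothetical helper: build one depends=... tag that ties several tests
# on a host to a single upstream test (for example the router's conn).
build_depends() {
    target="$1"   # upstream test, e.g. router.example.com/conn
    shift
    out=""
    for t in "$@"; do
        # One parenthesised group per dependent test, joined by commas.
        out="${out:+$out,}($t:$target)"
    done
    printf 'depends=%s\n' "$out"
}

build_depends router.example.com/conn http ssh
```

The printed tag can be pasted after the # on the host's line in hosts.cfg. Whether a given test honours it still depends on that test being one xymonnet handles this way.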
Also is there a way to test the syntax to know if I got it right short of waiting for 5 or 10 minutes to see if it worked?
Because the tests are performed by xymonnet, you can just run that command yourself. Your tasks.cfg will show you the xymonnet command you're currently running, so something like
xymonnet $(list_of_options) cg1.example.com router.example.com
will test just those two hosts. (NB xymonnet might not be on your path so you may need to explicitly use the full path to the binary) You could add --no-update to have xymonnet dump messages to stdout rather than sending the status message to your xymon server if you prefer.
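The manual run described above could be wrapped roughly as follows. This is only a sketch: XYMONNET and XYMONNET_OPTS are placeholder variables to be filled in from your own tasks.cfg, and the helper name xymonnet_cmd is made up. It only prints the command line so it can be checked before anything is run.

```shell
#!/bin/sh
# Placeholders: copy the real binary path and options from tasks.cfg.
XYMONNET="${XYMONNET:-xymonnet}"
XYMONNET_OPTS="${XYMONNET_OPTS:-}"

# Print a one-off xymonnet command line for the hosts given as arguments.
xymonnet_cmd() {
    # --no-update makes xymonnet dump its messages to stdout instead of
    # sending status updates to the Xymon server.
    echo "$XYMONNET ${XYMONNET_OPTS:+$XYMONNET_OPTS }--no-update $*"
}

xymonnet_cmd cg1.example.com router.example.com
```

xymonnet generally expects the Xymon environment to be set up, so the printed command may need to be run as the xymon user, for instance through the xymoncmd wrapper if your installation provides it.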
Adam
Hi,
On Thu, 4 Jun 2020, Adam Thorn wrote:
"man hosts.cfg" suggests that the syntax you want is
depends=(testA:host1/test1,host2/test2),(testB:host3/test3)
so for your example,
depends=(http:router.example.com/conn),(sslcert:router.example.com/conn)
That does not work for the sslcert test but does work for things like ssh, which now makes sense given the info below.
As the man page says, "depends" only applies to tests performed by xymonnet. Wildcards do not appear to be supported but protocols.cfg will show you most of the tests that xymonnet might perform.
OK, that explains why neither the conn nor the sslcert test will go clear. Neither test is listed in protocols.cfg. Given that both of these tests are network type tests, it seems odd that they cannot be made to go clear on failure of another network test. I guess I do not really understand how Xymon works.
I was really hoping to be able to get a single alert when the router went down. It does not happen real often but it is a pita to get several hundred text messages for what is really a single failure.
Does anyone have a solution for these kinds of failures?
Regards,
-- Tom me at tdiehl.org
On Thu, Jun 4, 2020 at 3:36 PM <me at tdiehl.org> wrote:
Does anyone have a solution for these kinds of failures?
You could write an external script to connect to the router and "do stuff" if the connection fails.
For example, if you're checking the router every 5 minutes, when it fails you could send a "disable" message to Xymon for the list of things behind the router, with a 10 minute lifetime. That'll turn off alerts for all those devices. As long as the router continues to fail, keep on sending disables with 10 min lifetime, essentially extending the original lifetime. Once the router recovers, the disable message will expire up to 10 mins later and those devices will alert or not depending on their next status.
I don't have such a script, but it feels like it ought to be fairly trivial to implement.
Ralph Mitchell
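A rough sketch of the approach Ralph describes, under several assumptions: the variable names, the host list, and the use of ping as the router check are all illustrative, and the script is meant to be run from cron every 5 minutes. It relies on the xymon client's "disable HOSTNAME.TESTNAME DURATION text" message, where * as the test name covers all tests on a host. By default it only prints the messages it would send.

```shell
#!/bin/sh
# Illustrative settings; adjust to the local installation.
XYMON="${XYMON:-/usr/bin/xymon}"      # client binary, path as used in this thread
XYMSRV="${XYMSRV:-127.0.0.1}"         # Xymon server address
ROUTER="${ROUTER:-router.example.com}"
BEHIND="${BEHIND:-cg1.example.com}"   # space-separated hosts behind the router
LIFETIME="${LIFETIME:-10}"            # minutes, twice the 5-minute check interval

# Xymon messages write hostnames with commas in place of dots.
xymon_host() { printf '%s\n' "$1" | tr '.' ','; }

# Print one disable message per host behind the router.
disable_msgs() {
    for h in $BEHIND; do
        printf 'disable %s.* %s %s unreachable\n' \
            "$(xymon_host "$h")" "$LIFETIME" "$ROUTER"
    done
}

if [ "${DRY_RUN:-1}" = 1 ]; then
    # Default: just show what would be sent.
    disable_msgs
elif ! ping -c 3 "$ROUTER" >/dev/null 2>&1; then
    # Router unreachable: (re)send the disables, extending their lifetime.
    disable_msgs | while IFS= read -r msg; do
        "$XYMON" "$XYMSRV" "$msg"
    done
fi
```

Setting DRY_RUN=0 makes the script act on a failed ping. Nothing is sent when the router answers, so the last disable simply expires, as described above.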
On Thu, 4 Jun 2020, Ralph M wrote:
You could write an external script to connect to the router and "do stuff" if the connection fails.
For example, if you're checking the router every 5 minutes, when it fails you could send a "disable" message to Xymon for the list of things behind the router, with a 10 minute lifetime. That'll turn off alerts for all those devices. As long as the router continues to fail, keep on sending disables with 10 min lifetime, essentially extending the original lifetime. Once the router recovers, the disable message will expire up to 10 mins later and those devices will alert or not depending on their next status.
I don't have such a script, but it feels like it ought to be fairly trivial to implement.
Thanks, I think I will explore that.
Regards,
-- Tom me at tdiehl.org
On Thu, 4 Jun 2020, Ralph M wrote:
You could write an external script to connect to the router and "do stuff" if the connection fails.
For example, if you're checking the router every 5 minutes, when it fails you could send a "disable" message to Xymon for the list of things behind the router, with a 10 minute lifetime. That'll turn off alerts for all those devices. As long as the router continues to fail, keep on sending disables with 10 min lifetime, essentially extending the original lifetime. Once the router recovers, the disable message will expire up to 10 mins later and those devices will alert or not depending on their next status.
I don't have such a script, but it feels like it ought to be fairly trivial to implement.
In preparation for writing a script to do what I need, I have been playing with xymon commands.
If I send the following to xymon it appears to be ignoring the lifetime parameter:
/usr/bin/xymon 127.0.0.1 "status+10m EMD1-2,example,com.conn clear date test message"
The above command will send a status message to xymon, but it only stays clear for approx 30 seconds. If I am reading the man page correctly, it should stay clear for 10 minutes. Does anyone know what I am missing?
Regards,
-- Tom me at tdiehl.org
On Sat, Jun 6, 2020 at 3:36 PM <me at tdiehl.org> wrote:
In preparation for writing a script to do what I need, I have been playing with xymon commands.
If I send the following to xymon it appears to be ignoring the lifetime parameter:
/usr/bin/xymon 127.0.0.1 "status+10m EMD1-2,example,com.conn clear date test message"
The above command will send a status message to xymon, but it only stays clear for approx 30 seconds. If I am reading the man page correctly, it should stay clear for 10 minutes. Does anyone know what I am missing?
Is Xymon pinging that host? Its message would override your message. Try inventing a whole new column for the testing process, that way you can be sure it shouldn't flip state unexpectedly. Just replace .conn with any other string that is not used for a test.
For example:
/usr/bin/xymon 127.0.0.1 "status+10m EMD1-2,example,com.tdiehl clear date test message"
Best to keep it alphanumeric, but apart from that, use any word or random string you like. Bear in mind that the longer the string, the wider the display, so you might start running off the edge of your display. This may or may not be important to you now, but if you add enough custom tests, the sideways scrolling can be tiresome... :)
Ralph Mitchell
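The custom-column test above can be sketched as a small helper that composes the message. The function name status_msg is made up for illustration, the column name tdiehl and the +10m lifetime are taken from the example in this thread, and the only Xymon-specific detail assumed is that hostnames are written with commas instead of dots inside the message.

```shell
#!/bin/sh
# Hypothetical helper: compose a status message for a column that no
# Xymon test writes to, so nothing overrides it before the lifetime ends.
status_msg() {
    host="$1"; column="$2"; color="$3"; lifetime="$4"
    shift 4
    # Remaining arguments become the message text.
    printf 'status+%s %s.%s %s %s\n' "$lifetime" \
        "$(printf '%s' "$host" | tr '.' ',')" "$column" "$color" "$*"
}

status_msg EMD1-2.example.com tdiehl clear 10m "test message"
# To send it for real, pass the output to the client, for example:
#   /usr/bin/xymon 127.0.0.1 "$(status_msg EMD1-2.example.com tdiehl clear 10m 'test message')"
```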
On Sat, 6 Jun 2020, Ralph M wrote:
Is Xymon pinging that host? Its message would override your message. Try inventing a whole new column for the testing process, that way you can be sure it shouldn't flip state unexpectedly. Just replace .conn with any other string that is not used for a test.
That is what I am seeing. It does not matter if the service is red or green: set it to clear and it switches back to red or green in under 30 seconds. Thinking about it, that makes sense. Originally I thought that a message sent from the xymon command line would override the automatic updates for as long as the lifetime set on the message. Obviously I was wrong.
For example:
/usr/bin/xymon 127.0.0.1 "status+10m EMD1-2,example,com.tdiehl clear date test message"
Best to keep it alphanumeric, but apart from that, use any word or random string you like. Bear in mind that the longer the string, the wider the display, so you might start running off the edge of your display. This may or may not be important to you now, but if you add enough custom tests, the sideways scrolling can be tiresome... :)
That makes sense but for my purposes I am thinking instead of using clear I will disable the test. Blue is still better than red. :-)
Thanks for the confirmation and help.
Regards,
-- Tom me at tdiehl.org
Tom, I understand about the tons of msgs. But in my environment this is usually a firewall issue. I know this because the systems behind the router/firewall are still able to send data to the xymon server. It's just that the xymon server cannot get to the systems. Just some food for thought...
Dave
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
On Thu, 4 Jun 2020, David Boyer wrote:
Tom, I understand about the tons of msgs. But in my environment this is usually a firewall issue. I know this because the systems behind the router/firewall are still able to send data to the xymon server. It's just that the xymon server cannot get to the systems. Just some food for thought...
I understand, but the problem I am trying to solve is this: when the power or internet goes down for whatever reason, and there is no connection to anything that may or may not be running behind the firewall, I do not want or need 100+ alerts. To make matters worse, several of my customers are located within the same neighborhood, so when one customer goes down, the others follow.
My customers generally have a single internet feed with no backup power. I originally thought disabling dependent services would be the simple answer. Although it helps reduce the number of alerts I need to come up with a different solution to completely eliminate the unnecessary alerts.
Ralph M suggested that I might be able to script something that effectively does what I want. I am going to explore that, as it does not sound too complex. I just need to find the time to poke at this.
Regards,
-- Tom me at tdiehl.org
participants (4): alt36@cam.ac.uk, davieb@gmail.com, me@tdiehl.org, ralphmitchell@gmail.com