Two DNS lookups for a server but one fails
Hi all,
I have a number of servers that run their own DNS service, each giving out a single IP address. These DNS servers are monitored in bb-hosts like this (names and IPs changed to protect the guilty):
1.2.3.4 dns1.server.com # smtp dns=a:smtp.server.com,ns:smtp.server.com
The issue is that this configuration performs two DNS lookups, one for an NS record for smtp.server.com and one for an A record for smtp.server.com. When the test runs, either the NS or the A record is returned, but not both. The one that fails shows the following in the web interface:
Service dns on mc20.lon.server.com is not OK : Service unavailable
*** DNS lookup of 'a:smtp.server.com' *** Timeout (channel destroyed)
In this instance it was the A record that failed but in others it is the NS record. I always get one of the queries back successfully, but not both.
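To rule out the name server itself, the two lookups can be repeated by hand with dig. The sketch below is only an illustration: the helper name dns_spec_to_dig is made up, it assumes every entry in the dns= list has the form type:name, and the single try and 5-second timeout are assumed values, not something Xymon is known to use. It prints the dig commands instead of running them, so they can be reviewed first.

```shell
#!/bin/sh
# Print one dig command per entry of a "dns=" test spec.
# Usage: dns_spec_to_dig SERVER_IP SPEC
# Assumption: each comma-separated entry looks like "type:name".
dns_spec_to_dig() {
    server=$1
    spec=$2
    old_ifs=$IFS
    IFS=,
    for entry in $spec; do
        rtype=${entry%%:*}   # text before the first colon, e.g. "a"
        name=${entry#*:}     # text after the first colon
        upper=$(printf '%s' "$rtype" | tr 'a-z' 'A-Z')
        # +tries=1 and +time=5 are standard dig options; values are assumed.
        printf 'dig @%s %s %s +tries=1 +time=5\n' "$server" "$name" "$upper"
    done
    IFS=$old_ifs
}

# The bb-hosts entry from the post above (addresses are placeholders).
dns_spec_to_dig 1.2.3.4 "a:smtp.server.com,ns:smtp.server.com"
```

If both commands return answers when run by hand, the problem is more likely in how the test issues the queries than in the DNS server.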
These were working fine until I upgraded to Xymon 4.2.2 so this looks like the culprit. Any ideas or suggestions?
|\/|artin
Martin W. Ward TAC Network Systems Team Leader COLT Unit 12, Powergate Business Park Volt Avenue, Park Royal, London NW10 6PW, United Kingdom
Tel: + 44 (0)20 7863 5218 Internal: 8 441 5218 Fax: + 44 (0)20 7863 5610 Email: martin.ward at colt.net www.colt.net
Data | Voice | Managed Services
The message is intended for the named addressee only and may not be disclosed to or used by anyone else, nor may it be copied in any way.
The contents of this message and its attachments are confidential and may also be subject to legal privilege. If you are not the named addressee and/or have received this message in error, please advise us by e-mailing security at colt.net and delete the message and any attachments without retaining any copies.
Internet communications are not secure and COLT does not accept responsibility for this message, its contents nor responsibility for any viruses.
No contracts can be created or varied on behalf of COLT Telecommunications, its subsidiaries or affiliates ("COLT") and any other party by email Communications unless expressly agreed in writing with such other party.
Please note that incoming emails will be automatically scanned to eliminate potential viruses and unsolicited promotional emails. For more information refer to www.colt.net or contact us on +44(0)20 7390 3900.
Hi Martin,
On Mon, Jan 05, 2009 at 01:58:56PM -0000, Ward, Martin wrote:
*** DNS lookup of 'a:smtp.server.com' *** Timeout (channel destroyed)
In this instance it was the A record that failed but in others it is the NS record. I always get one of the queries back successfully, but not both.
These were working fine until I upgraded to Xymon 4.2.2 so this looks like the culprit. Any ideas or suggestions?
There was a change made in 4.2.2, backported from the 4.3.x code, to fix a bug that could cause the network tests to lock up while doing the DNS lookups. It is probably that "fix" that causes the problem.
Going over the DNS code again, I think there's some flawed logic in how it handles the lookups. Could you try the attached version of xymon-4.2.2/bbnet/dns.c? Just copy it on top of the existing one, then run "make" and copy the resulting xymon-4.2.2/bbnet/bbtest-net binary to your ~xymon/server/bin/ directory (save the existing one just in case this completely breaks stuff).
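The steps above could be scripted roughly as follows. This is a sketch, not something from the Xymon distribution: the function name install_patched_dns and its three path arguments are made up for illustration, the backup name bbtest-net.orig is arbitrary, and it assumes "make" is run from the top of the source tree.

```shell
#!/bin/sh
# Sketch of the rebuild procedure: back up, replace dns.c, build, install.
# Usage: install_patched_dns NEW_DNS_C SRC_DIR BIN_DIR
install_patched_dns() {
    new_dns_c=$1   # the replacement dns.c
    src_dir=$2     # unpacked source tree, e.g. xymon-4.2.2
    bin_dir=$3     # e.g. ~xymon/server/bin

    # Keep the existing binary so the change can be rolled back.
    cp -p "$bin_dir/bbtest-net" "$bin_dir/bbtest-net.orig" || return 1

    # Copy the new source file over the old one and rebuild.
    cp "$new_dns_c" "$src_dir/bbnet/dns.c" || return 1
    (cd "$src_dir" && ${MAKE:-make}) || return 1

    # Install the freshly built binary.
    cp "$src_dir/bbnet/bbtest-net" "$bin_dir/bbtest-net"
}

# Example call (all paths are placeholders):
# install_patched_dns ./dns.c ./xymon-4.2.2 ~xymon/server/bin
```

To roll back, copy bbtest-net.orig over bbtest-net again.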
Let me know if that is better.
Regards, Henrik
Hi Henrik,
I compiled that in and installed it but it seems to have messed up all the remote port checks. All my ssh port tests, which are initiated from the server, are now purple, as well as the DNS checks, syslog port checks and others besides.
Rebuilding with the previous version has restored the remote port checks as well as the dual-DNS-check errors.
|\/|artin
Hi.
We have been experiencing another DNS check problem since the upgrade to Xymon 4.2.2. Since upgrading, I sometimes get "Timeout (channel destroyed) Seconds: 4.999" on two DNS servers at an offsite location (connected over VPN). The problem started immediately after the upgrade, so I think it is related; it never happened with 4.2.0. Has the timeout been changed in the new version? Anyhow, I compiled and installed the new dns.c and have not experienced any "purple" issues. Now I will just wait and see whether the DNS check alerts continue to appear.
/Johan
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
Last night, after installing the new bbtest-net, we received an alarm on bbtest for the Xymon server, saying " - Program crashed Fatal signal caught!"
From hobbitlaunch.log: "2009-01-08 05:05:07 Task bbnet terminated by signal 6"
/Johan
If the program crashes, you should have a core dump. Run gdb on the core dump and post the result here for Henrik to look at.
-- Regards Lars Ebeling
http://leopg9.no-ip.org Hobbithobbyist
"I am not young enough to know everything." -- Oscar Wilde
Where should that core dump be located? There is no dump located in the server/bin/ directory :(
/Johan
Run find:
find / -name "core*" -print
Lars
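If find turns up nothing, it may be that no core file was written at all. On Linux that typically depends on the core file size limit of the process that started bbtest-net and on the kernel's core_pattern setting. A small sketch to print both for the current shell follows; core_settings is a made-up helper name, and the limit seen by hobbitlaunch may differ from the one in an interactive shell.

```shell
#!/bin/sh
# Print the settings that usually decide whether and where a core file
# is written. Reports values for the current shell only.
core_settings() {
    # Core file size limit; 0 typically means no core file is written.
    echo "core size limit: $(ulimit -c)"
    # Linux only: the kernel's naming pattern for core files.
    if [ -r /proc/sys/kernel/core_pattern ]; then
        echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
    fi
}

core_settings
```

A limit of 0 can be raised with "ulimit -c unlimited" in the shell or script that launches the process, before the next crash.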
Hi.
I think I was able to run gdb the correct way... Here is the output:
gdb ../bin/bbtest-net core
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /usr/lib/i686/cmov/libssl.so.0.9.8...done.
Loaded symbols for /usr/lib/i686/cmov/libssl.so.0.9.8
Reading symbols from /usr/lib/i686/cmov/libcrypto.so.0.9.8...done.
Loaded symbols for /usr/lib/i686/cmov/libcrypto.so.0.9.8
Reading symbols from /lib/tls/i686/cmov/libc.so.6...done.
Loaded symbols for /lib/tls/i686/cmov/libc.so.6
Reading symbols from /lib/tls/i686/cmov/libdl.so.2...done.
Loaded symbols for /lib/tls/i686/cmov/libdl.so.2
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/tls/i686/cmov/libnss_files.so.2...done.
Loaded symbols for /lib/tls/i686/cmov/libnss_files.so.2
Core was generated by `bbtest-net --report --ping --checkresponse'.
Program terminated with signal 6, Aborted.
#0 0xb7efa410 in ?? ()
/Johan
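The gdb output above stops at frame #0 without a stack trace. gdb prints the call stack with its bt (backtrace) command, either typed at the (gdb) prompt or passed on the command line. A minimal sketch, assuming gdb is installed and that the binary and core file are the same ones used above; the function name core_backtrace is hypothetical:

```shell
#!/bin/sh
# Print a backtrace from a core file without an interactive gdb session.
# Usage: core_backtrace BINARY COREFILE
core_backtrace() {
    binary=$1
    core=$2
    if [ ! -r "$core" ]; then
        echo "core file not readable: $core" >&2
        return 1
    fi
    # -batch: run the given commands and exit; -ex: execute one gdb command.
    gdb -batch -ex "bt" "$binary" "$core"
}

# Example call, matching the paths used above:
# core_backtrace ../bin/bbtest-net core
```

If the binary was built without debug symbols, the frames may still show up as "??", in which case the backtrace will be of limited use.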
Hi,
Same issue on my side. I used the newly compiled bbtest-net built from the corrected dns.c. All is OK, but I got the same issue with the same error message.
This looks remarkably like the error we were experiencing. What are you running on?
Do you have the case where all the network tests - ping, http, https, ftp etc. go purple? If so it might be the same.
With Henrik's assistance, we narrowed it down to a problem with the ARES resolver. I changed our hobbitlaunch.cfg file to include the --no-ares option (see below), and we have never had the issue again.
[bbnet]
        ENVFILE /usr/lib/hobbit/server/etc/hobbitserver.cfg
        NEEDS hobbitd
        CMD bbtest-net --report --ping --checkresponse --no-ares
        LOGFILE $BBSERVERLOGS/bb-network.log
        INTERVAL 5m
(Of course this does come with a caveat. See bbtest-net man page in Xymon docs http://www.xymon.com/hobbit/help/manpages/man1/bbtest-net.1.html )
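After editing hobbitlaunch.cfg it can be worth confirming that the option really ended up on the bbtest-net command line. A small sketch, assuming the CMD line sits on its own line as in a normal hobbitlaunch.cfg; has_no_ares is a hypothetical helper, not part of Xymon, and the path in the example is a placeholder.

```shell
#!/bin/sh
# Succeed if a CMD line that runs bbtest-net includes --no-ares.
# Usage: has_no_ares CFGFILE
has_no_ares() {
    # First grep keeps only CMD lines that mention bbtest-net;
    # the second checks those lines for the option.
    grep -E '^[[:space:]]*CMD[[:space:]].*bbtest-net' "$1" |
        grep -q -- '--no-ares'
}

# Example call (placeholder path):
# has_no_ares path/to/hobbitlaunch.cfg && echo "--no-ares is set"
```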
YMMV.
Cheers V
NOTICE: This email and any attachments are confidential. They may contain legally privileged information or copyright material. You must not read, copy, use or disclose them without authorisation. If you are not an intended recipient, please contact us at once by return email and then delete both messages and all attachments.
Hi,
I had purple on all the net tests (ping, http, https...). Does your solution of using the --no-ares option apply to the recompiled bbtest-net or to the original one? (In other words, does it fix the DNS test problem or the mass-purple problem?)
It fixed both problems. It worked with the bbtest-net as it was. --no-ares is a standard option.
On Fri, 9 Jan 2009 16:16:02 +0900, "Everett, Vernon" <Vernon.Everett at woodside.com.au> wrote:
It fixed both problems. It worked with the bbtest-net as it was. --no-ares is a standard option.
-----Original Message----- From: doctor at makelofine.org [mailto:doctor at makelofine.org] Sent: Friday, 9 January 2009 3:59 PM To: hobbit at hswn.dk Subject: RE: [hobbit] Two DNS lookups for a server but one fails
On Fri, 9 Jan 2009 15:43:58 +0900, "Everett, Vernon" <Vernon.Everett at woodside.com.au> wrote:
This looks remarkably like the error we were experiencing. What are you running on?
Do you have the case where all the network tests - ping, http, https, ftp etc. go purple? If so it might be the same.
With Henrik's assistance, we resolved it down to a problem with the ARES resolver. I changed our hobbitlaunch.cfg file to include the --no-ares option (see below), and we have never had the issue again. [bbnet] ENVFILE /usr/lib/hobbit/server/etc/hobbitserver.cfg NEEDS hobbitd CMD bbtest-net --report --ping --checkresponse --no-ares LOGFILE $BBSERVERLOGS/bb-network.log INTERVAL 5m
(Of course this does come with a caveat. See bbtest-net man page in Xymon docs http://www.xymon.com/hobbit/help/manpages/man1/bbtest-net.1.html )
YMMV.
Cheers V
-----Original Message----- From: doctor at makelofine.org [mailto:doctor at makelofine.org] Sent: Friday, 9 January 2009 3:34 PM To: hobbit at hswn.dk Subject: Re: [hobbit] Two DNS lookups for a server but one fails
On Thu, 8 Jan 2009 09:24:29 +0100, "Lars Ebeling" <lars.ebeling at leopg9.no-ip.org> wrote:
----- Original Message ----- From: "Johan Sjöberg" <johan.sjoberg at deltamanagement.se> To: <hobbit at hswn.dk> Sent: Thursday, January 08, 2009 9:04 AM Subject: RE: [hobbit] Two DNS lookups for a server but one fails
This night, after installing the new bbtest-net, we received an alarm on bbtest for the Xymon server, saying " - Program crashed Fatal signal caught!"
From hobbitlaunch.log: "2009-01-08 05:05:07 Task bbnet terminated by signal 6"
/Johan
If the program crash, you should have a coredump. Run gdb on the coredump
and post the result here, for Henrik to look at.
-- Regards Lars Ebeling
http://leopg9.no-ip.org Hobbithobbyist
"I am not young enough to know everything." -- Oscar Wilde
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
Hi,
Same issue on my side. I used the newly compiled bbtest-net built with the corrected dns.c. All is OK, but I got the same issue with the same error message.
NOTICE: This email and any attachments are confidential. They may contain legally privileged information or copyright material. You must not read, copy, use or disclose them without authorisation. If you are not an intended recipient, please contact us at once by return email and then delete both messages and all attachments.
Hi,
I had purple on all the net tests (ping, http, https...). Did you use the --no-ares option with the recompiled bbtest-net or with the original one? (In other words, did your solution fix the DNS test problem or the massive purple problem?)
Using --no-ares does not get rid of the "Timeout (channel destroyed)" errors with the original bbtest-net binary. With the corrected one, it seems to work. I'll wait to see whether the network tests stay alive and keep you informed.
We did not experience any "purple" problems, only this single crash of the bbtest-net. But I suppose that several consecutive crashes would have caused the tests to go purple.
/Johan
Johan,
The "purple problems" are a different view of this same error. Having experienced both, I can say that you either get bbtest-net to crash or you find that all your remote connection tests (where bbtest-net verifies whether a remote port is accessible) turn purple. You may possibly get both at the same time, but I haven't noticed that yet.
I can confirm that using the new dns.c code bbtest-net crashes on my Solaris 10 system whether I use the --no-ares option or not. 8-(
|\/|artin
Please note that incoming emails will be automatically scanned to eliminate potential viruses and unsolicited promotional emails. For more information refer to www.colt.net or contact us on +44(0)20 7390 3900.
OK, I found a core file and have gleaned the following (I removed the symbol load messages):
Core was generated by `bbtest-net --report --ping --checkresponse --no-ares'.
Program terminated with signal 6, Aborted.
#0 0xfec64a27 in _lwp_kill () from /lib/libc.so.1
(gdb) bt
#0 0xfec64a27 in _lwp_kill () from /lib/libc.so.1
#1 0xfec621d4 in thr_kill () from /lib/libc.so.1
#2 0xfec111c7 in raise () from /lib/libc.so.1
#3 0xfebf15d9 in abort () from /lib/libc.so.1
#4 0x0806cbea in sigsegv_handler (signum=11) at sig.c:52
#5 0xfec63def in __sighndlr () from /lib/libc.so.1
#6 0xfec5a292 in call_user_handler () from /lib/libc.so.1
#7 <signal handler called>
#8 0x0806db51 in strbuf_addtobuffer (buf=0x10, newtext=0x8106960 "@\f\a\bÀX\022\b\b", newlen=135419976) at strfunc.c:100
#9 0x08061783 in dns_detail_callback (arg=0x812ee90, status=16, abuf=0x0, alen=0) at dns2.c:215
#10 0x08070ba0 in end_squery (squery=0x81258c0, status=134700378, abuf=0x0, alen=0) at ares_search.c:185
#11 0x08070c6f in search_callback (arg=0x81258c0, status=134700378, abuf=0x0, alen=0) at ares_search.c:179
#12 0x08072fac in qcallback (arg=0x8106960, status=16, abuf=0x0, alen=0) at ares_query.c:110
#13 0x080728b4 in ares_destroy (channel=0x8125848) at ares_destroy.c:40
#14 0x08060ff3 in dns_test_server (serverip=0x0, hostname=0x810781c "a:mx.colt.net,ns:mx.colt.net", banner=0x8106f60) at dns.c:362
#15 0x08057bf7 in run_nslookup_service (service=0x0) at bbtest-net.c:970
#16 0x0805abf9 in main (argc=5, argv=0x804675c) at bbtest-net.c:2218
(gdb)
|\/|artin
Well I have poked and prodded at the code for the last few hours but to no avail. I can't actually see why strbuf_addtobuffer() is shown as having rubbish passed to it in the newtext variable when it is being called from dns2.c with a static text string: "Undocumented ARES return code\n"
Also I am unsure why it's even reaching this part of the code when I specified --no-ares on the command line.
Any ideas?
|\/|artin
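An editorial aside on the odd values in the backtrace: some of the integer arguments gdb prints look like addresses rather than real lengths or status codes. The Python sketch below only does the hex conversion on numbers copied from the trace. The helper name as_pointer is made up for illustration, and reading this as a sign that gdb's argument display is unreliable for those frames (for example in an optimised build) is a guess, not a diagnosis.

```python
# Values copied from the gdb backtrace posted in this thread.
newlen_frame8 = 135419976     # newlen in frame #8 (strbuf_addtobuffer)
status_frame10 = 134700378    # status in frames #10 and #11
channel_frame13 = 0x8125848   # channel pointer in frame #13 (ares_destroy)


def as_pointer(value: int) -> str:
    """Format an integer as a zero-padded 32-bit hex address, gdb style."""
    return f"0x{value:08x}"


print(as_pointer(newlen_frame8))          # 0x08125848
print(as_pointer(status_frame10))         # 0x08075d5a
print(newlen_frame8 == channel_frame13)   # True
```

So the "newlen" shown in frame #8 is numerically identical to the channel pointer passed to ares_destroy() in frame #13, and the "status" in frames #10 and #11 sits close to the code addresses listed in the trace. That may explain why newtext also appears to contain rubbish: the displayed arguments in those frames might simply not be the real ones.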
OK, it seems that the bugfix I added in Xymon 4.2.2 had some nasty side-effects.
I have started work on a 4.2.3 maintenance tree in Subversion, and there is a new version of the DNS code in it currently. http://hobbitmon.svn.sourceforge.net/viewvc/hobbitmon/branches/4.2.3/
You can try out this code (download it from the link above), but it also involves a change from C-ARES 1.2.1 -> 1.6.0, so you will have to re-run the configure script to perhaps pick up a new runtime library that the new C-ARES requires (librt).
I ran it for most of Friday afternoon at work with no obvious bad effects, so I hope it will work better than the current code.
In <1F7B01020EC4D04DA17703634B9E888E09BFE6DF at ULPGCTMVMAI003.EU.COLT> "Ward, Martin" <Martin.Ward at colt.net> writes:
Also I am unsure why it's even reaching this part of the code when I specified --no-ares on the command line.
Xymon still uses ARES to perform the "dns" tests for specific hosts; the standard resolver library does not allow you to specify what DNS server to query. So the --no-ares option only has effect on the DNS lookups Hobbit performs to determine the IP of the hosts it is testing, it does not affect the specific testing of a DNS server.
Regards, Henrik
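To illustrate Henrik's point about choosing the DNS server: a call such as Python's socket.gethostbyname() takes only a hostname and always goes through the system resolver, so testing one particular server means building the query yourself (or using a library such as C-ARES, as Xymon does). The sketch below is not Xymon code. It is a minimal Python example, with the hypothetical helpers build_query and query_server, that sends a single A or NS question over UDP to a server of your choosing.

```python
import socket
import struct

# Record type numbers as defined in RFC 1035.
QTYPES = {"a": 1, "ns": 2}


def build_query(name: str, qtype: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet containing one question."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending in a zero byte.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", QTYPES[qtype], 1)  # class IN
    return header + question


def query_server(server_ip: str, name: str, qtype: str, timeout: float = 5.0) -> bytes:
    """Send the query to one specific server and return the raw reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server_ip, 53))
        reply, _ = sock.recvfrom(512)
        return reply
```

With the placeholder values from the original post, query_server("1.2.3.4", "smtp.server.com", "a") and query_server("1.2.3.4", "smtp.server.com", "ns") would correspond to the two lookups in the dns=a:...,ns:... test. Parsing the reply and retrying are left out.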
Thanks for that Henrik.
I took a copy of my existing 4.2.2 code, overwrote the bbnet directory with the SVN source and recreated the c-ares subdirectory in the bbnet subdir by untar-ing c-ares.1.6.0.tar.gz and renaming the c-ares.1.6.0/ subdirectory to c-ares/
Having done this the compilation failed as it couldn't find a library for the clock_gettime() function. I found that adding "-lrt" to the "LDAPLIBS" variable in the Makefile in the top level source directory solved the problem, but this may not be the right place to put it. A little more fiddling was required because my SSL libraries are not located in the standard library locations and I hate having to set up LD_LIBRARY_PATH everywhere, so my LDAPLIBS and SSLLIBS ended up looking like:
SSLLIBS = -L/usr/local/ssl/lib -R/usr/local/ssl/lib -lssl -lcrypto
LDAPLIBS = -L/usr/lib -lldap -lrt
After compiling it successfully I copied bbtest-net into the server/bin directory and restarted Xymon. Apart from a timeout issue with a local script, which I believe is a config issue on my part, things look like they are working fine now. The double DNS lookups are working OK and I am getting remote port connection tests working, so that's good for me.
Thanks Henrik!
|\/|artin
Is it possible to use the "badTEST" syntax for DNS tests? When I try it, the DNS test stops updating in Xymon.
/Johan
In <DB2B837627295643A7BC1627A2B4DD9D129C55 at dmwin01.ad.deltamanagement.se> Johan Sjöberg <johan.sjoberg at deltamanagement.se> writes:
Is it possible to use the "badTEST" syntax for DNS tests? When I try it, the DNS test stops updating in Xymon.
My immediate response would be "yes", but I haven't checked the code. Should work, though.
What does your bb-hosts entry look like ?
Regards, Henrik
It looked like this during the tests:
10.225.72.5 xxx.xxx.xxx # baddns:1:2:4 devmon:model(compaq;server)
With this entry, the dns test stopped updating.
/Johan
-----Original Message----- From: Henrik "Størner [mailto:henrik at hswn.dk] Sent: den 22 januari 2009 15:31 To: hobbit at hswn.dk Subject: Re: [hobbit] Two DNS lookups for a server but one fails
In <DB2B837627295643A7BC1627A2B4DD9D129CB0 at dmwin01.ad.deltamanagement.se> Johan Sjöberg <johan.sjoberg at deltamanagement.se> writes:
It looked like this during the tests:
10.225.72.5 xxx.xxx.xxx # baddns:1:2:4 devmon:model(compaq;server)
Ah, I see. The "baddns" tag only tells Xymon how to handle failures of the DNS check. You still need the "dns" tag to enable DNS checking at all! So your entry should have been:
10.225.72.5 xxx.xxx.xxx # baddns:1:2:4 dns devmon:model(compaq;server)
Regards, Henrik
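For anyone who wants to catch this kind of mistake across a larger bb-hosts file, here is a minimal sketch in Python. It is not part of Xymon; the function name `missing_tests` and the tag-parsing rules (everything after "#" is a whitespace-separated tag, a test name ends at the first "=", ":" or "(", and "!", "?" and "~" are prefix modifiers) are assumptions made for illustration. It only reports tests that have a "badTEST" tag but no plain "TEST" tag on the same line.

```python
import re

# Matches "badTEST" tags such as "baddns:1:2:4". An optional
# "-something" part between the test name and the first ":" is skipped.
BAD_TAG = re.compile(r"^bad([A-Za-z0-9_]+?)(?:-[^:]*)?:")


def missing_tests(line):
    """Return test names that have a badTEST tag but no matching TEST tag.

    Hypothetical helper: assumes tags are the whitespace-separated words
    after the first "#" on a bb-hosts line.
    """
    _, _, tags = line.partition("#")
    enabled = set()
    bad = set()
    for tag in tags.split():
        match = BAD_TAG.match(tag)
        if match:
            bad.add(match.group(1))
        else:
            # Reduce "dns=a:host" to "dns" and "!smtp" to "smtp"
            # (the prefix characters are an assumption).
            name = re.split(r"[=:(]", tag.lstrip("!?~"), maxsplit=1)[0]
            enabled.add(name)
    return sorted(bad - enabled)


if __name__ == "__main__":
    broken = "10.225.72.5 xxx.xxx.xxx # baddns:1:2:4 devmon:model(compaq;server)"
    fixed = "10.225.72.5 xxx.xxx.xxx # baddns:1:2:4 dns devmon:model(compaq;server)"
    print(missing_tests(broken))
    print(missing_tests(fixed))
```

Run against the two entries from this thread, the first line is flagged for "dns" and the second is not. Treat the output as a hint only, since the real bb-hosts syntax has more cases than this sketch handles.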
Ah, thanks, I will try that.
/Johan
-----Original Message----- From: Henrik Størner [mailto:henrik at hswn.dk] Sent: 22 January 2009 16:02 To: hobbit at hswn.dk Subject: Re: [hobbit] Two DNS lookups for a server but one fails
participants (6)
- doctor@makelofine.org
- henrik@hswn.dk
- johan.sjoberg@deltamanagement.se
- lars.ebeling@leopg9.no-ip.org
- Martin.Ward@colt.net
- Vernon.Everett@woodside.com.au