Hi Martin,
I'm not sure how to trace this back to a particular host via the logs, but in my experience the #trace option in bb-hosts has a high chance of causing it. I ended up limiting the number of hops traceroute could make, and the errors then went away. If this is the cause of your problem, you could at least narrow down the hosts by whether or not they respond to pings and whether they are set to trace on failure. The default maximum TTL (number of hops) for traceroute is 30 (on my distribution); I limited it to 10 in hobbitserver.cfg:
TRACEROUTE="traceroute -m10"
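The narrowing-down step can be sketched roughly as below. This is only a minimal sketch under a few assumptions: that host entries in bb-hosts look like "IP hostname # tag tag ...", that the option appears as a standalone "trace" tag, and that the function name hosts_with_trace and the sample hosts are made up for illustration. Cross-check the resulting list against the hosts that are currently failing their ping test.

```python
def hosts_with_trace(bb_hosts_text):
    """Return hostnames whose bb-hosts entry carries a standalone 'trace' tag."""
    traced = []
    for line in bb_hosts_text.splitlines():
        line = line.strip()
        # Assumption: host entries start with an IPv4 address, so anything
        # not starting with a digit (comments, page/group lines) is skipped.
        if not line or not line[0].isdigit():
            continue
        entry, _, tags = line.partition("#")
        fields = entry.split()
        if len(fields) < 2:
            continue
        # Exact token match, so 'notrace' is not mistaken for 'trace'.
        if "trace" in tags.split():
            traced.append(fields[1])
    return traced


# Hypothetical sample data, not taken from any real bb-hosts file.
SAMPLE = """\
# comment line
page servers Servers
10.0.0.1 alpha.example.com # trace ssh
10.0.0.2 beta.example.com # http://beta.example.com/
10.0.0.3 gamma.example.com # notrace
0.0.0.0 delta.example.com # trace
"""

if __name__ == "__main__":
    print(hosts_with_trace(SAMPLE))
```

To use it on a real installation, read your own bb-hosts file into a string and pass it in; the path depends on how Hobbit was installed.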
Regards, Phil
2009/4/16 Martin Flemming <martin.flemming at desy.de>
Hi !
I've also been getting warning mails from xymon:
bbtest warning (YELLOW)
Error output:
Timeout waiting for data from child, killing it
Timeout waiting for data from child, killing it
Child process terminated with signal 15
Unfortunately I haven't found the host that causes this error. Can somebody give me a hint on how to detect it?
In hobbitlaunch.cfg, I've extended the network test command to:
CMD bbtest-net --report --ping --checkresponse --dnslog=/usr/lib/hobbit/serverlogs/dns.log --debug
Nothing appears in dns.log, and far too much appears in bb-network.log .. :-(
So, apart from "error" and/or "failed", which entries in the logfile are important for finding the offending host?
Is there another way to debug this problem?
Thanks in advance & cheers
martin

---------- Forwarded message ----------
Date: Fri, 10 Apr 2009 15:23:53 +0200 (CEST)
From: Hobbit user <hobbit at mail.desy.de>
To: hobbit-patrol at desy.de
Subject: Hobbit [94627] it-wgs02:bbtest warning (YELLOW)
yellow Fri Apr 10 15:21:13 2009
bbtest-net version 4.3.0-0.20071026
SSL library : OpenSSL 0.9.7a Feb 19 2003
LDAP library: OpenLDAP 20213
Statistics:
Hosts total : 1791
Hosts with no tests : 730
Total test count : 1324
Status messages : 1327
Alert status msgs : 0
Transmissions : 14
DNS statistics:
hostnames resolved : 1063
succesful : 1063
failed : 0
calls to dnsresolve : 1322
TCP test statistics:
TCP tests total : 150
HTTP tests : 7
Simple TCP tests : 143
Connection attempts : 150
bytes written : 1740
bytes read : 23223
Error output:
Timeout waiting for data from child, killing it
Timeout waiting for data from child, killing it
Child process terminated with signal 15
TIME SPENT
Event                                    Starttime          Duration
bbtest-net startup                       1239369673.473591
Service definitions loaded               1239369673.488615  0.015024
Tests loaded                             1239369688.393655  14.905040
DNS lookups completed                    1239369688.410264  0.016609
Test engine setup completed              1239369688.429766  0.019502
TCP tests completed                      1239369699.001287  10.571521
PING test completed (1061 hosts)         1239369737.658479  38.657192
PING test results sent                   1239369767.700406  30.041927
Test result collection completed         1239369767.703455  0.003049
LDAP test engine setup completed         1239369767.703457  0.000002
LDAP tests executed                      1239369767.703459  0.000002
LDAP tests result collection completed   1239369767.703461  0.000002
NTP tests executed                       1239369767.814258  0.110797
RPC tests executed                       1239369769.447013  1.632755
Test results transmitted                 1239369769.547149  0.100136
bbtest-net completed                     1239369769.552037  0.004888
TIME TOTAL                                                  96.078446