Morning hobbiters
I am having an issue with hobbit 4.2.0 built on a RedHat 9.0 system.
Basically, after running for about 15 hours or so, the bbtest-net command hangs and all network tests turn purple. Here is the bbtest-net stanza from the hobbitlaunch.cfg file:
[bbnet]
	ENVFILE /BIG/usr/local/hobbit/server/etc/hobbitserver.cfg
	NEEDS hobbitd
	CMD bbtest-net --report --ping --checkresponse --debug
	LOGFILE $BBSERVERLOGS/bb-network.log
	INTERVAL 5m
The last thing in the log is the DNS lookup results for all the
hosts, and then ... nothing, i.e.:
2006-09-02 17:02:11 Got DNS result for host narya.servista.com : 192.168.1.32
2006-09-02 17:02:11 Got DNS result for host skye.int.servista.com : 192.168.1.45
2006-09-02 17:02:11 Got DNS result for host oban.int.servista.com : 192.168.1.72
2006-09-02 17:02:11 Got DNS result for host orkney.servista.com : 192.168.1.30
2006-09-02 17:02:11 Got DNS result for host islay.servista.com : 192.168.1.28
2006-09-02 17:02:11 Got DNS result for host tennessee.int.servista.com : 192.168.1.109
Has anyone had any experience of this kind of issue before? Is there
any way I can get some more logging to see what is happening?
Cheers
Iain Conochie UNIX Systems Administrator COLT Telecommunications PLC
On Mon, Sep 04, 2006 at 09:16:34AM +0100, Iain M Conochie wrote:
> I am having an issue with hobbit 4.2.0 built on a RedHat 9.0 system. Basically, after running for about 15 hours or so, the bbtest-net command hangs, and all network tests turn purple.
That's obviously bad. I suppose you killed off the bbtest-net process to get things running again. If it happens again, could you kill it with
	kill -6 <bbtest-net PID>
This causes it to dump a core file in the ~hobbit/server/tmp/ directory, which should help me track down where it is hanging.
Did you notice if the process was using a lot of cpu time, or if it was just completely stalled? Was there an "fping" or "hobbitping" process hanging around also?
Regards, Henrik
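The kill -6 step above can be tried out safely on a throwaway process first. The following is only a sketch: it uses a plain sleep as a stand-in for the hung bbtest-net (an assumption, not part of Hobbit), and whether a core file is actually written depends on the core size limit and kernel settings of the machine.

```shell
#!/bin/sh
# Stand-in for the hung bbtest-net process (assumption: a plain sleep).
# On the real server, PID would be the bbtest-net process ID instead,
# e.g. as found with: ps -ef | grep bbtest-net

# Allow core files to be written, if the hard limit permits it.
ulimit -c unlimited 2>/dev/null || true

sleep 300 &
PID=$!

# Signal 6 is SIGABRT; its default action terminates the process and
# asks the kernel to write a core dump.
kill -6 "$PID"

# The shell reports death by signal as 128 + signal number (128 + 6 = 134).
STATUS=0
wait "$PID" || STATUS=$?
echo "exit status: $STATUS"
```

An exit status of 134 confirms the process was terminated by SIGABRT rather than exiting on its own.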
On Mon, 4 Sep 2006, Henrik Stoerner wrote:
> On Mon, Sep 04, 2006 at 09:16:34AM +0100, Iain M Conochie wrote:
>> I am having an issue with hobbit 4.2.0 built on a RedHat 9.0 system. Basically, after running for about 15 hours or so, the bbtest-net command hangs, and all network tests turn purple.
> That's obviously bad. I suppose you killed off the bbtest-net process to get things running again. If it happens again, could you kill it with kill -6 <bbtest-net PID>? This causes it to dump a core-file in the ~hobbit/server/tmp/ directory which should help me track down where it is hanging.
OK, I will try that. What I was doing was restarting the whole hobbit server. The next time we have this issue I will try the kill -6 command to get the core dump.
> Did you notice if the process was using a lot of cpu time, or if it was just completely stalled? Was there an "fping" or "hobbitping" process hanging around also?
The bbtest-net program stalled and the process was hanging around. CPU usage was normal. I am using the hobbitping command.
Cheers
Iain
> Regards, Henrik
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
This sounds somewhat like the problem that I reported a few days ago, except that in my case I could not get access to the machine to check any processes; it was hung tight. The hypervisor does indicate that it was using a lot of CPU time, which would suggest a loop.
-- Rich Smrcina VM Assist, Inc. Phone: 414-491-6001 Ans Service: 360-715-2467 rich.smrcina at vmassist.com
Catch the WAVV! http://www.wavv.org WAVV 2007 - Green Bay, WI - May 18-22, 2007
On Mon, 2006-09-04 at 11:11 +0200, Henrik Stoerner wrote:
> On Mon, Sep 04, 2006 at 09:16:34AM +0100, Iain M Conochie wrote:
>> I am having an issue with hobbit 4.2.0 built on a RedHat 9.0 system. Basically, after running for about 15 hours or so, the bbtest-net command hangs, and all network tests turn purple.
> That's obviously bad. I suppose you killed off the bbtest-net process to get things running again. If it happens again, could you kill it with kill -6 <bbtest-net PID>? This causes it to dump a core-file in the ~hobbit/server/tmp/ directory which should help me track down where it is hanging.
OK - got that. How can I analyse this to get the information you need?
> Did you notice if the process was using a lot of cpu time, or if it was just completely stalled? Was there an "fping" or "hobbitping" process hanging around also?
Nope - none at all.
Cheers
Iain
> Regards, Henrik
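One common way to get useful information out of such a core file is a gdb backtrace. The sketch below is an assumption-laden illustration rather than anything confirmed in this thread: the binary path and the core file name are hypothetical (the core name depends on kernel settings), it is meant to be run as the hobbit user so that $HOME is ~hobbit, and the -ex option needs a reasonably recent gdb.

```shell
#!/bin/sh
# Hypothetical paths: the binary location is an assumption, and the core
# file name varies by system (often "core" or "core.<PID>").
BIN="$HOME/server/bin/bbtest-net"
CORE="$HOME/server/tmp/core"

# Build a non-interactive gdb command line that prints a backtrace
# for every thread found in the core file.
bt_command() {
    printf 'gdb -batch -ex "thread apply all bt" %s %s\n' "$1" "$2"
}

CMD=$(bt_command "$BIN" "$CORE")
echo "$CMD"

# Run it only when gdb and both files are actually present,
# saving the backtrace to a text file that can be mailed to the list.
if command -v gdb >/dev/null 2>&1 && [ -x "$BIN" ] && [ -f "$CORE" ]; then
    gdb -batch -ex "thread apply all bt" "$BIN" "$CORE" > bbtest-net-backtrace.txt 2>&1
fi
```

The backtrace is most readable if the binary was built with debug symbols; without them the output may show only addresses.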
participants (3)
- henrik@hswn.dk
- iain@shihad.org
- rsmrcina@wi.rr.com