Could you try removing the "HEARTBEAT" line from hobbitlaunch.cfg and see if things run OK after that?
Regards, Henrik
On Wed, May 11, 2005 at 07:41:37AM +0000, Stefan Loos wrote:
Hi,
Yesterday I added some new hosts to my hobbit server, and shortly after that hobbit started having problems. Here is what hobbitlaunch.log says:
2005-05-11 09:11:18 Heartbeat lost for task hobbitd, bouncing it
2005-05-11 09:11:18 Task bbretest started with PID 4523
2005-05-11 09:11:23 Heartbeat lost for task hobbitd, killing it
2005-05-11 09:11:23 Task bbdisplay started with PID 4524
2005-05-11 09:11:23 Task hobbitd terminated by signal 9
2005-05-11 09:11:23 Task hobbitd started with PID 4525
2005-05-11 09:11:23 Loading hostnames
2005-05-11 09:11:23 Loading saved state
2005-05-11 09:11:23 Setting up network listener on 0.0.0.0:1984
2005-05-11 09:11:23 Setting up signal handlers
2005-05-11 09:11:23 Setting up hobbitd channels
2005-05-11 09:11:23 Setting up logfiles
2005-05-11 09:11:28 Task bbhistory started with PID 4527
2005-05-11 09:11:28 Task bbenadis started with PID 4528
2005-05-11 09:11:28 Task bbpage started with PID 4530
2005-05-11 09:11:28 Task larrdstatus started with PID 4532
2005-05-11 09:11:28 Task larrddata started with PID 4534
2005-05-11 09:12:18 Task bbretest started with PID 4541
2005-05-11 09:12:23 Task bbdisplay started with PID 4542
2005-05-11 09:12:43 Heartbeat lost for task hobbitd, bouncing it
2005-05-11 09:12:48 Heartbeat lost for task hobbitd, killing it
2005-05-11 09:12:48 Task hobbitd terminated by signal 9
2005-05-11 09:12:48 Task bbdisplay terminated by signal 15
So I tried to find out which component causes the problem: I disabled everything in hobbitlaunch.cfg and re-enabled the tasks one by one. I found that the errors occur every time I enable bbdisplay. The bb-display.log looks like this:
2005-05-11 09:09:48 Whoops ! bb failed to send message - timeout
2005-05-11 09:09:48 hobbitd status-board not available
2005-05-11 09:09:53 Whoops ! bb failed to send message - timeout
2005-05-11 09:10:53 Whoops ! bb failed to send message - timeout
2005-05-11 09:10:53 hobbitd status-board not available
2005-05-11 09:11:23 Could not connect to bbd at 10.207.193.41:1984 - Connection refused
2005-05-11 09:11:23 Whoops ! bb failed to send message - Connection failed
2005-05-11 09:11:23 hobbitd status-board not available
2005-05-11 09:11:23 Could not connect to bbd at 10.207.193.41:1984 - Connection refused
2005-05-11 09:11:23 Whoops ! bb failed to send message - Connection failed
I also found some core files in ~server/tmp, but I'm pretty sure they came from killing hobbit. Nevertheless, I ran gdb:
GNU gdb Red Hat Linux (6.1post-1.20040607.52rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1".
Core was generated by `hobbitd --debug --pidfile=/var/log/hobbit/hobbitd.pid --restart=/usr/local/hobb'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0 0x00df4cef in raise () from /lib/tls/libc.so.6
(gdb) bt
#0 0x00df4cef in raise () from /lib/tls/libc.so.6
#1 0x00df64f5 in abort () from /lib/tls/libc.so.6
#2 0x08054126 in sigsegv_handler (signum=11) at sig.c:57
#3 <signal handler called>
#4 0x00e46cac in mempcpy () from /lib/tls/libc.so.6
#5 0x00e3a4d2 in _IO_default_xsputn_internal () from /lib/tls/libc.so.6
#6 0x00e13527 in vfprintf () from /lib/tls/libc.so.6
#7 0x00e2f3dc in vsprintf () from /lib/tls/libc.so.6
#8 0x00e1a03d in sprintf () from /lib/tls/libc.so.6
#9 0x0804d7a4 in do_message (msg=0x9e0b3f8, origin=0x80554bb "") at hobbitd.c:1903
#10 0x0804fcb5 in main (argc=8, argv=0xbfff9084) at hobbitd.c:2944
(gdb)
Now I'm trying to find out which of the new hosts, and which test, causes the problem...
Regards,
Stefan Loos
-- Henrik Storner