I've been going over the hobbitfetch code looking for the cause of the cpu-spinning lockups that have been reported. I *think* I've found it, but this needs confirmation from someone who sees the problem in a real setup, as opposed to my testing scenario.
So I'd be interested to have my bugfix tested by someone who has this problem. If you can, grab the current snapshot from http://www.hswn.dk/beta/. Rebuild Hobbit, copy the snapshot/hobbitd/hobbitfetch utility into your ~hobbit/server/bin/ directory and restart Hobbit. Hopefully the problem should be gone.
Regards, Henrik
On Tue, 2007-07-24 at 14:36 +0200, Henrik Stoerner wrote:
> So I'd be interested to have my bugfix tested by someone who has this problem. If you can, grab the current snapshot from http://www.hswn.dk/beta/. Rebuild Hobbit, copy the snapshot/hobbitd/hobbitfetch utility into your ~hobbit/server/bin/ directory and restart Hobbit.
Ok, I've done that. We'll keep a close eye on it today to see if it goes berserk...
-- Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX Austin Energy http://www.austinenergy.com
On Tue, Jul 24, 2007 at 08:04:27AM -0500, Daniel J McDonald wrote:
> Ok, I've done that. We'll keep a close eye on it today to see if it goes berserk...
Please re-get it. I'd forgotten to update the snapshot file with the last fix I made, so there was about 10 minutes with the wrong version on the web.
The right one will have "$Id: hobbitfetch.c,v 1.18 2007/07/24 13:00:29 henrik Exp $"; near the top of the hobbitfetch.c file.
Sorry about that.
Henrik
On Tue, 2007-07-24 at 15:05 +0200, Henrik Stoerner wrote:
> Please re-get it. I'd forgotten to update the snapshot file with the last fix I made, so there was about 10 minutes with the wrong version on the web.
> The right one will have "$Id: hobbitfetch.c,v 1.18 2007/07/24 13:00:29 henrik Exp $"; near the top of the hobbitfetch.c file.
static char rcsid[] = "$Id: hobbitfetch.c,v 1.18 2007/07/24 13:00:29 henrik Exp $";
It's installed now....
-- Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX Austin Energy http://www.austinenergy.com
On Tue, 2007-07-24 at 15:05 +0200, Henrik Stoerner wrote:
> On Tue, Jul 24, 2007 at 08:04:27AM -0500, Daniel J McDonald wrote:
>> On Tue, 2007-07-24 at 14:36 +0200, Henrik Stoerner wrote:
>>> I've been going over the hobbitfetch code looking for the cause of the cpu-spinning lockups that have been reported. I *think* I've found it, but this needs confirmation from someone who sees the problem in a real setup, as opposed to my testing scenario.
>>> So I'd be interested to have my bugfix tested by someone who has this problem. If you can, grab the current snapshot from http://www.hswn.dk/beta/. Rebuild Hobbit, copy the snapshot/hobbitd/hobbitfetch utility into your ~hobbit/server/bin/ directory and restart Hobbit.
>> Ok, I've done that. We'll keep a close eye on it today to see if it goes berserk...
In the past 22 hours it has not gotten into a loop, but I still see this status on the hobbitfetch item:
- Program crashed
Fatal signal caught!
Status unchanged in 2 hours, 16 minutes
Status message received from 127.0.0.1
Client data available
If you would like any log files or debug items, I will be happy to provide them. I will be in meetings most of the day starting in about 2-1/2 hours.
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX Austin Energy http://www.austinenergy.com
On Wed, Jul 25, 2007 at 06:58:40AM -0500, McDonald, Dan wrote:
> In the past 22 hours it has not gotten into a loop,
Good, I suppose the client data is being fetched and updates correctly?
> but I still see this status on the hobbitfetch item:
> - Program crashed
> Fatal signal caught!
I'd like to know if this generates a core file in ~hobbit/server/tmp/ (it should), and if there is one then please run it through gdb to get the backtrace as described in http://www.hswn.dk/hobbit/help/known-issues.html#bugreport
Thanks, Henrik
On Wed, 2007-07-25 at 14:44 +0200, Henrik Stoerner wrote:
> On Wed, Jul 25, 2007 at 06:58:40AM -0500, McDonald, Dan wrote:
>> In the past 22 hours it has not gotten into a loop,
> Good, I suppose the client data is being fetched and updates correctly?
>> but I still see this status on the hobbitfetch item:
>> - Program crashed
>> Fatal signal caught!
> I'd like to know if this generates a core file in ~hobbit/server/tmp/ (it should), and if there is one then please run it through gdb to get the backtrace as described in http://www.hswn.dk/hobbit/help/known-issues.html#bugreport
This appears to be the appropriate core:
[mcdonalddj at ldap ~]$ sudo su hobbit -
bash-3.00$ cd /usr/lib/hobbit
bash-3.00$ cd server
bash-3.00$ gdb bin/hobbitfetch tmp/core
GNU gdb 6.3-5mdk (Mandriva Linux release 2006.0)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i586-mandriva-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1".
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0xffffe000
Core was generated by `/usr/lib/hobbit/server/bin/hobbitfetch --server=127.0.0.1 --no-daemon --pidfile'.
Program terminated with signal 6, Aborted.
warning: svr4_current_sos: Can't read pathname for load map: Input/output error
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0 0xffffe410 in __kernel_vsyscall ()
(gdb) bt
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7de9051 in raise () from /lib/tls/libc.so.6
#2 0xb7deaa3b in abort () from /lib/tls/libc.so.6
#3 0x0804fbff in sigsegv_handler (signum=11) at sig.c:57
#4 <signal handler called>
#5 main (argc=4, argv=0xbff11094) at hobbitfetch.c:746
(gdb) quit
bash-3.00$ ls -l tmp
lrwxrwxrwx 1 root root 19 Nov 27 2006 tmp -> /var/lib/hobbit/tmp
bash-3.00$ ls -l tmp/
total 6814
-rw-r--r-- 1 hobbit hobbit 92921 Jul 25 07:56 alert.chk
-rw-r--r-- 1 hobbit hobbit 1785 Jul 25 07:56 alert.chk.sub
-rw------- 1 hobbit hobbit 700416 Jul 25 07:45 core
-rw------- 1 hobbit hobbit 602112 Dec 11 2006 core.11005
-rw------- 1 hobbit hobbit 630784 Dec 8 2006 core.12024
-rw------- 1 hobbit hobbit 618496 Dec 12 2006 core.19573
-rw------- 1 hobbit hobbit 585728 Dec 27 2006 core.20895
-rw------- 1 hobbit hobbit 454656 Jan 23 2007 core.23799
-rw------- 1 hobbit hobbit 569344 May 22 01:40 core.24289
-rw------- 1 hobbit hobbit 860160 Feb 7 09:20 core.26081
-rw------- 1 hobbit hobbit 430080 Jun 26 21:55 core.30855
-rw------- 1 hobbit hobbit 577536 Jun 13 19:05 core.6773
-rw----r-- 1 hobbit hobbit 450560 Dec 4 2006 core.7.21
-rw------- 1 hobbit hobbit 573440 Feb 16 02:20 core.7213
-rw------- 1 hobbit hobbit 1361898 Jul 25 07:58 hobbitd.chk
-rw------- 1 hobbit hobbit 176 Jul 25 07:59 ping..status
lrwxrwxrwx 1 root root 30 Nov 27 2006 tmp -> ../../../../var/lib/hobbit/tmp
bash-3.00$
-- Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX Austin Energy http://www.austinenergy.com
participants (2)
- dan.mcdonald@austinenergy.com
- henrik@hswn.dk