[hobbit] Bug latest snapshot, hobbitd_client
[hobbit@hobbit2 server]$ file tmp/core.21567
tmp/core.21567: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'hobbitd_client'
[hobbit@hobbit2 server]$ file tmp/core.21600
tmp/core.21600: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'hobbitd_client'
[hobbit@hobbit2 server]$ ls -al tmp/core.21567 tmp/core.21600
-rw------- 1 hobbit hobbit 5210112 Mar 19 00:23 tmp/core.21567
-rw------- 1 hobbit hobbit 5210112 Mar 19 00:23 tmp/core.21600
Dumps core in pairs every 1-5 minutes or so:
-rw------- 1 hobbit hobbit 5210112 Mar 19 00:23 tmp/core.21600
-rw------- 1 hobbit hobbit 5210112 Mar 19 00:23 tmp/core.21567
-rw------- 1 hobbit hobbit 4038656 Mar 19 00:27 tmp/core.21841
-rw------- 1 hobbit hobbit 4038656 Mar 19 00:27 tmp/core.21602
-rw------- 1 hobbit hobbit 49213440 Mar 19 00:31 tmp/core.21520
-rw------- 1 hobbit hobbit 5505024 Mar 19 00:32 tmp/core.22115
-rw------- 1 hobbit hobbit 5505024 Mar 19 00:32 tmp/core.22109
-rw------- 1 hobbit hobbit 4227072 Mar 19 00:36 tmp/core.22378
-rw------- 1 hobbit hobbit 4227072 Mar 19 00:36 tmp/core.22169
-rw------- 1 hobbit hobbit 3776512 Mar 19 00:38 tmp/core.22439
-rw------- 1 hobbit hobbit 3776512 Mar 19 00:38 tmp/core.22379
-rw------- 1 hobbit hobbit 5881856 Mar 19 00:43 tmp/core.22706
-rw------- 1 hobbit hobbit 5881856 Mar 19 00:43 tmp/core.22441
-rw------- 1 hobbit hobbit 3584000 Mar 19 00:44 tmp/core.22715
-rw------- 1 hobbit hobbit 3584000 Mar 19 00:44 tmp/core.22707
-rw------- 1 hobbit hobbit 4902912 Mar 19 00:48 tmp/core.22968
-rw------- 1 hobbit hobbit 4902912 Mar 19 00:48 tmp/core.22716
-rw------- 1 hobbit hobbit 5398528 Mar 19 00:51 tmp/core.23165
-rw------- 1 hobbit hobbit 5398528 Mar 19 00:51 tmp/core.22969
-rw------- 1 hobbit hobbit 4841472 Mar 19 00:53 tmp/core.23233
-rw------- 1 hobbit hobbit 4841472 Mar 19 00:53 tmp/core.23166
-rw------- 1 hobbit hobbit 3964928 Mar 19 00:58 tmp/core.23493
-rw------- 1 hobbit hobbit 3964928 Mar 19 00:58 tmp/core.23234
-rw------- 1 hobbit hobbit 3817472 Mar 19 01:03 tmp/core.23836
-rw------- 1 hobbit hobbit 3817472 Mar 19 01:03 tmp/core.23494
-rw------- 1 hobbit hobbit 54190080 Mar 19 01:07 tmp/core.22100
-rw------- 1 hobbit hobbit 5402624 Mar 19 01:12 tmp/core.24304
-rw------- 1 hobbit hobbit 5402624 Mar 19 01:12 tmp/core.24095
-rw------- 1 hobbit hobbit 4055040 Mar 19 01:13 tmp/core.24367
-rw------- 1 hobbit hobbit 4055040 Mar 19 01:13 tmp/core.24305
I am not sure I should post my gdb backtrace here, but it has been dumping core for at least a week, perhaps longer, with different daily snapshots. We use the same configs on a much older snapshot with no problems. I am not sure of the date of the stable snapshot; the version is listed as Hobbit Monitor 4.3.0-0.20071026. We are running Red Hat Enterprise 4.0. After a while the core files fill up the file system.
As a side note, I thought I reported this a month or so ago, but the files column is mangled for some hosts: it shows duplicate file entries, like /etc/hosts listed twice or even 3 times on the web page.
Presumably this means hobbitd itself is crashing or stopping?
[hobbit@hobbit2 logs]$ cat hobbitlaunch.log
2008-03-19 00:22:14 hobbitlaunch starting
2008-03-19 00:22:14 Loading tasklist configuration from /home/hobbit/server/etc/hobbitlaunch.cfg
2008-03-19 00:22:14 Loading hostnames
2008-03-19 00:22:14 Loading saved state
2008-03-19 00:22:15 Setting up network listener on 0.0.0.0:1984
2008-03-19 00:22:15 Setting up local listener
2008-03-19 00:22:15 Setting up signal handlers
2008-03-19 00:22:15 Setting up hobbitd channels
2008-03-19 00:22:15 Setting up logfiles
2008-03-19 00:31:46 Task hobbitd terminated by signal 6
2008-03-19 00:31:46 Loading hostnames
2008-03-19 00:31:46 Loading saved state
2008-03-19 00:31:47 Setting up network listener on 0.0.0.0:1984
2008-03-19 00:31:47 Setting up local listener
2008-03-19 00:31:47 Setting up signal handlers
2008-03-19 00:31:47 Setting up hobbitd channels
2008-03-19 00:31:47 Setting up logfiles
2008-03-19 01:07:56 Task hobbitd terminated by signal 6
2008-03-19 01:07:56 Task bbnet terminated by signal 15
2008-03-19 01:07:56 Loading hostnames
2008-03-19 01:07:57 Loading saved state
2008-03-19 01:07:57 Setting up network listener on 0.0.0.0:1984
2008-03-19 01:07:57 Setting up local listener
2008-03-19 01:07:57 Setting up signal handlers
2008-03-19 01:07:57 Setting up hobbitd channels
2008-03-19 01:07:57 Setting up logfiles
2008-03-19 01:17:55 Task hobbitd terminated by signal 6
2008-03-19 01:17:55 Task bbnet terminated by signal 15
2008-03-19 01:17:55 Loading hostnames
2008-03-19 01:17:55 Loading saved state
2008-03-19 01:17:56 Setting up network listener on 0.0.0.0:1984
2008-03-19 01:17:56 Setting up local listener
2008-03-19 01:17:56 Setting up signal handlers
2008-03-19 01:17:56 Setting up hobbitd channels
2008-03-19 01:17:56 Setting up logfiles
2008-03-19 01:21:32 Task hobbitd terminated by signal 6
2008-03-19 01:21:33 Loading hostnames
2008-03-19 01:21:33 Loading saved state
2008-03-19 01:21:33 Setting up network listener on 0.0.0.0:1984
2008-03-19 01:21:33 Setting up local listener
2008-03-19 01:21:33 Setting up signal handlers
2008-03-19 01:21:33 Setting up hobbitd channels
2008-03-19 01:21:33 Setting up logfiles
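The restart pattern in a log like this can be pulled out with a small filter. The following is only a sketch: the function name crash_summary is made up for illustration, and it assumes the line layout seen in the log above ("DATE TIME Task NAME terminated by signal N"). On Linux, signal 6 is SIGABRT and signal 15 is SIGTERM.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hobbit): read hobbitlaunch.log on stdin and
# print "date time task signal" for every task that was ended by a signal.
crash_summary() {
    awk '/ terminated by signal / { print $1, $2, $4, $NF }'
}

# Example use, from the directory that holds the log:
#   crash_summary < hobbitlaunch.log
```

Comparing the times it prints with the timestamps of the core files is one way to tell which cores belong to hobbitd rather than to hobbitd_client.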
Perhaps this helps:
[hobbit@hobbit2 logs]$ cat clientdata.log
2008-03-19 00:22:20 Peer not up, flushing message queue
2008-03-19 00:23:52 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 00:23:52 Peer not up, flushing message queue
2008-03-19 00:27:20 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 00:27:22 Peer not up, flushing message queue
2008-03-19 00:31:53 Peer not up, flushing message queue
2008-03-19 00:32:23 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 00:32:23 Peer not up, flushing message queue
2008-03-19 00:32:25 Peer not up, flushing message queue
2008-03-19 00:32:26 Peer not up, flushing message queue
2008-03-19 00:32:29 Peer not up, flushing message queue
2008-03-19 00:32:32 Peer not up, flushing message queue
2008-03-19 00:32:34 Peer not up, flushing message queue
2008-03-19 00:32:37 Peer not up, flushing message queue
2008-03-19 00:32:38 Peer not up, flushing message queue
2008-03-19 00:32:42 Peer not up, flushing message queue
2008-03-19 00:32:44 Peer not up, flushing message queue
2008-03-19 00:32:45 Peer not up, flushing message queue
2008-03-19 00:32:46 Peer not up, flushing message queue
2008-03-19 00:32:48 Peer not up, flushing message queue
2008-03-19 00:32:50 Peer not up, flushing message queue
2008-03-19 00:36:42 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 00:36:44 Peer not up, flushing message queue
2008-03-19 00:38:11 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 00:38:12 Peer not up, flushing message queue
2008-03-19 00:43:53 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 00:43:54 Peer not up, flushing message queue
2008-03-19 00:44:42 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 00:44:44 Peer not up, flushing message queue
2008-03-19 00:44:44 Peer not up, flushing message queue
2008-03-19 00:44:44 Peer not up, flushing message queue
2008-03-19 00:44:45 Peer not up, flushing message queue
2008-03-19 00:44:45 Peer not up, flushing message queue
2008-03-19 00:44:46 Peer not up, flushing message queue
2008-03-19 00:44:49 Peer not up, flushing message queue
2008-03-19 00:44:50 Peer not up, flushing message queue
2008-03-19 00:44:52 Peer not up, flushing message queue
2008-03-19 00:44:53 Peer not up, flushing message queue
2008-03-19 00:48:50 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 00:48:51 Peer not up, flushing message queue
2008-03-19 00:51:55 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 00:51:56 Peer not up, flushing message queue
2008-03-19 00:53:54 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 00:53:56 Peer not up, flushing message queue
2008-03-19 00:58:54 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 00:58:55 Peer not up, flushing message queue
2008-03-19 01:03:56 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 01:03:56 Peer not up, flushing message queue
2008-03-19 01:08:02 Peer not up, flushing message queue
2008-03-19 01:12:24 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 01:12:24 Peer not up, flushing message queue
2008-03-19 01:13:56 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 01:13:57 Peer not up, flushing message queue
2008-03-19 01:17:17 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 01:17:19 Peer not up, flushing message queue
2008-03-19 01:18:03 Peer not up, flushing message queue
2008-03-19 01:18:58 Peer at 0.0.0.0:0 failed: Broken pipe
2008-03-19 01:18:59 Peer not up, flushing message queue
2008-03-19 01:19:00 Peer not up, flushing message queue
2008-03-19 01:19:01 Peer not up, flushing message queue
2008-03-19 01:19:02 Peer not up, flushing message queue
2008-03-19 01:19:02 Peer not up, flushing message queue
2008-03-19 01:21:38 Peer not up, flushing message queue
David
Hi,
Gore, David W (David) wrote:
> As a side note, I thought I reported this a month or so ago, but the files column is mangled for some hosts: it shows duplicate file entries, like /etc/hosts listed twice or even 3 times on the web page.
I had the same problem with one of the February snapshots and had to return to a snapshot from November. I reported this to the list but didn't receive an answer.
-- Regards,
Dirk Kastens
Universitaet Osnabrueck, Rechenzentrum (Computer Center)
Albrechtstr. 28, 49069 Osnabrueck, Germany
Tel.: +49-541-969-2347, FAX: -2470
Seems several of your Hobbit programs are dumping core. These three cores are probably not from the same program, since they are so different in size:
-rw------- 1 hobbit hobbit 5210112 Mar 19 00:23 tmp/core.21600
-rw------- 1 hobbit hobbit 4038656 Mar 19 00:27 tmp/core.21602
-rw------- 1 hobbit hobbit 49213440 Mar 19 00:31 tmp/core.21520
I'd suspect that .21520 core was from hobbitd, the timestamp matches the log-file entry indicating that hobbitd has crashed.
Before doing anything else, please re-build Hobbit - and run a "make clean" before doing the "make; make install". If the problem persists after that, I would like to see the gdb backtrace from the different programs that crash.
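Collecting one backtrace per core could be scripted roughly as below. This is a sketch under assumptions, not a procedure from the Hobbit documentation: the helper names are invented, BINDIR is guessed from the ~hobbit/server/bin path used in this thread, and the program name is parsed from `file` output in the format shown earlier (ending in from 'PROGRAM'). It relies only on gdb's batch mode and a command file (-batch, -x).

```shell
#!/bin/sh
# Hypothetical helpers, not part of Hobbit.

# Extract the program name from one line of `file` output for a core dump,
# e.g. "... SVR4-style, from 'hobbitd'" gives "hobbitd".
core_program() {
    printf '%s\n' "$1" | sed -n "s/.*from '\([^']*\)'.*/\1/p"
}

# Print a backtrace for one core file. BINDIR is an assumption based on the
# ~hobbit/server/bin path in this thread; override it if yours differs.
core_backtrace() {
    core=$1
    bindir=${BINDIR:-$HOME/server/bin}
    prog=$(core_program "$(file "$core")")
    if [ -z "$prog" ]; then
        echo "cannot tell which program produced $core" >&2
        return 1
    fi
    cmds=$(mktemp /tmp/gdbcmds.XXXXXX) || return 1
    echo "bt" > "$cmds"          # gdb command file holding a single backtrace
    gdb -batch -x "$cmds" "$bindir/$prog" "$core"
    status=$?
    rm -f "$cmds"
    return $status
}

# Example use, run as the hobbit user from the server directory:
#   for c in tmp/core.*; do core_backtrace "$c"; done
```

The backtraces are only readable if the binaries were built with debug symbols and match the ones that produced the cores.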
> I am not sure I should post my gdb backtrace here, but it has been dumping core for at least a week, perhaps longer, with different daily snapshots. We use the same configs on a much older snapshot with no problems. I am not sure of the date of the stable snapshot
The best way is to extract the version-numbers from each of the source files like this:
$ strings ~hobbit/server/bin/hobbitd|grep \$Id:
$Id: hobbitd.c,v 1.279 2008/03/02 12:49:40 henrik Exp henrik $
$Id: hobbitd_buffer.c,v 1.10 2008/01/03 10:08:13 henrik Exp $
...lots more lines...
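To compare two installations, the $Id: lines can be reduced to file name, revision and date. A minimal sketch, assuming the keyword layout shown in the output above; the function name id_versions is invented for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of Hobbit): read `strings` output on stdin and
# print "source-file revision date" for each RCS $Id: keyword found.
id_versions() {
    sed -n 's/.*\$Id: \([^,]*\),v \([^ ]*\) \([^ ]*\) .*/\1 \2 \3/p'
}

# Example use:
#   strings ~hobbit/server/bin/hobbitd | id_versions | sort > new.txt
# Run the same on the older installation and diff the two lists.
```

If the filter prints nothing, the binary contains no $Id: keywords and the build date has to be established some other way.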
Regards, Henrik
-----Original Message-----
From: Henrik Stoerner [mailto:henrik at hswn.dk]
Sent: Wednesday, March 19, 2008 08:08
To: hobbit at hswn.dk
Subject: Re: [hobbit] Bug latest snapshot, hobbitd_client
> Seems several of your Hobbit programs are dumping core. These three cores are probably not from the same program, since they are so different in size:
> -rw------- 1 hobbit hobbit 5210112 Mar 19 00:23 tmp/core.21600
> -rw------- 1 hobbit hobbit 4038656 Mar 19 00:27 tmp/core.21602
> -rw------- 1 hobbit hobbit 49213440 Mar 19 00:31 tmp/core.21520
> I'd suspect that .21520 core was from hobbitd, the timestamp matches the log-file entry indicating that hobbitd has crashed.
> Before doing anything else, please re-build Hobbit - and run a "make clean" before doing the "make; make install". If the problem persists after that, I would like to see the gdb backtrace from the different programs that crash.
Each build from the snapshot is from scratch, so it will be clean from the start.
>> I am not sure I should post my gdb backtrace here, but it has been dumping core for at least a week, perhaps longer, with different daily snapshots. We use the same configs on a much older snapshot with no problems. I am not sure of the date of the stable snapshot
> The best way is to extract the version-numbers from each of the source files like this:
> $ strings ~hobbit/server/bin/hobbitd|grep \$Id:
> $Id: hobbitd.c,v 1.279 2008/03/02 12:49:40 henrik Exp henrik $
> $Id: hobbitd_buffer.c,v 1.10 2008/01/03 10:08:13 henrik Exp $
> ...lots more lines...
[hobbit@hobbit2 tmp]$ file core.21600 core.21602 core.21520
core.21600: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'hobbitd_client'
core.21602: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'hobbitd_client'
core.21520: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'hobbitd'
Sorry, the strings output contains no $Id: lines, so the grep returns nothing. I sent you the backtrace in a separate e-mail.
Just as an FYI, I am still disappointed that tooltips are not working properly. It really isn't practical to scroll the window horizontally because descriptions or comments make it really wide. When you have the column headings in view, you cannot tell which host the alarm is for, because the host names have scrolled off the left side of the window.
> Regards, Henrik