Banging on it and no core dumps (yet). The hobbitd client channel process appears to be respawned by the parent channel process after the signal 6 (SIGABRT) faults. I could probably get a core dump faster if I disable the signal handlers again and pound the channel some more.
I have narrowed down the fault in the client message to the first line of the df output (how strange). Below is the client-side code, which reports only the initial client line, the [df] block header, and the df output. If the HP-UX df header line is removed before the message is sent, there is no error on the channel.
I am using the hobbit bb binary for client-side transmission. For the tests below, I restored the original hobbitd_channel binary with the standard signal handlers. I have been playing around with the content of the df header line to try to figure out what is so special about it. Very confusing; I guess I will go back in, disable the signal handlers, and try to force out a core dump.
#!/bin/sh
#----------------------------------------------------------------------------#
# HP-UX client for Hobbit
#
# Copyright (C) 2005 Henrik Storner <henrik at hswn.dk>
#
# This program is released under the GNU General Public License (GPL),
# version 2. See the file "COPYING" for details.
#----------------------------------------------------------------------------#
#
# $Id: hobbitclient-hpux.sh,v 1.4 2005/07/24 11:32:51 henrik Exp $
MACHINE=`/usr/bin/hostname`
# no good - uname reports HP-UX, hobbit code uses hpux, need to ditch the
# minus sign
#BBOSTYPE="`/usr/bin/uname -s | /usr/bin/tr '[A-Z]' '[a-z]'`"
BBOSTYPE="hpux"
{
echo "client $MACHINE.$BBOSTYPE"
##echo "[date]"
##/usr/bin/date
echo "[df]"
# the default does not work - header and metrics
#/usr/bin/df -Pk
# the next line works - report filesystem metrics without a header line
#/usr/bin/df -Pk | sed -ne '2,$p'
# the next line causes error on channel - just printing header line
/usr/bin/df -Pk | sed -ne '1p'
##echo "[memory]"
##echo "Total:`./bb-hp-memsz -p`"
##echo "Free:`./bb-hp-memsz -f`"
} | ./bb --debug aaa.bbb.ccc.ddd "@"
exit
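One way to narrow down what is special about the df header line would be to send it with one word removed at a time and see which variants still upset the channel. The following is only a sketch: header_variants is a hypothetical helper (not part of Hobbit), and it assumes the header text as it appears in this thread, with the column spacing collapsed to single spaces. If the crash depends on the exact spacing, this will not reveal it.

```shell
#!/bin/sh
# Sketch only: header_variants is a hypothetical helper, not part of Hobbit.
# It prints the given header line once per word, each time with that word
# left out, so each variant can be tried as the body of the [df] section.
header_variants() {
    set -- $1                      # intentional word splitting of the header
    n=$#
    i=1
    while [ "$i" -le "$n" ]; do
        j=1
        line=
        for w in "$@"; do
            # keep every word except the i-th one
            [ "$j" -ne "$i" ] && line="${line:+$line }$w"
            j=$((j + 1))
        done
        printf '%s\n' "$line"
        i=$((i + 1))
    done
}

# Header as quoted in this thread (original column spacing is an unknown).
HEADER='Filesystem 1024-blocks Used Available Capacity Mounted on'
header_variants "$HEADER"
```

Each printed line could then be placed after the "[df]" line and piped into ./bb in the same way as the script above, one variant per run.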
The message body that results in a channel error is shown below (snapshot grabbed on the client; the server's client/tmp/msg.txt keeps catching the content of the server's local check):
client csdaj401.hpux
[df]
Filesystem 1024-blocks Used Available Capacity Mounted on
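For anyone without an HP-UX box at hand, the failing message could probably be rebuilt by hand from the text above. This is a minimal sketch, not a confirmed reproduction: build_msg is a hypothetical helper, and the header string is copied from this thread, so its original column spacing is an assumption.

```shell
#!/bin/sh
# Sketch only: build_msg is a hypothetical helper, not part of Hobbit.
# It emits the three-line client message: client line, [df] section
# marker, and a single df header line.
build_msg() {
    # $1 = client host name, $2 = OS type, $3 = df header line
    printf 'client %s.%s\n[df]\n%s\n' "$1" "$2" "$3"
}

# Header as quoted in this thread (original column spacing is an unknown).
HEADER='Filesystem 1024-blocks Used Available Capacity Mounted on'
build_msg csdaj401 hpux "$HEADER"

# To send it, pipe it into the bb binary as the client script does:
# build_msg csdaj401 hpux "$HEADER" | ./bb --debug aaa.bbb.ccc.ddd "@"
```

Keeping the send step commented out makes it possible to inspect the exact bytes first (for example with od -c) before pointing it at a server.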
-----Original Message-----
From: Henrik Storner [mailto:henrik at hswn.dk]
Sent: Friday, September 30, 2005 9:07 AM
To: hobbit at hswn.dk
Subject: Re: [hobbit] Channel processing problem with 4.11
In <FB13116A8C464943B4A5436A616C95F80530AB1F at rocexu01> "Deiss, Mark" <Mark.Deiss at acs-inc.com> writes:
> First environment using Mandrake 9.0 server and Fedora Core 2 client - client transmissions received and processed properly on server.
> Second environment using Fedora Core 3 server and HP-UX 11i client, have problems with hobbitd_channel on server.
> Initial error was: Worker process died with exit code 6, terminating
This means that the hobbitd_client program has crashed. There ought to be a core-dump in the ~hobbit/server/tmp/ directory; if you could run it through the procedure described in http://www.hswn.dk/hobbit/help/known-issues.html#bugreport it would make it simpler to find.
> I reduced down the size of the HP-UX client message - no longer sending ps/top/vmstat output; still blowing up.
Please send me a copy of the client message. You'll find it in ~hobbit/client/tmp/msg.txt on the HP-UX server. I've had one other report of HP-UX clients causing the hobbitd_client module to crash, so there is probably something special about the client messages from HP-UX based systems that trigger this.
> Also, has anyone configured hobbit server to work with a number of HPUX clients? Looking to handle around 50 HPUX servers and 50 Windows servers into a single Hobbit server.
Shouldn't be a problem. I have about 1500 clients reporting into one Hobbit server (HP-UX, Solaris, Windows, AIX).
Henrik
On Fri, Sep 30, 2005 at 10:47:39AM -0500, Deiss, Mark wrote:
> I have narrowed down the fault in the client message to the first line of the df output (how strange). Below is the client-side code, which reports only the initial client line, the [df] block header, and the df output. If the HP-UX df header line is removed before the message is sent, there is no error on the channel.
I just had another report about hobbitd_client crashing, also with the "df" reports. In that case I was able to track it down, and the attached patch should fix it (the patch is on top of the current snapshot). Could you check whether this fixes it? It might be the same problem.
Regards, Henrik