Channel processing problem with 4.11
I am testing in two separate Hobbit environments, both of which are able to process BB-PE traffic. Both use the Hobbit 4.11 server release and a simple client test file to transmit status to the servers (modified hobbitclient-linux.sh and hobbitclient-hpux.sh).
First environment using Mandrake 9.0 server and Fedora Core 2 client - client transmissions received and processed properly on server.
Second environment using Fedora Core 3 server and HP-UX 11i client, have problems with hobbitd_channel on server.
Initial error was: Worker process died with exit code 6, terminating
I reduced the size of the HP-UX client message (it no longer sends ps/top/vmstat output); it is still blowing up.
To drill into the problem, I commented out some of the signal handler lines in hobbitd_channel.c and set some of the signals to be ignored instead:

    /* sigaction(SIGPIPE, &sa, NULL); */
    signal(SIGPIPE, SIG_IGN);
    /* sigaction(SIGINT, &sa, NULL); */
    signal(SIGINT, SIG_IGN);
    sigaction(SIGTERM, &sa, NULL);
    /* sigaction(SIGCHLD, &sa, NULL); */
    signal(SIGCHLD, SIG_IGN);
After rerunning, the error message is now: Our child has failed and will not talk to us....
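One possible reading of the changed message: with SIGPIPE ignored, a write to a pipe whose reader has gone away no longer kills the writer but fails with EPIPE, and with SIGCHLD ignored the parent is no longer notified when the worker dies. The sketch below only demonstrates the SIGPIPE half of that at the POSIX level. The function name is hypothetical and the code is not taken from hobbitd_channel.c; whether hobbitd_channel prints this message on EPIPE is an assumption.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from hobbitd_channel.c): write one byte into a
 * pipe whose read end is already closed, with SIGPIPE ignored, and return
 * the errno reported by write(). Returns -1 if the setup itself fails.
 */
int errno_after_write_to_dead_reader(void)
{
    int fds[2];
    void (*old_handler)(int);
    ssize_t n;
    int saved_errno;

    if (pipe(fds) != 0) return -1;

    /* Same effect as the signal(SIGPIPE, SIG_IGN) change in the post. */
    old_handler = signal(SIGPIPE, SIG_IGN);
    if (old_handler == SIG_ERR) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    close(fds[0]);               /* simulate the worker going away */
    n = write(fds[1], "x", 1);   /* no reader left: fails with EPIPE */
    saved_errno = (n == -1) ? errno : 0;

    close(fds[1]);
    signal(SIGPIPE, old_handler); /* restore the previous disposition */
    return saved_errno;
}
```

If this reading is right, the new message is a symptom of the same worker crash rather than a second problem.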
Guessing that there may be a blocking problem, even though only the server and the one client are using the channel, I increased the sleep value:
    else if (errno == EAGAIN) {
        /*
         * Write would block ... stop for now.
         * Wait just a little while before continuing, so we
         * dont do busy-waiting when the worker child is not
         * accepting more data.
         */
        canwrite = 0;
        /* usleep(2500); */
        usleep(25000);
    }
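For comparison, a common alternative to a fixed usleep() after EAGAIN is to wait with poll() until the descriptor is writable again. The following is a minimal sketch under that assumption, not the approach hobbitd_channel takes; set_nonblocking and write_all_nonblocking are hypothetical names introduced here for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: write the whole buffer to a non-blocking descriptor.
 * On EAGAIN, wait with poll() until fd is writable, at most timeout_ms
 * per wait. Returns 0 on success, -1 on error or timeout (errno is set to
 * ETIMEDOUT on timeout).
 */
int write_all_nonblocking(int fd, const char *buf, size_t len, int timeout_ms)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n == -1 && errno == EINTR) continue;   /* interrupted: retry */
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            struct pollfd pfd;
            int rc;

            pfd.fd = fd;
            pfd.events = POLLOUT;
            pfd.revents = 0;
            rc = poll(&pfd, 1, timeout_ms);
            if (rc == 0) { errno = ETIMEDOUT; return -1; }
            if (rc == -1 && errno != EINTR) return -1;
            continue;   /* writable again, or an error the next write() reports */
        }
        return -1;      /* any other failure, for example EPIPE */
    }
    return 0;
}
```

A longer wait only helps when the reader is slow. If the reader has exited, the next write() fails with EPIPE (or raises SIGPIPE unless that signal is ignored or handled), however long the writer waits.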
Same error. Any ideas what to look into next?
Also, has anyone configured a Hobbit server to work with a number of HP-UX clients? I am looking to have around 50 HP-UX servers and 50 Windows servers report to a single Hobbit server.
In <FB13116A8C464943B4A5436A616C95F80530AB1F at rocexu01> "Deiss, Mark" <Mark.Deiss at acs-inc.com> writes:
> First environment using Mandrake 9.0 server and Fedora Core 2 client - client transmissions received and processed properly on server.
> Second environment using Fedora Core 3 server and HP-UX 11i client, have problems with hobbitd_channel on server.
> Initial error was: Worker process died with exit code 6, terminating
This means that the hobbitd_client program has crashed. There ought to be a core-dump in the ~hobbit/server/tmp/ directory; if you could run it through the procedure described in http://www.hswn.dk/hobbit/help/known-issues.html#bugreport it would make it simpler to find.
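As background on what a value like 6 can mean: on Linux, signal number 6 is SIGABRT, which is what a process receives when it calls abort(). Whether the "exit code 6" in the log is a signal number or a plain exit status is an assumption here, since the reporting code is not shown in this thread. The sketch below shows how a parent can tell the two cases apart from a wait status; both function names are hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that calls abort(), wait for it and
 * store its raw wait status in *status. Returns 0 on success, -1 on error.
 */
int wait_status_of_aborting_child(int *status)
{
    pid_t pid = fork();

    if (pid == -1) return -1;
    if (pid == 0) {
        abort();        /* child: terminated by SIGABRT */
    }
    while (waitpid(pid, status, 0) == -1) {
        if (errno != EINTR) return -1;
    }
    return 0;
}

/* Hypothetical helper: describe a wait status in words. */
void describe_wait_status(int status, char *out, size_t outlen)
{
    if (WIFEXITED(status))
        snprintf(out, outlen, "exited with code %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(out, outlen, "killed by signal %d", WTERMSIG(status));
    else
        snprintf(out, outlen, "unknown status 0x%x", (unsigned)status);
}
```

A worker that is killed by a signal, rather than exiting on its own, normally leaves a core dump when core dumps are enabled, which is why checking for one is a sensible next step.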
> I reduced the size of the HP-UX client message (it no longer sends ps/top/vmstat output); it is still blowing up.
Please send me a copy of the client message. You'll find it in ~hobbit/client/tmp/msg.txt on the HP-UX server. I've had one other report of HP-UX clients causing the hobbitd_client module to crash, so there is probably something special about the client messages from HP-UX based systems that triggers this.
> Also, has anyone configured a Hobbit server to work with a number of HP-UX clients? I am looking to have around 50 HP-UX servers and 50 Windows servers report to a single Hobbit server.
Shouldn't be a problem. I have about 1500 clients reporting into one Hobbit server (HP-UX, Solaris, Windows, AIX).
Henrik
Participants (2): henrik@hswn.dk, Mark.Deiss@acs-inc.com