Worker process terminating
Everything was fine for a month or two until I got this error in /var/log/hobbit/clientdata.log (without any change in the Hobbit configuration):
2006-10-02 21:35:00. Worker process died with exit code 134, terminating
and I am getting a purple message for all Hobbit client messages from that server.
Using hobbit 4.2
gdb bin/hobbitd_client tmp/core.2790
GNU gdb Red Hat Linux (6.3.0.0-1.21rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/libthread_db.so.1".

Core was generated by `hobbitd_client'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib64/libpcre.so.0...done.
Loaded symbols for /lib64/libpcre.so.0
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
#0 0x0000003bd882f3b0 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x0000003bd882f3b0 in raise () from /lib64/libc.so.6
#1 0x0000003bd8830860 in abort () from /lib64/libc.so.6
#2 0x0000000000417953 in sigsegv_handler (signum=Variable "signum" is not available.
) at sig.c:57
#3 <signal handler called>
#4 0x0000003bd8872250 in strlen () from /lib64/libc.so.6
#5 0x000000000040b660 in addalertgroup (group=0x6 <Address 0x6 out of bounds>) at client_config.c:366
#6 0x000000000040643a in unix_disk_report (
    hostname=0x2aaaaab1fd8c "wisprddb1", clientclass=0x2aaaaab1fd9c "sunos",
    os=OS_SOLARIS, hinfo=0x528820,
    fromline=0x7fffff8c7e40 "\nStatus message received from 10.1.201.241\n",
    timestr=0x2aaaaab1fdc6 "Mon Oct 2 21:33:11 PDT 2006",
    freehdr=0x41adef "avail", capahdr=0x41ade6 "capacity", mnthdr=0x41ad66 "Mounted",
    dfstr=0x2aaaaab1ffe9 "Filesystem", ' ' <repeats 12 times>, "kbytes used avail capacity Mounted on\nswap", ' ' <repeats 17 times>, "20532336 1112 20531224 1% /etc/svc/volatile\nswap", ' ' <repeats 17 times>, "20589984 58760 20531224 1% /t"...)
    at hobbitd_client.c:452
#7 0x0000000000407619 in handle_solaris_client (
    hostname=0x2aaaaab1fd8c "wisprddb1", clienttype=0x2aaaaab1fd9c "sunos",
    os=OS_SOLARIS, hinfo=0x528820, sender=Variable "sender" is not available.
) at client/solaris.c:60
#8 0x000000000040a714 in main (argc=Variable "argc" is not available.
) at hobbitd_client.c:1787
(gdb)
Please help me solve this issue.
Thanks, Ram
How can the Hobbit client or an external module do NFS monitoring?
Is there a way to monitor a remote NFS server and its partitions from the Hobbit server?
A command like showmount looks pretty handy for this.
[root] showmount -e mynfsserver
export list for mynfsserver:
/export (everyone)
/opt/sge (everyone)
[root]
Also, is there a way to detect stale NFS mounts with the bb or Hobbit client? Right now I am using a cron job that runs ls to detect when an NFS partition is out of reach.
17,55 * * * * (ok=`ls /nfsserver/README.txt`; if [ $? -eq 1 ];then echo "automounter has problem;run /etc/init.d/autofs reload" | /bin/mail -s "ls /nfsserver/README.txt failed on t-myhost" test at test.com ;fi;)
T.J. Yang
On Tuesday 03 October 2006 23:24, T.J. Yang wrote:
> How can the Hobbit client or an external module do NFS monitoring?
Hobbit can monitor RPC services; see bb-hosts(5). However, this only checks that the RPC program is registered on the host.
> Is there a way to monitor a remote NFS server and its partitions from the Hobbit server?
> A command like showmount looks pretty handy for this.
> [root] showmount -e mynfsserver
> export list for mynfsserver:
> /export (everyone)
> /opt/sge (everyone)
> [root]
Sure, so you could write a trivial extension script for the server that does this.
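A minimal sketch of what such a server-side script could look like, assuming the usual Hobbit extension environment where BB and BBDISP are set. The column name nfsexports, the EXPECTED list, and the helper exports_color are hypothetical names chosen for illustration, not part of Hobbit:

```shell
#!/bin/sh
# Sketch of a Hobbit server-side extension script that checks NFS exports.
# Assumed/hypothetical values: NFSSERVER, COLUMN and EXPECTED.
NFSSERVER="${NFSSERVER:-mynfsserver}"
COLUMN="${COLUMN:-nfsexports}"
EXPECTED="${EXPECTED:-/export /opt/sge}"

# Read `showmount -e` output on stdin. Print "green" if every path given
# as an argument appears as the first field of some line, otherwise "red".
exports_color() {
    listing=$(cat)
    for path in "$@"; do
        if ! printf '%s\n' "$listing" |
            awk -v p="$path" '$1 == p { found = 1 } END { exit !found }'; then
            echo red
            return 0
        fi
    done
    echo green
}

# Only report when launched by Hobbit, which provides BB and BBDISP.
if [ -n "${BB:-}" ] && [ -n "${BBDISP:-}" ]; then
    output=$(showmount -e "$NFSSERVER" 2>&1)
    # EXPECTED is left unquoted on purpose so it splits into one path per word.
    color=$(printf '%s\n' "$output" | exports_color $EXPECTED)
    # Host names in status messages use commas instead of dots.
    host=$(echo "$NFSSERVER" | tr '.' ',')
    "$BB" "$BBDISP" "status $host.$COLUMN $color $(date)
$output"
fi
```

If showmount itself fails, its error text contains none of the expected paths, so the status goes red and the error shows up in the status body.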
> Also, is there a way to detect stale NFS mounts with the bb or Hobbit client? Right now I am using a cron job that runs ls to detect when an NFS partition is out of reach.
> 17,55 * * * * (ok=`ls /nfsserver/README.txt`; if [ $? -eq 1 ];then echo "automounter has problem;run /etc/init.d/autofs reload" | /bin/mail -s "ls /nfsserver/README.txt failed on t-myhost" test at test.com ;fi;)
This could also be an extension script on any Hobbit client (and it could then also test whether certain NFS shares are writable).
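A rough sketch of a client-side version of the ls check, assuming the Hobbit client environment sets BB, BBDISP and MACHINE. The column name nfs, the 10-second limit, and the helper nfs_color are hypothetical. Two things differ from the cron job: any non-zero exit status counts as a failure (GNU ls exits with 2, not 1, when it cannot access a file named on the command line), and the coreutils timeout command is used when present, since ls can block indefinitely on a stale hard mount:

```shell
#!/bin/sh
# Sketch of a Hobbit client-side extension script for stale NFS detection.
# Assumed/hypothetical values: NFSFILE, COLUMN and the 10-second limit.
NFSFILE="${NFSFILE:-/nfsserver/README.txt}"
COLUMN="${COLUMN:-nfs}"

# Print "green" if the file can be listed within the time limit, else "red".
nfs_color() {
    file=$1
    limit=${2:-10}
    # Build the command line; wrap ls in timeout only when it is installed.
    if command -v timeout >/dev/null 2>&1; then
        set -- timeout "$limit" ls "$file"
    else
        set -- ls "$file"
    fi
    # Any non-zero exit status (ls failure or timeout) is treated as red.
    if "$@" >/dev/null 2>&1; then
        echo green
    else
        echo red
    fi
}

# Only report when launched by the Hobbit client.
if [ -n "${BB:-}" ] && [ -n "${BBDISP:-}" ] && [ -n "${MACHINE:-}" ]; then
    color=$(nfs_color "$NFSFILE")
    "$BB" "$BBDISP" "status $MACHINE.$COLUMN $color $(date)
ls $NFSFILE: $color"
fi
```

Reporting through Hobbit rather than mailing from cron means the failure shows up as a red column for the host and goes through the normal alert rules.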
You could also use a "file" monitor on any of your NFS clients.
Regards, Buchan
-- Buchan Milne ISP Systems Specialist - Monitoring/Authentication Team Leader B.Eng,RHCE(803004789010797),LPIC-2(LPI000074592)
participants (3)
- bgmilne@staff.telkomsa.net
- grsamy@yahoo.com
- tj_yang@hotmail.com