Hi Henrik
Thanks for helping on this. I rebooted this morning. Could the memory leak still affect me in that short time?
No "failed allocation" in dmesg output. Do you want the full output?
[root@pengo log]# vmstat 4 20
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy  id wa
 0  0      0  67916  14428  92136    0    0    19     5 1025   161  1  1  98  1
 0  0      0  67852  14428  92136    0    0     0     0 1024   150  0  1  99  0
 0  0      0  67852  14436  92136    0    0     0     5 1031   157  0  1  99  0
 0  0      0  67852  14444  92136    0    0     0    12 1028   148  0  0 100  0
 0  0      0  67852  14444  92136    0    0     0     2 1025   152  0  0  99  0
 0  0      0  67852  14448  92136    0    0     0     1 1024   154  0  1  99  0
 0  0      0  67852  14448  92136    0    0     0     0 1026   145  0  1 100  0
 0  0      0  67796  14448  92136    0    0     0     0 1028   157  1  1  99  0
 0  0      0  67796  14448  92136    0    0     0     0 1023   149  0  0 100  0
 0  0      0  67796  14456  92136    0    0     0     3 1024   155  1  1  99  0
 0  0      0  67796  14456  92140    0    0     0     0 1037   157  0  1  99  0
 0  0      0  67796  14468  92140    0    0     0    17 1026   150  0  1  99  0
 0  0      0  67796  14468  92140    0    0     0     0 1022   157  0  1  99  0
 0  0      0  67796  14476  92140    0    0     0     4 1022   148  0  0 100  0
 0  0      0  67796  14476  92140    0    0     0     0 1023   157  1  1  99  0
 0  0      0  67796  14476  92140    0    0     0     0 1021   152  0  1 100  0
 0  0      0  67796  14476  92140    0    0     0     0 1019   147  1  0  99  0
 0  0      0  67796  14476  92140    0    0     0     6 1026   153  0  1  99  0
 0  0      0  67796  14492  92140    0    0     2    12 1023   151  0  0  92  8
 0  0      0  67796  14492  92140    0    0     0     0 1024   155  0  1  99  0
All these commands returned to the command prompt with the following error message.

As user hobbit:
[hobbit@pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard"
2005-07-01 15:21:45 Whoops ! bb failed to send message - timeout
[hobbit@pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard host=pengo"
2005-07-01 15:21:00 Whoops ! bb failed to send message - timeout
[hobbit@pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard host=pengo.afgonlin.com.au"
2005-07-01 15:21:30 Whoops ! bb failed to send message - timeout

As root:
[root@pengo log]# /usr/lib/hobbit/server/bin/bb 127.0.0.1 "hobbitdboard"
2005-07-01 15:17:41 Whoops ! bb failed to send message - timeout
[root@pengo log]# /usr/lib/hobbit/server/bin/bb 127.0.0.1 "hobbitdboard host=pengo"
2005-07-01 15:18:35 Whoops ! bb failed to send message - timeout
[root@pengo log]# /usr/lib/hobbit/server/bin/bb 127.0.0.1 "hobbitdboard host=pengo.afgonlin.com.au"
2005-07-01 15:18:48 Whoops ! bb failed to send message - timeout
-----Original Message-----
From: Henrik Stoerner [mailto:henrik at hswn.dk]
Sent: Friday, 1 July 2005 2:40 PM
To: hobbit at hswn.dk
Subject: Re: [hobbit] Status Unavailable
On Fri, Jul 01, 2005 at 01:51:47PM +0800, Vernon Everett wrote:
> I removed the HEARTBEAT line, and restarted. No change. :-(
> In case it helps, I am running Mandrake 10.1
OK - something similar did happen on my own system a few days ago, but it was so bizarre I wonder if it could happen on two boxes in the same week :-) The Linux kernel was leaking memory, so eventually it ran out of network buffer space and Hobbit couldn't send responses anywhere.
Could you try running "dmesg" and see if there are any "failed allocation" messages at the bottom ? This should really only list the messages you see during boot-up and any hardware detection that has happened.
Also, send me a "vmstat 4 20" output from the box.
If you run
~hobbit/server/bin/bb 127.0.0.1 "hobbitdboard"
does that hang ? What if you do
~hobbit/server/bin/bb 127.0.0.1 "hobbitdboard host=YOUR.HOBBIT.HOSTNAME"
Regards, Henrik
On Fri, Jul 01, 2005 at 03:25:30PM +0800, Vernon Everett wrote:
> Thanks for helping on this. I rebooted this morning. Could the memory leak still affect me in that short time?
Probably not. Just wanted to rule out this possibility.
> No "failed allocation" in dmesg output. Do you want the full output?
No, I don't think that is necessary.
> [root@pengo log]# vmstat 4 20
And your system is mostly idle with no swap or disk activity.
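A reading like the one above can be checked by condensing the vmstat samples into a few figures. The following is only a minimal shell sketch and not part of Hobbit: the function name summarize_vmstat is made up for illustration, and it assumes the 16-column layout shown in this thread, where si, so and id are columns 7, 8 and 15. Keep in mind that the first vmstat sample reports averages since boot rather than the last interval.

```shell
# Summarize vmstat samples read from stdin: peak swap-in/out and average idle CPU.
# Assumes the 16-column layout from this thread (si=7, so=8, id=15).
summarize_vmstat() {
    awk '
        # Data rows start with a number; this skips both header lines.
        $1 ~ /^[0-9]+$/ && NF >= 16 {
            if ($7 > max_si) max_si = $7
            if ($8 > max_so) max_so = $8
            idle += $15
            n++
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "samples=%d max_si=%d max_so=%d avg_idle=%.1f\n", n, max_si, max_so, idle / n
        }
    '
}

# Usage: vmstat 4 20 | summarize_vmstat
```

Peaks of 0 for si and so together with an average idle figure close to 100 correspond to a box that is mostly idle and not swapping.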
> [hobbit@pengo hobbit]$ server/bin/bb 127.0.0.1 "hobbitdboard"
> 2005-07-01 15:21:45 Whoops ! bb failed to send message - timeout
Could you try running "strace -p <process-ID of the hobbitd process>" for a minute or two and send me the output? Then do a "kill -6 <process-id>" and mail me the core file from ~hobbit/server/tmp/ together with the ~hobbit/server/bin/hobbitd file.
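The request above (trace hobbitd for a minute or two, then abort it to get a core file) could be scripted roughly as follows. This is a sketch under assumptions: the function name trace_then_abort and the default output path /tmp/hobbitd.strace are hypothetical, it has to run as root or as the user owning the process, and a core file is only written if the core size limit in effect for hobbitd (see "ulimit -c") allows it.

```shell
# Trace a running process for a while, then abort it so it can dump core.
# Usage: trace_then_abort PID [SECONDS] [OUTFILE]
trace_then_abort() {
    pid=$1
    secs=${2:-120}                  # "a minute or two"
    out=${3:-/tmp/hobbitd.strace}   # hypothetical output path

    # Attach strace in the background, writing the trace to $out.
    strace -p "$pid" -o "$out" &
    tracer=$!

    sleep "$secs"

    # Stop the tracer (it detaches from the target), then send
    # SIGABRT (signal 6) to the traced process.
    kill "$tracer" 2>/dev/null || true
    wait "$tracer" 2>/dev/null || true
    kill -6 "$pid"
}

# Example, assuming exactly one process is named hobbitd:
#   trace_then_abort "$(pgrep -x hobbitd)" 120
```

The trace ends up in the output file and can be mailed together with the core file and the hobbitd binary.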
Also, after this, try adding "--debug" to the hobbitd command line in hobbitlaunch.cfg. Let it run for a while and then mail me the hobbitd.log file.
This bug sounds a bit nasty, I think ....
Regards, Henrik