Every couple of days I note that hobbitfetch is hung:

yellow Load is HIGH
System clock is 0 seconds off

top - 07:23:48 up 67 days, 15:36, 0 users, load average: 3.57, 5.42, 5.18
Tasks: 94 total, 4 running, 90 sleeping, 0 stopped, 0 zombie
Cpu(s): 47.8% us, 6.5% sy, 0.3% ni, 43.0% id, 1.8% wa, 0.1% hi, 0.5% si
Mem: 2074580k total, 2008140k used, 66440k free, 46652k buffers
Swap: 1052216k total, 2576k used, 1049640k free, 1325888k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
30444 hobbit    25   0  2020 1016  616 R 96.3  0.0  3869:03 hobbitfetch
I then go kill -9 the hobbitfetch process, and it starts "working" again, but I get an error reported in the hobbitfetch status:

- Program crashed Fatal signal caught!
Nothing is written to hobbitfetch.log. But I do get correct statuses for the few hosts that I am polling.
I have 4 hosts that I poll using hobbitfetch. I think there will be just two more. But they are all internet facing, so keeping them up is a priority...
I would appreciate any suggestions as to where to begin troubleshooting this, as I really need this to work.
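
One way to catch the spin earlier than "every couple of days" is to check the process row from top automatically. The following is only a sketch under stated assumptions: the names `parse_top_row` and `looks_hung` and both thresholds are hypothetical, not part of Hobbit, and the column layout is taken from the top output quoted above.

```python
from dataclasses import dataclass


@dataclass
class TopRow:
    pid: int
    user: str
    state: str
    cpu_percent: float
    cpu_minutes: float
    command: str


def parse_top_row(line: str) -> TopRow:
    """Parse one process row laid out as in the top output above.

    Columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    TIME+ is assumed to be minutes:seconds, as in "3869:03".
    """
    fields = line.split(None, 11)
    minutes, seconds = fields[10].split(":")
    return TopRow(
        pid=int(fields[0]),
        user=fields[1],
        state=fields[7],
        cpu_percent=float(fields[8]),
        cpu_minutes=int(minutes) + float(seconds) / 60,
        command=fields[11].strip(),
    )


def looks_hung(row: TopRow, cpu_threshold: float = 90.0,
               minutes_threshold: float = 60.0) -> bool:
    """Flag a process that is runnable, busy, and has burned a lot of CPU.

    Both thresholds are arbitrary illustrative values; tune them locally.
    """
    return (
        row.state == "R"
        and row.cpu_percent >= cpu_threshold
        and row.cpu_minutes >= minutes_threshold
    )


row = parse_top_row(
    "30444 hobbit 25 0 2020 1016 616 R 96.3 0.0 3869:03 hobbitfetch"
)
print(row.pid, looks_hung(row))
```

The 3869 minutes of accumulated CPU time in that row is roughly 64 hours, so a threshold far below that would have flagged the process long before it was noticed by hand.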
-- Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX Austin Energy http://www.austinenergy.com
On Sat, Jan 27, 2007 at 11:43:58AM -0600, Daniel J McDonald wrote:
> Every couple of days I note that hobbitfetch is hung:
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 30444 hobbit 25 0 2020 1016 616 R 96.3 0.0 3869:03 hobbitfetch
> I then go kill -9 the hobbitfetch process, and it starts "working" again,
Next time, please do a "kill -6" and run the resulting coredump through gdb to get the stacktrace (see the "Reporting bugs" online help).
If possible, I'd also like to have a copy of the coredump and the hobbitfetch binary.
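
The "kill -6" plus gdb step can be sketched roughly as follows. This is an illustration, not the procedure from the "Reporting bugs" help: the helper names and the paths in the usage comment are hypothetical. It also assumes that core dumps are enabled (a non-zero core size limit) in the environment that starts hobbitfetch, since the limit cannot be raised from outside once the process is running.

```python
import os
import signal


def request_core_dump(pid: int) -> None:
    """Send SIGABRT (signal 6, the same as "kill -6") to the given process.

    With the default signal disposition and core dumps enabled, the
    kernel writes a core file when the process aborts.
    """
    os.kill(pid, signal.SIGABRT)


def gdb_backtrace_command(binary_path: str, core_path: str) -> list[str]:
    """Build a gdb command line that prints the stack trace and exits.

    --batch runs non-interactively; -ex bt executes gdb's backtrace command.
    """
    return ["gdb", "--batch", "-ex", "bt", binary_path, core_path]


# Example usage (placeholder PID and paths; adjust to the local install):
#   import subprocess
#   request_core_dump(30444)
#   subprocess.run(gdb_backtrace_command("/path/to/hobbitfetch", "/path/to/core"))
```

The location and name of the core file depend on the system configuration, so the core path has to be found locally before running gdb.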
Regards, Henrik
On Sat, 2007-01-27 at 23:31 +0100, Henrik Stoerner wrote:
> On Sat, Jan 27, 2007 at 11:43:58AM -0600, Daniel J McDonald wrote:
>> Every couple of days I note that hobbitfetch is hung:
>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>> 30444 hobbit 25 0 2020 1016 616 R 96.3 0.0 3869:03 hobbitfetch
>> I then go kill -9 the hobbitfetch process, and it starts "working" again,
>
> Next time, please do a "kill -6" and run the resulting coredump through gdb to get the stacktrace (see the "Reporting bugs" online help).
> If possible, I'd also like to have a copy of the coredump and the hobbitfetch binary.
I've sent you a couple of those in the past...
On a whim, I specified IP addresses for those servers (rather than 0.0.0.0) and the infinite loop issue went away. I still get the
- Program crashed Fatal signal caught!
message on the hobbitfetch monitoring page, but it is still pulling down data much of the time.
-- Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX Austin Energy http://www.austinenergy.com
participants (2)
- dan.mcdonald@austinenergy.com
- henrik@hswn.dk