Hi
I am migrating from BigBrother to Hobbit. I have installed Hobbit on a new dedicated Linux server, configured bb-hosts and started Hobbit. At first everything seemed to work: I had nice results from the ping, http and dns tests. After that I did not even touch the Hobbit system. Then it all suddenly stopped. All my network tests are purple. bb-network.log is zero bytes. The ONLY error message that I can find is "df: cannot read table of mounted file systems: Permission denied", and that is not a problem. There is also "hobbitd status-board not available" in the Hobbit server log. I guess something must be wrong with bbtest-net. Any ideas?
PID   PPID  STIME ELAPSED    COMMAND
18428 1     Nov30 5-03:06:20 /home/hobbit/server/bin/hobbitlaunch --config=/home/hobbit/server/etc/hobbitlaunch.cfg --env=/h
18429 18428 Nov30 5-03:06:20 hobbitd --pidfile=/var/log/hobbit/hobbitd.pid --restart=/home/hobbit/server/tmp/hobbitd.chk --c
18430 18428 Nov30 5-03:06:15 hobbitd_channel --channel=stachg --log=/var/log/hobbit/history.log hobbitd_history
18431 18430 Nov30 5-03:06:15 hobbitd_history
18432 18428 Nov30 5-03:06:15 hobbitd_channel --channel=clichg --log=/var/log/hobbit/hostdata.log hobbitd_hostdata
18433 18432 Nov30 5-03:06:15 hobbitd_hostdata
18434 18428 Nov30 5-03:06:15 hobbitd_channel --channel=page --log=/var/log/hobbit/page.log hobbitd_alert --checkpoint-file=/
18435 18434 Nov30 5-03:06:15 hobbitd_alert --checkpoint-file=/home/hobbit/server/tmp/alert.chk --checkpoint-interval=600
18436 18428 Nov30 5-03:06:15 hobbitd_channel --channel=status --log=/var/log/hobbit/rrd-status.log hobbitd_rrd --rrddir=/hom
18437 18428 Nov30 5-03:06:15 hobbitd_channel --channel=data --log=/var/log/hobbit/rrd-data.log hobbitd_rrd --rrddir=/home/ho
18438 18436 Nov30 5-03:06:15 hobbitd_rrd --rrddir=/home/hobbit/data/rrd
18439 18437 Nov30 5-03:06:15 hobbitd_rrd --rrddir=/home/hobbit/data/rrd
18440 18428 Nov30 5-03:06:15 hobbitd_channel --channel=client --log=/var/log/hobbit/clientdata.log hobbitd_client
18441 18440 Nov30 5-03:06:15 hobbitd_client
26129 18428 Dec04 22:35:09   bbtest-net --report --ping --checkresponse
9286  1     14:36 01:24      sh -c vmstat 300 2 1>/home/hobbit/client/tmp/hobbit_vmstat.slfitkusap007.9269 2>&1; mv /home/ho
9287  9286  14:36 01:24      vmstat 300 2
Hi All,
I have been working with clientupdate to make our server's re-IP go smoothly, but I am having problems. I put the tar file into ~/download and altered client-local.cfg to contain:
[Foo.bar]
ClientID=server_move
The file under ~/download is called server_move.tar and has 777 permissions.
The problem is that it isn't working and I can't see why. When I run this on foo.bar itself:
~/bin/clientupdate --update=server_move
I get the output:
tar: blocksize = 0
Even adding .tar to the end of the --update value changes nothing. Is there any other way I can see what is happening and why it is going wrong?
Thanks,
Jason.
(Note: the option takes two dashes, --update; Outlook collapses them into one.)
The plot thickens. I can now do a clientupdate... sort of. For some damnable reason it unpacks itself under /usr/local/hobbit/client/hobbit/client/ (the repeated hobbit/client is not a typo). Any ideas why it would do this? Running tar xvf on the files on the server creates the directories in the same directory as the tar file (i.e. no hobbit/client), so why does it happen on the client side? Do I need to alter BBHOME somehow?
Thanks,
Jason.
From: Jones, Jason (Altrincham) Sent: 05 December 2006 13:54 To: hobbit at hswn.dk Subject: [hobbit] clientupdate help
Sounds like you may have created your tar file incorrectly. The somewhat sketchy instructions say to create the tarball relative to the client directory, not root.
GLH
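A minimal sketch of the advice above, assuming GNU or BSD tar. The function names are hypothetical and are not part of Hobbit. The idea is to archive from inside the client directory, so that member names carry no hobbit/client prefix and unpack in place on the client.

```shell
# Hypothetical helper: pack an update tarball whose member paths are
# relative to the client directory.
#   $1 = client directory holding the updated files
#   $2 = output tar file (keep it outside $1, or tar will try to
#        archive its own output)
make_client_update() {
    # -C changes into the client directory before archiving ".",
    # so members are stored as ./bin/..., not hobbit/client/bin/...
    tar -cf "$2" -C "$1" .
}

# Hypothetical helper: list member paths so the layout can be checked
# before the tarball is published.
list_update_members() {
    tar -tf "$1"
}
```

For example, `make_client_update /path/to/staged/client ~/download/server_move.tar` (the staging path is an assumption) followed by `list_update_members ~/download/server_move.tar` should print paths beginning with ./ and none beginning with hobbit/client/.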
From: Jones, Jason (Altrincham)
[mailto:JasonAS_Jones at mentor.com] Sent: Tuesday, December 05, 2006 10:00 AM To: hobbit at hswn.dk Subject: RE: [hobbit] clientupdate help
On Tuesday 05 December 2006 14:39, Strandell, Ralf wrote:
Hi
I am migrating from BigBrother to Hobbit. I have installed Hobbit on a new dedicated Linux server, configured bb-hosts and started Hobbit. At first everything seemed to work: I had nice results from the ping, http and dns tests. After that I did not even touch the Hobbit system. Then it all suddenly stopped. All my network tests are purple. bb-network.log is zero bytes. The ONLY error message that I can find is "df: cannot read table of mounted file systems: Permission denied", and that is not a problem. There is also "hobbitd status-board not available" in the Hobbit server log. I guess something must be wrong with bbtest-net. Any ideas?
Well, this can happen when the hobbit server runs out of disk space ...
-- Buchan Milne ISP Systems Specialist - Monitoring/Authentication Team Leader B.Eng,RHCE(803004789010797),LPIC-2(LPI000074592)
It was not the IPC resources either. The output of ipcs looks similar before and after a restart/recovery. So, if it wasn't disk/memory/cpu/ipcs, then what? I can't build an enterprise-class monitoring system that just shuts down without any known cause.
What kind of filesystem access does Hobbit need? I have a bad habit of restricting access... Oh yes, I had chosen "paranoid" filesystem security just because it was possible.
-----Original Message----- From: Buchan Milne [mailto:bgmilne at staff.telkomsa.net] Sent: Tuesday, December 05, 2006 4:56 PM To: hobbit at hswn.dk Cc: Strandell, Ralf Subject: Re: [hobbit] All network test suddenly purple
On Tue, 2006-12-05 at 18:09 +0200, Strandell, Ralf wrote:
It was not the IPC resources either. The output of ipcs looks similar before and after a restart/recovery. So, if it wasn't disk/memory/cpu/ipcs, then what? I can't build an enterprise-class monitoring system that just shuts down without any known cause.
What kind of filesystem access does Hobbit need? I have a bad habit of restricting access... Oh yes, I had chosen "paranoid" filesystem security just because it was possible.
Ah, the old "msec changed my permissions to something completely bizarre and now it won't run" problem.
I put hobbit clients in the adm and ntools groups, and change the permissions on /etc/mandriva-release and /var/log/messages so that the adm group can read them (in /etc/security/msec/perm.local). Also, on the server, hobbitping has to be installed setuid (just like fping).
The Hobbit server needs at least to be able to write to its subdirectories in /var/lib/hobbit/... Look for a core file: if it crashed, there will be a core in /var/lib/hobbit/server/ or maybe a level deeper.
And check disk space: use the df command.
-- Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX Austin Energy http://www.austinenergy.com
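The two checks suggested above can be sketched roughly as follows. The function names are hypothetical, and the directory to search depends on where Hobbit is installed (this thread mentions both /var/lib/hobbit and /home/hobbit).

```shell
# Hypothetical helper: list core files under a directory tree.
#   $1 = directory to search, e.g. the Hobbit server directory
find_core_files() {
    # Match "core" and "core.<pid>" style names; errors from
    # unreadable subdirectories are discarded.
    find "$1" -type f \( -name core -o -name 'core.*' \) 2>/dev/null
}

# Hypothetical helper: print the percentage of space in use on the
# filesystem that holds $1, taken from column 5 of POSIX "df -P".
disk_use_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

Running `find_core_files /var/lib/hobbit` and `disk_use_percent /var/lib/hobbit` as the hobbit user also shows, as a side effect, whether that user can read the tree at all.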
No, unfortunately not.
I have changed the permissions to "easy" in YaST (SUSE Linux), and the network test still performs only one poll.
Hobbitping is suid root. Fping is run with sudo, and sudo is suid root. Hobbitping works: user "hobbit" can use it to ping IP addresses. It fails to ping hostnames. The /home/hobbit and /var/log/hobbit directory trees are writable. The bbtest page does not update either; it really looks as if the network test does not run at all.
Is there a way to debug the network test to find out what is going on?
-----Original Message----- From: Daniel J McDonald [mailto:dan.mcdonald at austinenergy.com] Sent: Tuesday, December 05, 2006 7:28 PM To: hobbit at hswn.dk Subject: RE: [hobbit] All network test suddenly purple
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
"It fails to ping hostnames."
I have seen the network test stall out if there is a DNS problem, even if the "testip" flag is set. If you turn on debug for the bb-network test (I forget the exact name, but you can look it up in the man pages) then you can see if the tests are getting stuck on a reverse lookup. Even though the Ares code is supposed to time out if a DNS server does not respond, this seems to be wishful thinking in the documentation. My system was a Sun Solaris system.
Don't know how many hosts you are monitoring, but if you were to add some to your hobbit server /etc/hosts file, you might be able to see if this is a DNS issue.
I guess you could check your resolver configuration (/etc/resolv.conf?) and whatever it is on your system that determines precedence for naming services (on Solaris it is /etc/nsswitch.conf, not sure about your brand of Linux).
A shot in the dark, but based on something that bedeviled me for a couple of days...
GLH
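One way to test the DNS theory is to push every monitored hostname through the system resolver and see which ones fail or hang. This is only a sketch: the function names are hypothetical, it assumes host lines in bb-hosts have the form "IP-address hostname # tags", and it relies on getent, which consults whatever sources nsswitch.conf selects.

```shell
# Hypothetical helper: print the hostnames from a bb-hosts style file.
# Only lines whose first field looks like an IPv4 address are used,
# which skips comments and page/group directives.
list_bbhosts_names() {
    awk '$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print $2 }' "$1"
}

# Hypothetical helper: read hostnames on stdin and report whether the
# system resolver can resolve each one.
check_names() {
    while read -r name; do
        if getent hosts "$name" >/dev/null 2>&1; then
            echo "ok   $name"
        else
            echo "FAIL $name"
        fi
    done
}
```

Typical use would be `list_bbhosts_names ~/server/etc/bb-hosts | check_names`, where the bb-hosts path is an assumption. A name that makes the loop pause for a long time is as interesting as one that prints FAIL, since a slow or unresponsive DNS server is exactly the stall described above.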
-----Original Message----- From: Strandell, Ralf [mailto:Ralf.Strandell at silja.com] Sent: Thursday, December 07, 2006 9:13 AM To: hobbit at hswn.dk Subject: RE: [hobbit] All network test suddenly purple
Thanks!
I have now set the --dns=ip option for bbtest-net and now it works like a dream.
Hobbit seems to report a "cannot run fping" condition as a "no icmp echo reply". That's a bit misleading as the problem was not network related. Not a big problem though - now that I know it.
PS. Hobbit is a wonderful piece of code. Much, much better than BigBrother ever was.
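For readers who want to apply the same fix, here is a hedged sketch. It assumes the bbtest-net command line sits on a CMD line in hobbitlaunch.cfg, which is consistent with the process listing at the start of this thread (bbtest-net runs as a child of hobbitlaunch) but should be checked against your own file. The function name is hypothetical; it only prints the modified configuration and never edits the file.

```shell
# Hypothetical helper: print a hobbitlaunch.cfg style file with
# --dns=ip appended to the bbtest-net CMD line, unless that line
# already carries a --dns= option. The input file is not modified.
#   $1 = path to the configuration file
add_dns_ip() {
    sed '/^[[:space:]]*CMD[[:space:]]\{1,\}bbtest-net/{
/--dns=/!s/$/ --dns=ip/
}' "$1"
}
```

Comparing the output with the original, for example `add_dns_ip ~/server/etc/hobbitlaunch.cfg | diff ~/server/etc/hobbitlaunch.cfg -`, shows the single changed line before anything is installed.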
-----Original Message----- From: Hubbard, Greg L [mailto:greg.hubbard at eds.com] Sent: Thursday, December 07, 2006 5:35 PM To: hobbit at hswn.dk Subject: RE: [hobbit] All network test suddenly purple
Well, there is 15 GB of free storage space... Enough free memory and swap. Load is zero. There are not too many processes.
Mem: 646492k total, 314756k used, 331736k free, 51164k buffers
Swap: 1052216k total, 0k used, 1052216k free, 220304k cached
Hobbit has write permission to its home directory and in /tmp, and nowhere else.
Last time I had this problem it was because of a problem with a DNS server. Even though Hobbit is supposed to be immune to DNS issues, it isn't really bulletproof.
GLH
From: Strandell, Ralf [mailto:Ralf.Strandell at silja.com]
Sent: Tuesday, December 05, 2006 6:39 AM
To: hobbit at hswn.dk
Subject: [hobbit] All network test suddenly purple
participants (5)
- bgmilne@staff.telkomsa.net
- dan.mcdonald@austinenergy.com
- greg.hubbard@eds.com
- JasonAS_Jones@mentor.com
- Ralf.Strandell@silja.com