Henrik,
Since you seem to be a fan of 'parse everything on the server', have you ever considered adding the ability to have hobbit use 'agentless clients'?
By agentless, I mean that only the trust relationship between the server and client is established; commands and output are done 'on-the-fly', every 5 minutes.
After writing the VPN/sshbb/email I was thinking it could be pretty easy to add this 'functionality' natively to hobbit.
-- Scott Walters -PacketPusher
Hi Scott,
On Wed, Jan 04, 2006 at 12:45:05PM -0500, Scott Walters wrote:
> Since you seem to be a fan of 'parse everything on the server', have you ever considered adding the ability to have hobbit use 'agentless clients'?
> By agentless, I mean that only the trust relationship between the server and client is established; commands and output are done 'on-the-fly', every 5 minutes.
> After writing the VPN/sshbb/email I was thinking it could be pretty easy to add this 'functionality' natively to hobbit.
I guess it would, and when I wrote the Hobbit client I did it with this in mind. It would be fairly trivial to provide a wrapper for the client side scripts that runs them through a VPN or SSH tunnel.
It's not something that I plan on using myself, but it could easily be a common add-on to Hobbit. I do prefer to run the clients locally on the servers because that scales much better than having one server pull all of this data into Hobbit. Parsing the data doesn't take nearly as long as collecting it...
Regards, Henrik
On Wed, 2006-01-04 at 23:49 +0100, Henrik Stoerner wrote:
> Hi Scott,
> On Wed, Jan 04, 2006 at 12:45:05PM -0500, Scott Walters wrote:
> > After writing the VPN/sshbb/email I was thinking it could be pretty easy to add this 'functionality' natively to hobbit.
> I guess it would, and when I wrote the Hobbit client I did it with this in mind. It would be fairly trivial to provide a wrapper for the client side scripts that runs them through a VPN or SSH tunnel.
This would be very useful to me - "hobbit central" ala bb-central.
> It's not something that I plan on using myself, but it could easily be a common add-on to Hobbit. I do prefer to run the clients locally on the servers because that scales much better than having one server pull all of this data into Hobbit. Parsing the data doesn't take nearly as long as collecting it...
But having one or two servers that poll all of the others does scale well, because you don't have to install (and upgrade) hobbit clients on a hundred machines - just set up an rsa key and you are done. If the primary hobbit display/alarm/parse work is too much with the polling added, just use a second hobbit server for polling/parsing and feed the results to the display server...
-- Daniel J McDonald, CCIE # 2495, CNX, CISSP # 78281 Austin Energy dan.mcdonald at austinenergy.com
gpg Key: http://austinnetworkdesign.com/pgp.key Key fingerprint = B527 F53D 0C8C D38B DCC7 901D 2F19 A13A 22E8 A76A
Daniel J McDonald wrote:
> But having one or two servers that poll all of the others does scale well, because you don't have to install (and upgrade) hobbit clients on a hundred machines - just set up an rsa key and you are done. If the primary hobbit display/alarm/parse work is too much with the polling added, just use a second hobbit server for polling/parsing and feed the results to the display server...
I disagree. The distributed system scales much better, as the remote servers are sending in their results in parallel. While fping is able to test remote hosts in parallel, the other tests are done serially (bb-fetch, etc).
Let's say you have 1000 hosts. Let's then, just for fun, pretend that it will only take 1 second to log into each remote host, run several tests, and receive the result (it would actually take a bit longer than that).
1000 hosts x 1 second = 1000 seconds, and 1000 seconds / 60 = 16.666 minutes to poll those hosts!
So then you can say, oh well, just have 2-3 hobbit servers doing the polling. Now you have 3 hobbit servers to deal with: monitoring them, upgrading them, etc.
Now let's look at a *real world* example of how long it takes to ssh in and execute a command:
[hobbit@hobbit ~]$ time ssh myhost.net df -h
Filesystem             size   used  avail capacity  Mounted on
/dev/dsk/c5t0d0s0       30G    10G    20G    35%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                   6.6G  1000K   6.6G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
fd                       0K     0K     0K     0%    /dev/fd
swap                   6.6G    16K   6.6G     1%    /tmp
swap                   6.6G    32K   6.6G     1%    /var/run
/dev/md/dsk/d0         639G   116G   523G    19%    /raid
/dev/md/dsk/d1         807G   504G   304G    63%    /raid2

*real    0m1.912s*
user    0m0.022s
sys     0m0.008s
Almost 2 seconds there... and just for one command. At that rate 1000 servers take about 32 minutes to poll serially, so even 2 hobbit servers polling simultaneously will still take over 15 minutes. Having hobbit do the ssh's in parallel wouldn't work either: I have tried something similar on far fewer hosts, and even using the -c blowfish option the server CPU still hit 100% from all the overhead.
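The arithmetic behind these estimates can be reproduced with a small helper. This is only a sketch: the function name `poll_minutes` is made up for illustration, and it assumes hosts are split evenly across the polling servers and that each server polls its share strictly one host at a time.

```shell
#!/bin/sh
# Estimate the wall-clock minutes needed for one serial polling pass.
# Arguments: number of hosts, seconds per host, number of polling servers.
# Assumes an even split of hosts and strictly serial polling on each server.
poll_minutes() {
    LC_ALL=C awk -v hosts="$1" -v secs="$2" -v servers="$3" \
        'BEGIN { printf "%.1f\n", hosts * secs / servers / 60 }'
}

poll_minutes 1000 1 1        # optimistic 1-second guess: 16.7
poll_minutes 1000 1.912 1    # measured ssh + df time:    31.9
poll_minutes 1000 1.912 2    # split over two pollers:    15.9
```

All three results are well past a 5-minute refresh interval, which is the point being made.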
The way that I get around this is to have bbproxy running on a DMZ host, and have the hobbit/bb clients configured to use the bbproxy IP as their BBDISPLAY, which then forwards the traffic out of the DMZ to my hobbit server. Not 100% secure, but using bb-fetch isn't either (an attacker could compromise one of the remote servers and modify one of the commands that the hobbit user executes, thus giving them the ability to communicate with the hobbit server, injecting something to break the parsing engine, buffer overflows, etc). I will stop talking about that now as I am getting off subject :)
I agree that having similar functionality to bb-fetch could be useful for a *few* remote/DMZ hosts, but it certainly doesn't scale well. Once you reach a number of hosts whose polling time exceeds the hobbit refresh interval you are done. I know it would be "nice" if we didn't have to upgrade remote clients and maintain them, but your solution involves ssh keys, so just use those same keys and a script to roll out the updates :)
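To put a number on "you are done", the limit can be computed from the refresh interval and the per-host polling time. The sketch below assumes the 5-minute (300-second) interval mentioned earlier in the thread and strictly serial polling; `max_hosts` and `pollers_needed` are hypothetical helper names, not part of Hobbit.

```shell
#!/bin/sh
# Largest number of hosts one serial poller can cover in a refresh interval.
# Arguments: refresh interval in seconds, seconds needed per host.
max_hosts() {
    LC_ALL=C awk -v interval="$1" -v secs="$2" \
        'BEGIN { printf "%d\n", int(interval / secs) }'
}

# Number of serial pollers needed to cover all hosts within one interval.
# Arguments: number of hosts, refresh interval in seconds, seconds per host.
pollers_needed() {
    LC_ALL=C awk -v hosts="$1" -v interval="$2" -v secs="$3" \
        'BEGIN { per = int(interval / secs)
                 printf "%d\n", int((hosts + per - 1) / per) }'
}

max_hosts 300 1              # 300 hosts at 1 second each
max_hosts 300 1.912          # 156 hosts at the measured ssh time
pollers_needed 1000 300 1.912   # 7 pollers for 1000 hosts
```

With the measured 1.912 seconds per host, a single serial poller tops out at roughly 156 hosts per 5-minute cycle.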
-Charles
On Thursday 05 January 2006 18:21, Charles Jones wrote:
> Let's say you have 1000 hosts. Let's then, just for fun, pretend that it will only take 1 second to log into each remote host, run several tests, and receive the result (it would actually take a bit longer than that).
> 1000 hosts x 1 second = 1000 seconds, and 1000 seconds / 60 = 16.666 minutes to poll those hosts!
> So then you can say, oh well, just have 2-3 hobbit servers doing the polling. Now you have 3 hobbit servers to deal with: monitoring them, upgrading them, etc.
> Now let's look at a *real world* example of how long it takes to ssh in and execute a command:
> [hobbit@hobbit ~]$ time ssh myhost.net df -h
> Filesystem             size   used  avail capacity  Mounted on
> /dev/dsk/c5t0d0s0       30G    10G    20G    35%    /
> /devices                 0K     0K     0K     0%    /devices
> ctfs                     0K     0K     0K     0%    /system/contract
> proc                     0K     0K     0K     0%    /proc
> mnttab                   0K     0K     0K     0%    /etc/mnttab
> swap                   6.6G  1000K   6.6G     1%    /etc/svc/volatile
> objfs                    0K     0K     0K     0%    /system/object
> fd                       0K     0K     0K     0%    /dev/fd
> swap                   6.6G    16K   6.6G     1%    /tmp
> swap                   6.6G    32K   6.6G     1%    /var/run
> /dev/md/dsk/d0         639G   116G   523G    19%    /raid
> /dev/md/dsk/d1         807G   504G   304G    63%    /raid2
>
> *real    0m1.912s*
> user    0m0.022s
> sys     0m0.008s
> Almost 2 seconds there... and just for one command.
I keep wondering how long the equivalent snmp query takes ... or in fact gathering all the data asynchronously via snmp ...
Regards, Buchan
-- Buchan Milne ISP Systems Specialist B.Eng,RHCE(803004789010797),LPIC-2(LPI000074592)
On Thu, Jan 05, 2006 at 07:02:52PM +0200, Buchan Milne wrote:
> On Thursday 05 January 2006 18:21, Charles Jones wrote:
> > *real 0m1.912s* user 0m0.022s sys 0m0.008s
> > Almost 2 seconds there....and just for one command.
> I keep wondering how long the equivalent snmp query takes ... or in fact gathering all the data asynchronously via snmp ...
Let's see ... Net-SNMP includes an "snmpdf" command:
$ time snmpdf -v 1 -c somepassword somehost
Description              size (kB)            Used       Available Used%
/                          5969124          754896         5214228   12%

real    0m0.201s
user    0m0.154s
sys     0m0.019s

$ time ssh somehost df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1              5969124    754896   4911004  14% /

real    0m0.535s
user    0m0.011s
sys     0m0.004s
So: 0.2 seconds for snmp, 0.5 seconds for ssh. (No, I don't know why they calculate the available disk size differently - df probably leaves out the 5% filesystem space that is reserved for "root" use).
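The guess about the reserved space can be checked against the numbers in the two outputs. This is only a quick sanity check, and the variable names are chosen here for illustration; it shows the gap is close to 5% of the filesystem size, which fits the reserved-blocks explanation without proving it.

```shell
#!/bin/sh
# Figures copied from the snmpdf and df output above (all in kB).
SIZE=5969124
USED=754896
DF_AVAIL=4911004

# snmpdf reports "Available" as plain size minus used.
echo $((SIZE - USED))                 # 5214228

# Space that df does not count as available.
HIDDEN=$((SIZE - USED - DF_AVAIL))
echo "$HIDDEN"                        # 303224

# The same amount as a percentage of the filesystem size.
LC_ALL=C awk -v h="$HIDDEN" -v s="$SIZE" \
    'BEGIN { printf "%.1f%%\n", 100 * h / s }'    # 5.1%
```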
SNMP probably wins because it is UDP based, so you avoid a lot of overhead from the TCP connection setup. Plus the SNMP daemon is running, so it doesn't need to start a new process to respond.
But I do agree with Charles - using a "bbfetch" style method of pulling data from clients to the server only works for a small number of hosts. On the scale that I work with on a daily basis - 2000 hosts or more - it is simply not practical to contact all hosts every 5 minutes.
I'm still willing to implement the agent-less data collection in Hobbit, because sometimes that is just going to be the only way you can get information about a server. So it is an OK way of doing this, if you know what it should - and should not - be used for.
(BTW, the issue that was raised about updates being easier if you don't have to deploy them on all clients - that's a non-issue. It boils down to whether or not you have an (automated) procedure for software and patch distribution - if you don't have that, then you're in trouble no matter how Hobbit collects data.)
Regards, Henrik
On Thu, 5 Jan 2006, Buchan Milne wrote:
> > Almost 2 seconds there....and just for one command.
> I keep wondering how long the equivalent snmp query takes ... or in fact gathering all the data asynchronously via snmp ...
I've always thought that SNMP would be great for UNIX. But every time I have tried to really make it work, it just doesn't: SNMP agent issues, OID issues, polling issues, UDP network issues.
My previous experiences have been so heinous that I can easily recognize my own bias against SNMP for Unix at this point.
But what is weird is that it works so well for comm/network equipment.
-- Scott Walters -PacketPusher
On Thu, 5 Jan 2006, Charles Jones wrote:
> I disagree. The distributed system scales much better, as the remote servers are sending in their results in parallel.
I think we could architect the agentless solution to run in parallel, or with some sort of async scheduler/threads.
> Let's say you have 1000 hosts. Let's then, just for fun, pretend that it will only take 1 second to log into each remote host, run several tests, and receive the result (it would actually take a bit longer than that).
> 1000 hosts x 1 second = 1000 seconds, and 1000 seconds / 60 = 16.666 minutes to poll those hosts!
1 sec is *very* optimistic ;) So your point is very clear that a generic serial "for host in x y z" would not scale at all.
> either, I have tried something similar on far fewer hosts, and even using -c blowfish option the server CPU still hit 100% from all the overhead.
I've found the -c blowfish only helps when you are pushing a lot of data around (ufsdump 0fc - | ssh -c blowfish).
> commands that the hobbit user executes, thus giving them the ability to communicate with the hobbit server, injecting something to break the parsing engine, buffer overflows, etc). I will stop talking about that now as I am getting off subject :)
If that's the easiest way to get into your network, you get a gold star ;)
> I agree that having similar functionality to bb-fetch could be useful for a *few* remote/DMZ hosts, but it certainly doesn't scale well. Once you reach a number of hosts whose polling time exceeds the hobbit refresh interval you are done. I know it would be "nice" if we didn't have to upgrade remote clients and maintain them, but your solution involves ssh keys, so just use those same keys and a script to roll out the updates :)
True, and I am not sold on the agentless clients idea either, but we've got such a great framework to try it on.
The first design decision in my mind would be whether by agentless we mean
1) install/run/uninstall the hobbit client every 5m. I know this sounds horribly inefficient, but I am attracted to the simplicity of agent and agentless machines being the 'same'. Or just automagically install the client if the trust exists . . . . .
2) only running the exact OS commands necessary and capturing the output. This would require some new code on the server. But if done right, it could perhaps replace existing clients.
abstract the collection from the delivery . . . .
-- Scott Walters -PacketPusher
On Thu, Jan 05, 2006 at 01:46:36PM -0500, Scott Walters wrote:
> > I agree that having similar functionality to bb-fetch could be useful for a *few* remote/DMZ hosts, but it certainly doesn't scale well.
> True, and I am not sold on the agentless clients idea either, but we've got such a great framework to try it on.
> The first design decision in my mind would be whether by agentless we mean
> 1) install/run/uninstall the hobbit client every 5m. I know this sounds horribly inefficient, but I am attracted to the simplicity of agent and agentless machines being the 'same'. Or just automagically install the client if the trust exists . . . . .
> 2) only running the exact OS commands necessary and capturing the output. This would require some new code on the server. But if done right, it could perhaps replace existing clients.
> abstract the collection from the delivery . . . .
You can easily combine 1) and 2). Running something like this on the client-polling server would do it:
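CLIENT="www.foo.com"
CLIENTOS="linux"
# Start the message with the "client" header line
echo "client $CLIENT.$CLIENTOS" >/tmp/msg-$CLIENT.txt
# Feed the local client script to a shell on the remote host and append its output
ssh $CLIENT < $BBHOME/bin/hobbitclient-$CLIENTOS.sh >>/tmp/msg-$CLIENT.txt
# Send the collected message to the Hobbit server
$BB $BBDISP "@" </tmp/msg-$CLIENT.txt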
would run the normal client-side scripts without having them installed on each client box, and send the output to the Hobbit server in a normal "client" message. There is an issue with the OS'es that need special tools installed (usually to collect the memory statistics), but that can be worked around.
Regards, Henrik
> You can easily combine 1) and 2). Running something like this on the client-polling server would do it:
Wow, I should have known the hobbit client would have been this clean already ;)
Real world:
[hobbit@myhost bin]# cat hobbitremote.sh
#!/bin/sh -x

CLIENT="myclient"
CLIENTOS="aix"

echo "client $CLIENT.$CLIENTOS" >/tmp/msg-$CLIENT.txt
ssh bb@$CLIENT < /usr/local/hobbit/client/bin/hobbitclient-$CLIENTOS.sh >> /tmp/msg-$CLIENT.txt
#$BB $BBDISP "@" </tmp/msg-$CLIENT.txt
[hobbit@dev1 bin]# time ./hobbitremote.sh
+ CLIENT=aixserver
+ CLIENTOS=aix
+ echo client aixserver.aix
+ ssh bb@aixserver
Pseudo-terminal will not be allocated because stdin is not a terminal.
ksh[39]: top: not found.

real    0m7.525s
user    0m0.120s
sys     0m0.010s
Seven seconds. But I am pretty sure 5 of that is the ps of the machine.
Needless to say, this would need to be processed in parallel to handle scaling, but it's awesome how easy the proof of concept was.
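One way the parallel step might look is sketched below, using `xargs -P` to keep a bounded number of ssh sessions running at once. Everything here is an assumption for illustration: the host list file and its "hostname ostype" format, the variable names, the default paths, and the `FETCH` override are all hypothetical, and `-P` is an extension found in GNU and BSD xargs rather than a POSIX feature. The sketch only collects the data; delivery would still be the `$BB $BBDISP "@"` step from Henrik's snippet.

```shell
#!/bin/sh
# Hypothetical settings; adjust to the local installation.
HOSTLIST=${HOSTLIST:-/tmp/agentless-hosts.txt}  # one "hostname ostype" pair per line
MAXJOBS=${MAXJOBS:-10}                          # ssh sessions allowed at once
SCRIPTDIR=${SCRIPTDIR:-/usr/local/hobbit/client/bin}
OUTDIR=${OUTDIR:-/tmp}
FETCH=${FETCH:-ssh}                             # command used to reach a client
export SCRIPTDIR OUTDIR FETCH

# Collect client data from every listed host, up to MAXJOBS at a time.
poll_all() {
    # xargs -n 2 hands each "hostname ostype" pair to its own sh -c,
    # and -P keeps up to MAXJOBS of those shells running concurrently.
    xargs -n 2 -P "$MAXJOBS" sh -c '
        host=$1 os=$2
        out="$OUTDIR/msg-$host.txt"
        # Header line of the client message, then the remote script output.
        echo "client $host.$os" >"$out"
        $FETCH "$host" <"$SCRIPTDIR/hobbitclient-$os.sh" >>"$out"
    ' poll <"$HOSTLIST"
}
```

Calling `poll_all` after filling in the host list writes one `msg-HOST.txt` file per client. Keeping `MAXJOBS` modest matters, given the earlier report that many simultaneous ssh sessions drove the server CPU to 100%.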
-- Scott Walters -PacketPusher
participants (5)
- bgmilne@staff.telkomsa.net
- dan.mcdonald@austinenergy.com
- henrik@hswn.dk
- jonescr@cisco.com
- scott@PacketPushers.com