Hi all
Henrik, I think this might be one for you. My Hobbit server crashed and died.
This happened before, a few months ago, and I shrugged it off - sometimes sh1t happens. Then it happened last week again. This time I was concerned. Now it has just happened again, about 40 minutes ago.
I tried to restart hobbit, without much luck, then I walked away, put my son into bed, and then tried again. This time it worked.
The logs never showed anything conclusive, but maybe I just don't know what I am looking for.
The symptoms were the same all three times. All "passive" server-based tests go purple. By passive server-based, I mean conn, http, content, ssh, ftp, ftps, etc.: the tests that do not rely on a client. bbd and bbtest also went purple.
All client based tests were unaffected. Graphing worked as normal. And alerts were being sent out.
I am running 4.2 with all-in-one patch, the Sun if-config patch, the bbwin update and the devmon update.
Has anybody seen anything like this before? Anybody got any tips on where to look for the problem, or how to diagnose this?
Regards Vernon
NOTICE: This email and any attachments are confidential. They may contain legally privileged information or copyright material. You must not read, copy, use or disclose them without authorisation. If you are not an intended recipient, please contact us at once by return email and then delete both messages and all attachments.
Your description sounds very much as if the only thing that stopped were the network tests (bbtest-net). Since the client-side tests are updating, network tests go purple and alerts go out, I think that is where the problem is. "bbtest" going purple also points in this direction.
Next time it happens, see if there's a "bbtest-net" process running (and possibly a "hobbitping" or "fping" process as well); if there is, kill it with a "kill -6" to make it dump core. Then do the usual stuff of getting a stacktrace from the core file ( http://www.hswn.dk/hobbit/help/known-issues.html#bugreport )
Are you running bbtest-net with the "--no-ares" option? If so, a hung or slow DNS server can make your network tests run very slowly.
Henrik
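The check Henrik describes can be scripted. What follows is only a minimal sketch, assuming a Linux-style `ps -eo pid,comm` listing; the function names and the sample listing are illustrative and not part of Hobbit. Sending SIGABRT is what "kill -6" does on Linux.

```python
import os
import signal

# Process names Henrik suggests looking for when the network tests hang.
SUSPECTS = ("bbtest-net", "hobbitping", "fping")

def find_test_processes(ps_output, names=SUSPECTS):
    """Return (pid, command) pairs from `ps -eo pid,comm` style output."""
    found = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip the header line and anything malformed
        pid, comm = int(parts[0]), parts[1].strip()
        if comm in names:
            found.append((pid, comm))
    return found

def abort_with_core(pid):
    """Same effect as `kill -6 PID`: SIGABRT makes the process dump core,
    provided core dumps are enabled for it (for example via `ulimit -c`)."""
    os.kill(pid, signal.SIGABRT)

# Hypothetical listing, for illustration only.
sample = """  PID COMMAND
 2211 hobbitd
 4830 bbtest-net
 4831 fping
"""
stuck = find_test_processes(sample)
print(stuck)
```

On a live system the listing would come from running `ps -eo pid,comm`, and `abort_with_core()` would be called only for the PIDs you have confirmed are hung.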
Hmm, that is what I suspected, because I found this in the log file after sending my mail. This might be something conclusive, if I had even the foggiest idea what it meant. Hoping some of the smarter list members can assist.
This log entry is not time-stamped, but it was the last entry before I did the restart.
From hobbitlaunch.cfg
[bbnet]
        ENVFILE /usr/lib/hobbit/server/etc/hobbitserver.cfg
        NEEDS hobbitd
        CMD bbtest-net --report --ping --checkresponse
        LOGFILE $BBSERVERLOGS/bb-network.log
        INTERVAL 5m
From bb-network.log
*** glibc detected *** bbtest-net: double free or corruption (out): 0x000000000a96dd20 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3db7a71634]
/lib64/libc.so.6(cfree+0x8c)[0x3db7a74c5c]
bbtest-net[0x42493a]
bbtest-net[0x422bdf]
bbtest-net[0x422d7e]
bbtest-net[0x40f7d7]
bbtest-net[0x4076cc]
bbtest-net[0x4088c6]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3db7a1d8b4]
bbtest-net[0x4039e9]
======= Memory map: ========
00400000-00430000 r-xp 00000000 fd:01 389432 /usr/lib/hobbit/server/bin/bbtest-net
00630000-00631000 rw-p 00030000 fd:01 389432 /usr/lib/hobbit/server/bin/bbtest-net
00631000-00637000 rw-p 00631000 00:00 0
00830000-00832000 rw-p 00030000 fd:01 389432 /usr/lib/hobbit/server/bin/bbtest-net
0a8d1000-0aa1b000 rw-p 0a8d1000 00:00 0
3224600000-3224638000 r-xp 00000000 fd:01 68061 /usr/lib64/libldap-2.3.so.0.2.15
3224638000-3224838000 ---p 00038000 fd:01 68061 /usr/lib64/libldap-2.3.so.0.2.15
3224838000-322483a000 rw-p 00038000 fd:01 68061 /usr/lib64/libldap-2.3.so.0.2.15
3366400000-3366443000 r-xp 00000000 fd:00 75879 /lib64/libssl.so.0.9.8b
3366443000-3366643000 ---p 00043000 fd:00 75879 /lib64/libssl.so.0.9.8b
3366643000-3366649000 rw-p 00043000 fd:00 75879 /lib64/libssl.so.0.9.8b
3368000000-3368125000 r-xp 00000000 fd:00 75876 /lib64/libcrypto.so.0.9.8b
3368125000-3368325000 ---p 00125000 fd:00 75876 /lib64/libcrypto.so.0.9.8b
3368325000-3368344000 rw-p 00125000 fd:00 75876 /lib64/libcrypto.so.0.9.8b
3368344000-3368348000 rw-p 3368344000 00:00 0
385c600000-385c63b000 r-xp 00000000 fd:00 75779 /lib64/libsepol.so.1
385c63b000-385c83b000 ---p 0003b000 fd:00 75779 /lib64/libsepol.so.1
385c83b000-385c83c000 rw-p 0003b000 fd:00 75779 /lib64/libsepol.so.1
385c83c000-385c846000 rw-p 385c83c000 00:00 0
385ca00000-385ca15000 r-xp 00000000 fd:00 75786 /lib64/libselinux.so.1
385ca15000-385cc15000 ---p 00015000 fd:00 75786 /lib64/libselinux.so.1
385cc15000-385cc17000 rw-p 00015000 fd:00 75786 /lib64/libselinux.so.1
385cc17000-385cc18000 rw-p 385cc17000 00:00 0
385ce00000-385ce8f000 r-xp 00000000 fd:01 68057 /usr/lib64/libkrb5.so.3.3
385ce8f000-385d08e000 ---p 0008f000 fd:01 68057 /usr/lib64/libkrb5.so.3.3
385d08e000-385d092000 rw-p 0008e000 fd:01 68057 /usr/lib64/libkrb5.so.3.3
385d600000-385d608000 r-xp 00000000 fd:01 68055 /usr/lib64/libkrb5support.so.0.1
385d608000-385d807000 ---p 00008000 fd:01 68055 /usr/lib64/libkrb5support.so.0.1
385d807000-385d808000 rw-p 00007000 fd:01 68055 /usr/lib64/libkrb5support.so.0.1
385da00000-385da24000 r-xp 00000000 fd:01 68056 /usr/lib64/libk5crypto.so.3.1
385da24000-385dc23000 ---p 00024000 fd:01 68056 /usr/lib64/libk5crypto.so.3.1
385dc23000-385dc25000 rw-p 00023000 fd:01 68056 /usr/lib64/libk5crypto.so.3.1
385de00000-385de2c000 r-xp 00000000 fd:01 68058 /usr/lib64/libgssapi_krb5.so.2.2
385de2c000-385e02c000 ---p 0002c000 fd:01 68058 /usr/lib64/libgssapi_krb5.so.2.2
385e02c000-385e02e000 rw-p 0002c000 fd:01 68058 /usr/lib64/libgssapi_krb5.so.2.2
3af4400000-3af440d000 r-xp 00000000 fd:01 68095 /usr/lib64/liblber-2.3.so.0.2.15
3af440d000-3af460d000 ---p 0000d000 fd:01 68095 /usr/lib64/liblber-2.3.so.0.2.15
3af460d000-3af460e000 rw-p 0000d000 fd:01 68095 /usr/lib64/liblber-2.3.so.0.2.15
3db7600000-3db761a000 r-xp 00000000 fd:00 75791 /lib64/ld-2.5.so
3db781a000-3db781b000 r--p 0001a000 fd:00 75791 /lib64/ld-2.5.so
3db781b000-3db781c000 rw-p 0001b000 fd:00 75791 /lib64/ld-2.5.so
3db7a00000-3db7b4a000 r-xp 00000000 fd:00 75797 /lib64/libc-2.5.so
3db7b4a000-3db7d49000 ---p 0014a000 fd:00 75797 /lib64/libc-2.5.so
3db7d49000-3db7d4d000 r--p 00149000 fd:00 75797 /lib64/libc-2.5.so
3db7d4d000-3db7d4e000 rw-p 0014d000 fd:00 75797 /lib64/libc-2.5.so
3db7d4e000-3db7d53000 rw-p 3db7d4e000 00:00 0
3db7e00000-3db7e02000 r-xp 00000000 fd:00 75840 /lib64/libdl-2.5.so
3db7e02000-3db8002000 ---p 00002000 fd:00 75840 /lib64/libdl-2.5.so
3db8002000-3db8003000 r--p 00002000 fd:00 75840 /lib64/libdl-2.5.so
3db8003000-3db8004000 rw-p 00003000 fd:00 75840 /lib64/libdl-2.5.so
3db8200000-3db8218000 r-xp 00000000 fd:01 67958 /usr/lib64/libsasl2.so.2.0.22
3db8218000-3db8418000 ---p 00018000 fd:01 67958 /usr/lib64/libsasl2.so.2.0.22
3db8418000-3db8419000 rw-p 00018000 fd:01 67958 /usr/lib64/libsasl2.so.2.0.22
3db8600000-3db8614000 r-xp 00000000 fd:01 67257 /usr/lib64/libz.so.1.2.3
3db8614000-3db8813000 ---p 00014000 fd:01 67257 /usr/lib64/libz.so.1.2.3
3db8813000-3db8814000 rw-p 00013000 fd:01 67257 /usr/lib64/libz.so.1.2.3
3db9a00000-3db9a09000 r-xp 00000000 fd:00 76086 /lib64/libcrypt-2.5.so
3db9a09000-3db9c08000 ---p 00009000 fd:00 76086 /lib64/libcrypt-2.5.so
3db9c08000-3db9c09000 r--p 00008000 fd:00 76086 /lib64/libcrypt-2.5.so
3db9c09000-3db9c0a000 rw-p 00009000 fd:00 76086 /lib64/libcrypt-2.5.so
3db9c0a000-3db9c38000 rw-p 3db9c0a000 00:00 0
3dba600000-3dba611000 r-xp 00000000 fd:00 76082 /lib64/libresolv-2.5.so
3dba611000-3dba811000 ---p 00011000 fd:00 76082 /lib64/libresolv-2.5.so
3dba811000-3dba812000 r--p 00011000 fd:00 76082 /lib64/libresolv-2.5.so
3dba812000-3dba813000 rw-p 00012000 fd:00 76082 /lib64/libresolv-2.5.so
3dba813000-3dba815000 rw-p 3dba813000 00:00 0
3dbaa00000-3dbaa02000 r-xp 00000000 fd:00 76083 /lib64/libcom_err.so.2.1
3dbaa02000-3dbac01000 ---p 00002000 fd:00 76083 /lib64/libcom_err.so.2.1
3dbac01000-3dbac02000 rw-p 00001000 fd:00 76083 /lib64/libcom_err.so.2.1
3dbba00000-3dbba02000 r-xp 00000000 fd:00 76080 /lib64/libkeyutils-1.2.so
3dbba02000-3dbbc01000 ---p 00002000 fd:00 76080 /lib64/libkeyutils-1.2.so
3dbbc01000-3dbbc02000 rw-p 00001000 fd:00 76080 /lib64/libkeyutils-1.2.so
3dbc200000-3dbc20d000 r-xp 00000000 fd:00 75803 /lib64/libgcc_s-4.1.2-20080102.so.1
3dbc20d000-3dbc40d000 ---p 0000d000 fd:00 75803 /lib64/libgcc_s-4.1.2-20080102.so.1
3dbc40d000-3dbc40e000 rw-p 0000d000 fd:00 75803 /lib64/libgcc_s-4.1.2-20080102.so.1
2b224ebb4000-2b224ebb6000 rw-p 2b224ebb4000 00:00 0
2b224ebc1000-2b224ebc9000 rw-p 2b224ebc1000 00:00 0
2b224ebc9000-2b224ebd3000 r-xp 00000000 fd:00 75873 /lib64/libnss_files-2.5.so
2b224ebd3000-2b224edd2000 ---p 0000a000 fd:00 75873 /lib64/libnss_files-2.5.so
2b224edd2000-2b224edd3000 r--p 00009000 fd:00 75873 /lib64/libnss_files-2.5.so
2b224edd3000-2b224edd4000 rw-p 0000a000 fd:00 75873 /lib64/libnss_files-2.5.so
2b2250000000-2b2250021000 rw-p 2b2250000000 00:00 0
2b2250021000-2b2254000000 ---p 2b2250021000 00:00 0
7fff5bee0000-7fff5bef6000 rw-p 7fff5bee0000 00:00 0 [stack]
ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]
-----Original Message----- From: Henrik Stoerner [mailto:henrik at hswn.dk] Sent: Thursday, 9 October 2008 9:18 PM To: hobbit at hswn.dk Subject: Re: [hobbit] Hobbit server crashing
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
It would be really interesting to find out what these addresses correspond to in the source code. If you have gdb on this host, could you run
    gdb ~hobbit/server/bin/bbtest-net
then at the "(gdb)" prompt enter the command
    (gdb) l *0x42493a
Hopefully that gives you something like
    henrik at osiris:~/hobbit$ gdb ./bbnet/bbtest-net
    (gdb) l *0x8053ce7
    0x8053ce7 is in bbgen_ASN1_UTCTIME (contest.c:392).
If it does, it would be nice to see.
Henrik
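To look up every bbtest-net frame rather than just the first, the addresses can be pulled out of the glibc backtrace and turned into gdb "l *ADDR" commands. This is a rough sketch only: the helper name `frames_in` is made up for illustration, and gdb can resolve the addresses to source lines only if the binary carries debugging symbols.

```python
import re

# glibc prints frames as "module[0xADDR]" or "module(symbol+0xOFF)[0xADDR]".
FRAME_RE = re.compile(r"(\S+?)(?:\([^)]*\))?\[(0x[0-9a-fA-F]+)\]")

def frames_in(backtrace, module):
    """Return the addresses of all backtrace frames that belong to `module`."""
    return [addr for name, addr in FRAME_RE.findall(backtrace) if name == module]

# The backtrace from bb-network.log, as posted in this thread.
backtrace = """
/lib64/libc.so.6[0x3db7a71634]
/lib64/libc.so.6(cfree+0x8c)[0x3db7a74c5c]
bbtest-net[0x42493a]
bbtest-net[0x422bdf]
bbtest-net[0x422d7e]
bbtest-net[0x40f7d7]
bbtest-net[0x4076cc]
bbtest-net[0x4088c6]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3db7a1d8b4]
bbtest-net[0x4039e9]
"""

addresses = frames_in(backtrace, "bbtest-net")
for addr in addresses:
    # One line per frame, ready to paste at the (gdb) prompt.
    print("l *" + addr)
```

The frames appear innermost first, so the first address printed is the one closest to the failing free().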
That would be too easy. Will talk to my Red-Hat guys in the morning.
-bash-3.2$ gdb /usr/lib/hobbit/server/bin/bbtest-net
GNU gdb Red Hat Linux (6.5-37.el5_2.1rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib64/libthread_db.so.1".

(gdb) l *0x42493a
No symbol table is loaded. Use the "file" command.
We haven't been putting the Windows Server msgs column on our bb2 page, nor alerting on msgs, because of the number of events that seem to trigger warnings or errors.
Would anyone be willing to share with us your event log filter entries for BBWIN, to use as a starting point?
TIA
Tom Kauffman
Hi Tom,
What you are describing is a process of shaping the exceptions per server. I monitor all of our logs here. We have about 10 servers with different functions, so on each client (I use BBWin with local configs) I initially looked through the logs, found the pointless errors and ignored those right away. I still sometimes have to adjust the list when applications or hardware change. Here is an excerpt from two different configuration files to give you an example of what I mean:
Server1
<setting name="alwaysgreen" value="false" />
<match logfile="System" type="error" delay="1h" alarmcolor="red" />
<match logfile="Application" type="error" delay="1h" alarmcolor="red" />
<ignore logfile="System" eventid="1111" />
<ignore logfile="System" eventid="16" />
<ignore logfile="System" eventid="12294" />
<ignore logfile="System" eventid="5805" />
<ignore logfile="System" eventid="5723" />
<ignore logfile="Application" eventid="1000" />
<ignore logfile="Application" eventid="11" />
<ignore logfile="Application" eventid="34113" />
<ignore logfile="System" source="Backup Exec" />

Server2
<setting name="alwaysgreen" value="false" />
<match logfile="System" type="error" delay="1h" alarmcolor="red" />
<match logfile="Application" type="error" delay="1h" alarmcolor="red" />
<ignore logfile="System" eventid="1111" />
<ignore logfile="System" eventid="16" />
<ignore logfile="System" eventid="39" />
<ignore logfile="System" eventid="1106" />
<ignore logfile="System" eventid="61" />
<ignore logfile="Application" eventid="107" />
<ignore logfile="Application" eventid="1106" />
<ignore logfile="Application" eventid="1000" />
<ignore logfile="Application" eventid="101" />
<ignore logfile="Application" eventid="2002" />
<ignore logfile="Application" eventid="1015" />
As you can see, the list is not huge. Our servers are ones I inherited, so they experience more weirdness than I'm used to with my own builds, but you can always find useless errors in a Windows event log. Hope this helps.
Thank You.
Rafal Roginela
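Since several of the `<ignore>` entries above repeat from server to server, one option is to generate them from a single list instead of typing them per client. This is only a sketch: the helper name `ignore_entries` and the idea of a shared baseline are assumptions made for illustration, and the output simply mirrors the `<ignore logfile="..." eventid="..." />` form shown in Rafal's excerpts.

```python
from xml.sax.saxutils import quoteattr

def ignore_entries(event_ids_by_log):
    """Build BBWin-style <ignore> lines from {logfile: [event ids]}."""
    lines = []
    for logfile, event_ids in event_ids_by_log.items():
        for event_id in event_ids:
            # quoteattr() adds the surrounding quotes and escapes the value.
            lines.append("<ignore logfile=%s eventid=%s />"
                         % (quoteattr(logfile), quoteattr(str(event_id))))
    return lines

# Event IDs that appear in both the Server1 and Server2 excerpts.
baseline = {"System": [1111, 16], "Application": [1000]}

for line in ignore_entries(baseline):
    print(line)
```

Server-specific IDs can then be kept in a second dictionary and appended to the baseline for each host.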
-----Original Message----- From: Kauffman, Tom [mailto:KauffmanT at nibco.com] Sent: Thursday, October 09, 2008 10:31 AM To: hobbit at hswn.dk Subject: [hobbit] Looking for sample BBWIN configs for filtering Windows event logs
We put the column in and we let it alert, but we have the alerts file
set up so that a msgs alert never pages anybody... it just sends email.
We've commented a few things out but mostly we've looked into the things
that were complained about and fixed or shut off the source of the
complaint. It's helped us clean up the boxes quite a bit -- there were
services running that did not need to be.
Jon
Here's our typical list:
<ignore logfile="System" eventid="2" />
<ignore logfile="System" eventid="3" />
<ignore logfile="System" eventid="4" />
<ignore logfile="System" eventid="8" />
<ignore logfile="System" eventid="1106" />
<ignore logfile="System" eventid="1111" />
<ignore logfile="Application" eventid="3033" />
<ignore logfile="Application" eventid="2003" />
ID 3033 is an Exchange message relating to Windows Mobile clients, but because Exchange was the first server I converted to BBWin from Big Brother, it's ended up on all of the systems. ID 2003 is related to performance counters. It's probably possible to fix, but my focus is not so much on the Windows infrastructure.
The rest are the annoying printer driver entries that you get when you log into a machine via Remote Desktop and are forwarding printers but don't have drivers on the system. I tried for a long time to get people to turn off printer forwarding, because I could never get Big Brother to stop alarming, but nobody listened. Hobbit/BBWin has been a lifesaver in this respect. With a little more work, we will be able to soon include the NOC in all alarms. With Big Brother, msgs was a flood of crap and would have overwhelmed them.
I have a question that's really more suited for the BBWin mailing list, but I've asked it there and gotten no response: Does anyone have a complete server-side configuration example for BBWin clients, showing how to handle all aspects of the client configuration?
Thanks, Shawn
I have very chatty Windows boxes as well. Where do you place these lists? Which file?
-Gavin
On Thu, Oct 9, 2008 at 10:54 AM, Shawn Heisey <hobbit at elyograg.org> wrote:
I have a question that's really more suited for the BBWin mailing list, but I've asked it there and gotten no response: Does anyone have a complete server-side configuration example for BBWin clients, showing how to handle all aspects of the client configuration?
This is the one that I am using. I still have some cleanup to do on it though....
###########################################################
# The defaults used by the Hobbit clients
###########################################################
DEFAULT
        UP 30m
        DISK * 90 95
        SWAP 85 90
        MEMPHYS 100 101
        MEMSWAP 90 95
        MEMACT 90 97
        CLOCK 30

###########################################################
# Windows Based Systems - Central Config Mode
###########################################################
CLASS=%win32* EXHOST=server1,server2
        LOAD 80 90 # Load thresholds are in %
        PROC svchost.exe 2 -1
        PROC %[mM]cshield.exe 1 -1
        PROC nserver.exe 1 -1
        PROC nrouter.exe 1 -1
        LOG %.* %.*error.* COLOR=red IGNORE=%(BigBrotherHobbitClient|SnapDrive|WinVNC4|TermDD|SV-GSX|TermServDevices|Perflib|PerfNet)
So far it's worked out pretty well as my default setting... After the Default section and before the generic section above I have my system-specific entries...
-- --==[ Bob Gordon ]==--
It looks like the ignore section only uses text matches, in this case regular expressions, right? That would mean it can't match on event ID unless I encode something like "Print (8)" in a regular expression format.
Not that this is a huge problem, but having a nice clean field like event ID is one of the good things about BBWin's local config mode. I'm just tired of having to remote into the client to change something, especially when I have to do it on more than one client.
Thanks for the info! Only one more thing I'd want - do you have any examples of centrally defined service monitoring?
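The idea of encoding an event ID in the IGNORE expression can be tried out offline before touching the server config. The sketch below uses Python's `re` module as a stand-in for the regex engine Hobbit uses, and the sample log lines are hypothetical: the real layout of the event text BBWin sends may differ, so check an actual msgs status first. Note that the parentheses in "Print (8)" have to be escaped, otherwise they are read as a regex group.

```python
import re

# Bob's IGNORE expression, without the leading "%" that marks a regex in the config.
ignore_sources = re.compile(
    r"(BigBrotherHobbitClient|SnapDrive|WinVNC4|TermDD|SV-GSX"
    r"|TermServDevices|Perflib|PerfNet)"
)

# Shawn's idea: match source plus event ID as literal text.
ignore_print_8 = re.compile(r"Print \(8\)")

# Hypothetical log lines, for illustration only.
lines = [
    "error - System - Print (8) - printer driver unknown",
    "error - Application - Perflib (2003) - counter unavailable",
    "error - Application - MyApp (500) - something failed",
]

kept = [
    line for line in lines
    if not ignore_sources.search(line) and not ignore_print_8.search(line)
]
print(kept)

# Unescaped, "(8)" is a group matching just "8", so the literal text is missed.
unescaped_hit = re.search(r"Print (8)", "Print (8)")
```

Only the MyApp line survives the two filters, which is the behaviour you would want from an equivalent IGNORE setting.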
In my case I found it easier to match based on the text rather than the ID. You should be able to match on the ID rather than the text though...
The entries that I am doing service monitoring on have entries similar to these:
SVC "RFBOARD" startup=manual status=started color=red
SVC "RFDB" startup=manual status=started color=red
Regards,
-- --==[ Bob Gordon ]==--
This is going to be so incredibly helpful. Do you happen to know if the uptime "yellow alarm" value can be centrally controlled? I've got some Windows machines that have been up for more than 1000 days, so BBWin reports yellow. I haven't bothered with changing the value because right now I'd have to do it on dozens of machines individually.
The systems with high uptimes are in continuous production, so I haven't been able to convince the integration folks to install updates and get them rebooted. They're firmly behind firewalls, so I am not SUPER concerned about their security.
Shawn, you can configure Terminal Services to turn off printer redirection. It can be done in a group policy object (GPO) or it can be done to each server individually.
See this link for a more visual discussion on the subject: http://blogs.technet.com/askperf/archive/2007/08/24/terminal-server-and-prin...
Ray
Sometimes we do actually want to be able to print from the remote machine to a local printer, so I have left the capability there.
Bob,
FWIW.
In the server/etc/hobbit-clients.cfg file, the last few lines of the comments at the top say:
# The special DEFAULT section can modify the built-in defaults - this must
# be placed at the end of the file.
Robert
-- Robert P. McGraw, Jr. Manager, Computer System EMAIL: rmcgraw at purdue.edu Purdue University ROOM: MATH-807 Department of Mathematics PHONE: (765) 494-6055 150 N. University Street West Lafayette, IN 47907-2067
Thanks for pointing that out (wonder when it changed).. The config file I have has been used since the first 4.x version when it finished with:
Rules are evaluated from the top of this file and down, and the first
matching rule is used. So you should put the specific rules first, and
the generic rules last.
EXHOST=%^pto\.linuxbog\.dk|\.sslug\.dk|^bb-mws.csc.dk|sarge|trantor|postcode
EXSERVICE=dnsinfo,dnsreg
So far it seems to be working but I will go through and double check to make sure it actually is... ;)
-- --==[ Bob Gordon ]==--
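The "first matching rule is used" behaviour quoted from the config comments is why ordering matters. The sketch below is a deliberately simplified model of that statement, not Hobbit's actual implementation: the host names, the `thresholds_for` helper and the 95/98 thresholds are hypothetical, while 90/95 mirrors the DISK line in Bob's DEFAULT section.

```python
import re

# Simplified model: (host pattern, settings), checked from the top down.
specific_first = [
    (r"^db\d+$", {"DISK": (95, 98)}),  # specific rule for hypothetical db hosts
    (r".*",      {"DISK": (90, 95)}),  # generic catch-all
]

def thresholds_for(host, rules):
    """Return the settings of the first rule whose pattern matches `host`."""
    for pattern, settings in rules:
        if re.search(pattern, host):
            return settings
    return None

# With the generic rule placed first, it shadows the specific one.
generic_first = list(reversed(specific_first))

print(thresholds_for("db1", specific_first))  # the specific rule applies
print(thresholds_for("db1", generic_first))   # the catch-all wins instead
```

Under this model a catch-all placed near the top swallows every host, which matches the advice to put specific rules first and generic ones last.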
participants (11)
- elyograg@elyograg.org
- gleonard@progrexion.com
- henrik@hswn.dk
- hobbit@elyograg.org
- jon@shadowsoft.com
- KauffmanT@nibco.com
- Rafal.Roginela@AmeriCashLoans.net
- rgordonjr@gmail.com
- rmcgraw@purdue.edu
- storerr@nibco.com
- Vernon.Everett@woodside.com.au