*Very* helpful, JC.
Thx!
david

~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~
David Mills
Systems Administrator
Northrop Grumman
(512) 873-6665
From: cleaver at terabithia.org [cleaver at terabithia.org]
Sent: Tuesday, July 09, 2013 1:48 PM
To: Mills, David (IS)
Cc: xymon at xymon.com
Subject: EXT :Re: [Xymon] xymond_filestore: crashed but still working (too well?)
All --
Am turning on the "xymond_filestore" worker module for the first time and have a couple of odd things (to my mind) happening:
- The status icon in Xymon ("xymond_filestore") is red, and the details page says "Program crashed / Fatal signal caught!". Found the core file under ~xymon/server/tmp/ and pstack shows:
core '/home/hobbit/xymon/server/tmp/core' of 16895: xymond_filestore --data --debug
 ff13ebd4 _lwp_kill (6, 0, 0, ff11e0f0, ffffffff, 6) + 8
 ff0b29f0 abort (0, 1, 42b94, ffb04, ff1b5518, 0) + 110
 0001f170 sigsegv_handler (b, 0, ffbfc830, 1, 0, 0) + 30
 ff13b00c __sighndlr (b, 0, ffbfc830, 1f140, 0, 1) + c
 ff12f6bc call_user_handler (b, 0, 0, 0, ff302a00, ffbfc830) + 3b8
 ff12f8a4 sigacthandler (b, 0, ffbfc830, ffffffff, 0, 0) + 60
 --- called from signal handler with signal 11 (SIGSEGV) ---
 ff11f858 fwrite (4ba93, 16d, 1, 0, ff0000, 80808080) + 8
 00013c08 update_file (ffbfd4d0, 0, 4ba93, 0, 0, ffffffff) + 78
 00014a1c main (2c000, 4ba55, 4, 0, 0, ffbfd4d0) + 99c
 00013a28 _start (0, 0, 0, 0, 0, 0) + 5c
(Running v. 4.3.3 / Solaris 10, FWIW)
Yikes. IIRC there were some changes in this code over time; I'd be curious if this works ok with 4.3.11.
- I turned on xymond_filestore because I'm writing a server-side script to analyze incoming data and assign status. I imagined I would *only* get files under $XYMONDATADIR sent from my custom client script, but instead I'm getting zillions (actually a few thousand) showing up in that dir. Can someone please explain why I'm getting all these client "data" files in this dir?
You'll actually be getting a copy of all data messages coming through on the channel. xymond_client takes the incoming client message and turns it into both status *and* data messages. The status messages are what you see as test results, but the data messages include things like parsed vmstat/ifstat data. The reason xymond_rrd is configured to run twice (once listening to the status channel, once to the data channel) is so that it can pick up both feeds and make pretty RRDs out of -- say -- bandwidth in the trends page.
More to my point -- is there any way to restrict this dir so that only the data files I want appear in it?
I think xymond_filestore's "--only=test[,test,test]" filter might work for data messages, but I'm not certain. Another option is to use --filter= at the xymond_channel level, but it depends on how fine-tuned your requirements are (that option is a simple regex include).
If the box you're on has a lot of RAM, I've found using a tmpfs for the filestore /data directory can be quite convenient. If you're doing background analysis of data messages, it's sometimes easier to grep a dir than to create another xymond_channel listening script and wait.
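To make the "grep a dir" idea concrete, here is a rough Python sketch of a server-side scan over the filestore data directory. It assumes each file is named "<host>.<test>", which is worth checking against what actually lands in your dir; the function name and its arguments are made up for illustration and are not part of Xymon.

```python
import re
from pathlib import Path


def grep_data_files(data_dir, test_name, pattern):
    """Return (file name, line number, line) for every matching line.

    ASSUMPTION: files in data_dir are named "<host>.<test>". Verify this
    against the files your xymond_filestore actually writes.
    """
    regex = re.compile(pattern)
    suffix = "." + test_name
    hits = []
    for path in sorted(Path(data_dir).iterdir()):
        # Skip subdirectories and files that belong to other tests.
        if not (path.is_file() and path.name.endswith(suffix)):
            continue
        # errors="replace" keeps the scan going if a file has odd bytes.
        text = path.read_text(errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path.name, lineno, line))
    return hits
```

You would call it with your own directory and test name, e.g. grep_data_files("/path/to/data", "mytest", r"error"), where both the path and "mytest" are placeholders.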
HTH,
-jc