On Mon, 2013-07-22 at 12:21 +0100, John Horne wrote:
I've started xymond_hostdata with the '--debug' option, but am not seeing the '@@drophost' command being received. The main xymond log sees the actual 'drop jhvm2' command though ('jhvm2' is the test host name). So I am currently testing the above bit of code to see that it is actually being reached.
Well, I'm a bit stumped by this. I have added several dbgprintf statements (whose output begins with 'JH:') to both xymond.c and xymond_hostdata.c. I also modified tasks.cfg so that xymond and xymond_hostdata are started with the '--debug' option. The xymond log shows the command being received and posted to the xymon channels. However, hostdata.log does not show it being received.
From xymond.log:
====================================
29942 2013-07-22 18:32:59 -> do_message/1 (10 bytes): drop jhvm2
29942 2013-07-22 18:32:59 -> update_statistics
29942 2013-07-22 18:32:59 <- update_statistics
29942 2013-07-22 18:32:59 -> oksender
29942 2013-07-22 18:32:59 <- oksender(1-a)
29942 2013-07-22 18:32:59 -> handle_dropnrename
29942 2013-07-22 18:32:59 JH: In handle_dropnrename: host is jhvm2
29942 2013-07-22 18:32:59 JH: About to call posttochannel: statuschn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#195/*|1374514379.136454|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 195 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: stachgchn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#195/*|1374514379.136550|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 195 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: pagechn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#1/*|1374514379.136624|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 1 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: datachn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#22/*|1374514379.136739|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 22 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: noteschn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 Dropping message - no readers
29942 2013-07-22 18:32:59 JH: About to call posttochannel: enadischn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 Dropping message - no readers
29942 2013-07-22 18:32:59 JH: About to call posttochannel: clientchn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#4/*|1374514379.136890|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 4 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
====================================

Basically this shows 'handle_dropnrename' being called with the host name 'jhvm2'. It then calls 'posttochannel' for each channel, where the message is either dropped if there are no readers, or is sent on with the '@@drophost... jhvm2' command.
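For anyone wanting to check a longer debug log the same way, here is a rough Python sketch that tallies, per channel, whether the message was posted or dropped. It only understands the line shapes seen in the excerpt above, and it relies on the custom 'JH:' dbgprintf lines, which stock xymond does not emit. The field labels in parse_command ("seq", "target") are my guesses from the log, not names taken from the Xymon source.

```python
import re

# Line shapes copied from the xymond.log excerpt; the "JH:" lines come from
# locally added dbgprintf calls, so this will not work on a stock log.
CALL_RE = re.compile(r"JH: About to call posttochannel: (\w+)")
CMD_RE = re.compile(r"command is: (@@\S+)")
POST_RE = re.compile(r"Posting message (\d+) to (\d+) readers")
DROP_RE = re.compile(r"Dropping message - no readers")

# A few lines taken verbatim from the excerpt, used as sample input.
SAMPLE = [
    "29942 2013-07-22 18:32:59 JH: About to call posttochannel: datachn",
    "29942 2013-07-22 18:32:59 -> posttochannel",
    "29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg",
    "29942 2013-07-22 18:32:59 JH: In posttochannel: command is: "
    "@@drophost#22/*|1374514379.136739|141.163.66.133|jhvm2",
    "29942 2013-07-22 18:32:59 Posting message 22 to 1 readers",
    "29942 2013-07-22 18:32:59 <- posttochannel",
    "29942 2013-07-22 18:32:59 JH: About to call posttochannel: noteschn",
    "29942 2013-07-22 18:32:59 -> posttochannel",
    "29942 2013-07-22 18:32:59 Dropping message - no readers",
]


def parse_command(cmd):
    """Split '@@cmd#N/x|a|b|c' into parts. Key names are guesses, not Xymon's."""
    header, *fields = cmd.split("|")
    m = re.fullmatch(r"@@(\w+)#(\d+)/(.*)", header)
    if m is None:
        raise ValueError("unrecognised channel message: %r" % cmd)
    return {
        "command": m.group(1),
        "seq": int(m.group(2)),   # assumed to be the per-channel message number
        "target": m.group(3),     # whatever follows the '/', '*' in this log
        "fields": fields,         # remaining '|'-separated fields, uninterpreted
    }


def summarize(lines):
    """Return {channel: {'outcome', 'readers', 'command'}} in log order."""
    results = {}
    current = None
    for line in lines:
        m = CALL_RE.search(line)
        if m:
            current = m.group(1)
            results[current] = {"outcome": "unknown", "readers": 0, "command": None}
            continue
        if current is None:
            continue  # ignore anything before the first channel marker
        m = CMD_RE.search(line)
        if m:
            results[current]["command"] = parse_command(m.group(1))
            continue
        m = POST_RE.search(line)
        if m:
            results[current]["outcome"] = "posted"
            results[current]["readers"] = int(m.group(2))
            continue
        if DROP_RE.search(line):
            results[current]["outcome"] = "dropped"
    return results


if __name__ == "__main__":
    for channel, info in summarize(SAMPLE).items():
        print(channel, info["outcome"], info["readers"])
```

To run it against a real log, pass the open file instead of SAMPLE, e.g. summarize(open(path)) with path pointing at the xymond debug log. A channel reported as "posted" only means xymond believed it had a reader; it says nothing about whether the worker actually picked the message up.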
From hostdata.log:
====================================
21100 2013-07-22 18:07:59 JH: xymond_hostdata starting: clientlogdir is: /home/xymon/data/hostdata
21100 2013-07-22 18:07:59 Want msg 1, startpos 0, fillpos 0, endpos -1, usedbytes=0, bufleft=2101247
21100 2013-07-22 18:07:59 Got 44 bytes
21100 2013-07-22 18:07:59 xymond_hostdata: Got message 1 @@shutdown#1/*| 1374512879.482598|xymond|
21100 2013-07-22 18:07:59 startpos 44, fillpos 44, endpos -1
2013-07-22 18:31:16 Peer not up, flushing message queue
====================================

So the command is accepted by xymond and sent on, but not received by xymond_hostdata. Unfortunately (?) this IPC is controlled by semaphores, so working out why xymond_hostdata does not pick up the message may be difficult.

John.

-- 
John Horne, Plymouth University, UK
Tel: +44 (0)1752 587287 Fax: +44 (0)1752 587001