Hello,
Using Xymon 4.3.10 I wanted to remove a host completely, so used the following command on the Xymon server:
xymon xxx.xxx.xxx.xxx "drop lib-srvr10"
This removed all of the host data except for the 'lib-srvr10' subdirectory in the 'data/hostdata' directory. The subdirectory still contained all of the data files for the host.
I had already removed the host from the hosts.cfg file, and restarted Xymon. No errors were seen in any of the log files.
John.
-- John Horne Tel: +44 (0)1752 587287 Plymouth University, UK Fax: +44 (0)1752 587001
Hmm. Just ran this myself on the occasion of having a host to drop and it seemed to be working okay. Worked both for a host I was dropping "live" and one that had already been deleted a restart or two ago; in both cases the directory under hostdata/* was removed properly.
Is this still occurring for you? Had it happened before?
Regards,
-jc
On Wed, 2013-07-17 at 00:38 +0000, cleaver at terabithia.org wrote:
Is this still occurring for you? Had it happened before?
I dropped two hosts and neither had their directory removed under hostdata.
I think it has happened before because when it happened this time I thought I had already reported it as a bug. I couldn't find any record of that though, hence the report this time.
John.
-- John Horne Tel: +44 (0)1752 587287 Plymouth University, UK Fax: +44 (0)1752 587001
I can probably test this with some test servers. However, where does the deletion of 'hostdata' directories occur? I'll add some logging/debugging to see if I can see what is going on.
John.
-- John Horne, Plymouth University, UK Tel: +44 (0)1752 587287 Fax: +44 (0)1752 587001
The deletion is handled by xymond_hostdata.c when it gets a @@drophost message (which gets sent to all channels by xymond and bypasses xymond_channel filters). It's around 263 in the .c code in my copy.
== snip ==
else if ((metacount > 3) && (strncmp(metadata[0], "@@drophost", 10) == 0)) {
        /* @@drophost|timestamp|sender|hostname */
        char hostdir[PATH_MAX];
        sprintf(hostdir, "%s/%s", clientlogdir, metadata[3]);
        dropdirectory(hostdir, 1);
}
== snip ==
At the risk of stating the obvious :) , did you double-check owner and permissions on the dir?
-jc
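For readers following along, here is a minimal standalone sketch of what the quoted branch does: split a '|'-delimited channel header into fields, check for the '@@drophost' prefix, and build the per-host directory path. It is not the Xymon source. The helper names 'split_fields' and 'drophost_dir' are hypothetical, and the bounded snprintf and the hostname sanity checks are additions made for this illustration only.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Split a '|'-delimited header in place; returns the number of fields.
 * Hypothetical helper, not the parser xymond_hostdata actually uses. */
int split_fields(char *msg, char *fields[], int maxfields)
{
    int count = 0;
    char *p = msg;

    while (count < maxfields) {
        fields[count++] = p;
        p = strchr(p, '|');
        if (p == NULL) break;
        *p++ = '\0';
    }
    return count;
}

/* Build "<basedir>/<hostname>" from a @@drophost header.
 * Returns 0 on success, -1 if the message is not a usable drophost. */
int drophost_dir(char *msg, const char *basedir, char *out, size_t outsz)
{
    char *fields[8];
    int n = split_fields(msg, fields, 8);
    int len;

    /* fields[0] has a "#<seq>/..." suffix after the keyword,
     * so only the 10-character prefix is compared. */
    if (n <= 3 || strncmp(fields[0], "@@drophost", 10) != 0) return -1;

    /* Assumption for this sketch: reject empty names and anything
     * that could point outside basedir. */
    if (fields[3][0] == '\0' || strchr(fields[3], '/') != NULL ||
        strcmp(fields[3], ".") == 0 || strcmp(fields[3], "..") == 0) return -1;

    /* Bounded formatting; fail rather than truncate the path. */
    len = snprintf(out, outsz, "%s/%s", basedir, fields[3]);
    return (len < 0 || (size_t)len >= outsz) ? -1 : 0;
}
```

Given a header such as the '@@drophost#195/*|...|jhvm2' lines that appear in debug output, 'drophost_dir' would produce the directory that the real code hands to dropdirectory(). If that branch is never reached, the directory is never touched, whatever its ownership or permissions.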
On Sat, 2013-07-20 at 19:10 +0000, cleaver at terabithia.org wrote:
At the risk of stating the obvious :) , did you double-check owner and permissions on the dir?
Hello,
Yes I rechecked the ownership and permissions on the directory path and filenames. They are fine :-)
I've started xymond_hostdata with the '--debug' option, but am not seeing the '@@drophost' command being received. The main xymond log sees the actual 'drop jhvm2' command though ('jhvm2' is the test host name). So I am currently testing the above bit of code to see that it is actually being reached.
John.
-- John Horne Tel: +44 (0)1752 587287 Plymouth University, UK Fax: +44 (0)1752 587001
On Mon, 2013-07-22 at 12:21 +0100, John Horne wrote:
I've started xymond_hostdata with the '--debug' option, but am not seeing the '@@drophost' command being received. The main xymond log sees the actual 'drop jhvm2' command though ('jhvm2' is the test host name). So I am currently testing the above bit of code to see that it is actually being reached.
Well I'm a bit stumped with this.
I have added several dbgprintf statements (which begin with 'JH:') to both xymond.c and xymond_hostdata.c. I also modified tasks.cfg so that xymond and xymond_hostdata started with the '--debug' option.
The log files show the command being received and sent to the xymon channels. However, the hostdata.log does not show it being received.
From xymond.log:
====================================
29942 2013-07-22 18:32:59 -> do_message/1 (10 bytes): drop jhvm2
29942 2013-07-22 18:32:59 -> update_statistics
29942 2013-07-22 18:32:59 <- update_statistics
29942 2013-07-22 18:32:59 -> oksender
29942 2013-07-22 18:32:59 <- oksender(1-a)
29942 2013-07-22 18:32:59 -> handle_dropnrename
29942 2013-07-22 18:32:59 JH: In handle_dropnrename: host is jhvm2
29942 2013-07-22 18:32:59 JH: About to call posttochannel: statuschn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#195/*|1374514379.136454|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 195 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: stachgchn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#195/*|1374514379.136550|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 195 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: pagechn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#1/*|1374514379.136624|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 1 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: datachn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#22/*|1374514379.136739|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 22 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
29942 2013-07-22 18:32:59 JH: About to call posttochannel: noteschn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 Dropping message - no readers
29942 2013-07-22 18:32:59 JH: About to call posttochannel: enadischn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 Dropping message - no readers
29942 2013-07-22 18:32:59 JH: About to call posttochannel: clientchn
29942 2013-07-22 18:32:59 -> posttochannel
29942 2013-07-22 18:32:59 JH: In posttochannel: readymsg
29942 2013-07-22 18:32:59 JH: In posttochannel: command is: @@drophost#4/*|1374514379.136890|141.163.66.133|jhvm2
29942 2013-07-22 18:32:59 Posting message 4 to 1 readers
29942 2013-07-22 18:32:59 <- posttochannel
====================================
Basically this shows 'dropnrename' being called with the host name 'jhvm2'. It then calls 'posttochannel' for each channel, where the message is either dropped if there are no readers, or is sent on with the '@@drophost... jhvm2' command.
From hostdata.log:
====================================
21100 2013-07-22 18:07:59 JH: xymond_hostdata starting: clientlogdir is: /home/xymon/data/hostdata
21100 2013-07-22 18:07:59 Want msg 1, startpos 0, fillpos 0, endpos -1, usedbytes=0, bufleft=2101247
21100 2013-07-22 18:07:59 Got 44 bytes
21100 2013-07-22 18:07:59 xymond_hostdata: Got message 1 @@shutdown#1/*|1374512879.482598|xymond|
21100 2013-07-22 18:07:59 startpos 44, fillpos 44, endpos -1
2013-07-22 18:31:16 Peer not up, flushing message queue
====================================
So the command is accepted by xymond and sent on, but not received by xymond_hostdata.
Unfortunately (?) this IPC is controlled by semaphores, so seeing why xymond_hostdata does not pick up the message may be difficult.
John.
-- John Horne, Plymouth University, UK Tel: +44 (0)1752 587287 Fax: +44 (0)1752 587001
A ha! It all makes sense now... :) The root of this is actually a known issue: drop commands not getting sent to the CLICHG channel, which is why it only shows up here and not, say, xymond_rrd... xymond_hostdata is the only built-in that uses this one.
It's patched in the Terabithia RPMs from last month, but it's not in the release tarball yet. The attached file should resolve it for you; can you verify?
HTH,
-jc
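To make that failure mode concrete, here is a toy model of broadcasting a drop notification to a fixed list of channels. It is a sketch under stated assumptions and not xymond's code. The struct and both functions are hypothetical, and the name 'clichgchn' for the CLICHG channel is assumed by analogy with the channel names John's debug log prints.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a channel: a name plus a delivery counter.
 * Hypothetical; xymond's real channel structures are different. */
struct channel {
    const char *name;
    int delivered;
};

/* Post one drop notification to every channel in the list.
 * Returns the number of channels posted to. */
int broadcast_drop(struct channel *chans, size_t n)
{
    size_t i;
    int posted = 0;

    for (i = 0; i < n; i++) {
        chans[i].delivered++;
        posted++;
    }
    return posted;
}

/* Returns 1 if the named channel received at least one message. */
int channel_got_drop(const struct channel *chans, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(chans[i].name, name) == 0) return chans[i].delivered > 0;
    }
    return 0; /* the channel was never in the broadcast list */
}
```

If the broadcast list holds only the seven channels named in the xymond.log excerpt, 'channel_got_drop' reports 0 for 'clichgchn' however many times the drop is sent. A worker attached only to that channel then never sees '@@drophost', which would match the symptom reported for xymond_hostdata.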
On Mon, 2013-07-22 at 18:40 +0000, cleaver at terabithia.org wrote:
It's patched in the Terabithia RPMs from last month, but it's not in the release tarball yet. The attached file should resolve it for you; can you verify?
Hello,
Yes that patch works fine :-) The hostdata log now shows the directory (and files) being deleted (and I checked that they had actually been deleted from the disk).
Many thanks for your help.
John.
-- John Horne, Plymouth University, UK Tel: +44 (0)1752 587287 Fax: +44 (0)1752 587001
On 22-07-2013 22:04, John Horne wrote:
Yes that patch works fine :-) The hostdata log now shows the directory (and files) being deleted (and I checked that they had actually been deleted from the disk).
I've just added this for 4.3.12.
Regards, Henrik
participants (3)
- cleaver@terabithia.org
- henrik@hswn.dk
- john.horne@plymouth.ac.uk