Histlog file format anomaly - epochtimestamps
Hi y'all
We've setup some new RHEL 7.9 Xymon servers running 4.3.30-1 from the Terabithia repo. Everything seems to work fine except that the event history list ("last 50 events") has links that use the epoch time format for the individual events, but the files in $XYMONHISTLOGS directories contain files in the old format such as Fri_Aug_25_09:59:03_2023. When I click on an event dot, I'm given a webpage with the message "Historical status log not available". If I rename/symlink the file to the epoch time format (eg 1692921543) then it displays just fine.
I'm aware of a patch in the Terabithia packages that change the filename format. But reading through the patch (from the SRPM), it looks like it's supposed to handle both old and new formats just fine. One of the things the patch provides is the ability to override the filename format using either the "--epochtimestamps" switch or setting the "EPOCHIST" variable, to (I think) tell "xymond_history" which format to use when creating new history logfiles. Without these, the code should assume write format, but read either format.
So, a couple of things here:
With the patch in place, new history log files should be created using the new filename format. But they're being created with the old format.
With the patch in place, existing history log files with filenames in the old format should still be readable by the CGI scripts. But they history.sh file is failing when it can't find filenames in the new format.
Running strings on the Xymon binaries doesn't show the strings "epochtimestamps" or "EPOCHHIST". So I don't think the patch has been fully applied. And I don't think I can use the --epochtimestamps option to fix it.
So I'm not sure if I've done something fundamentally wrong here (probably), or (less likely) if the patch in the SRPM has not cleanly applied when the the RPM was built. Does anyone else have the Terabithia 4.3.30-1 package installed on RHELv7.9, who can tell me what format your history status files (under /var/lib/xymon/histlog/<hostname>/<testname>) are in? And can tell me if the history events display correctly?
J
On Thu, August 24, 2023 19:42, Jeremy Laidman wrote:
Hi y'all
We've setup some new RHEL 7.9 Xymon servers running 4.3.30-1 from the Terabithia repo. Everything seems to work fine except that the event history list ("last 50 events") has links that use the epoch time format for the individual events, but the files in $XYMONHISTLOGS directories contain files in the old format such as Fri_Aug_25_09:59:03_2023. When I click on an event dot, I'm given a webpage with the message "Historical status log not available". If I rename/symlink the file to the epoch time format (eg 1692921543) then it displays just fine.
I had to double-check this one, but I can confirm that on Terabithia (w/o any further configuration) the links generated look something like: http://127.0.0.1:2180/xymon-cgi/historylog.sh?HOST=trampoline.local&SERVICE=...
However the files in the directory are: [root at trampoline histlogs]# find -name Fri_Aug_25_00:55:58_2023 ./trampoline_local/cpu/Fri_Aug_25_00:55:58_2023 ./trampoline_local/disk/Fri_Aug_25_00:55:58_2023 ./trampoline_local/inode/Fri_Aug_25_00:55:58_2023 ./trampoline_local/procs/Fri_Aug_25_00:55:58_2023 ./trampoline_local/ports/Fri_Aug_25_00:55:58_2023 ./trampoline_local/msgs/Fri_Aug_25_00:55:58_2023 ./trampoline_local/files/Fri_Aug_25_00:55:58_2023 ./trampoline_local/memory/Fri_Aug_25_00:55:58_2023
And they are being pulled up correctly.
After adding --epochtimestamps to the xymond_history CMD: [history] ENVFILE /etc/xymon/xymonserver.cfg NEEDS xymond CMD xymond_channel --channel=stachg xymond_history --epochtimestamps LOGFILE $XYMONSERVERLOGS/history.log PIDFILE $XYMONRUNDIR/xymond_history.pid
We are now storing new records in epoch:
[root at trampoline histlogs]# ls -lt trampoline_local/memory/ total 20 -rw-r--r--. 1 xymon xymon 374 Aug 25 02:12 1692954767 -rw-r--r--. 1 xymon xymon 157 Aug 25 02:12 1692954733 -rw-r--r--. 1 xymon xymon 374 Aug 25 00:55 Fri_Aug_25_00:55:58_2023 -rw-r--r--. 1 xymon xymon 374 Aug 24 15:38 Thu_Aug_24_15:38:30_2023 -rw-r--r--. 1 xymon xymon 374 Aug 24 12:37 Thu_Aug_24_12:37:39_2023
However the main event and event history pages still seem to pull both up fine.
I'm aware of a patch in the Terabithia packages that change the filename format. But reading through the patch (from the SRPM), it looks like it's supposed to handle both old and new formats just fine. One of the things the patch provides is the ability to override the filename format using either the "--epochtimestamps" switch or setting the "EPOCHIST" variable, to (I think) tell "xymond_history" which format to use when creating new history logfiles. Without these, the code should assume write format, but read either format.
Correct. The intent was to use epoch for all *links* now, but not to adjust the default output file pattern unless flagged to. (This default would change in 4.4.) And yes, it needs to be able to read both for sure.
So, a couple of things here:
With the patch in place, new history log files should be created using the new filename format. But they're being created with the old format.
With the patch in place, existing history log files with filenames in the old format should still be readable by the CGI scripts. But they history.sh file is failing when it can't find filenames in the new format.
Running strings on the Xymon binaries doesn't show the strings "epochtimestamps" or "EPOCHHIST". So I don't think the patch has been fully applied. And I don't think I can use the --epochtimestamps option to fix it.
The only binary that I think the string will end up in is xymond_history itself. The various lib/*.c sources are only referenced there:
[root at trampoline histlogs]# grep epochtimestamps /usr/libexec/xymon/xymond_history Binary file /usr/libexec/xymon/xymond_history matches
If that's not showing up, then the patch definitely isn't applied.
I'm a little perplexed at what could be going on here, though. It's possible there is a missed bug in here somehow, but this functionality was added quite a while back.
The biggest drawback for the histlog format (in my mind) was lack of TZ data in it. Is it possible the epoch is not being converted correctly back somehow I wonder?
-jc
So I'm not sure if I've done something fundamentally wrong here (probably), or (less likely) if the patch in the SRPM has not cleanly applied when the the RPM was built. Does anyone else have the Terabithia 4.3.30-1 package installed on RHELv7.9, who can tell me what format your history status files (under /var/lib/xymon/histlog/<hostname>/<testname>) are in? And can tell me if the history events display correctly?
J
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
Hi, I am sure JC has already thought of this. But remember that the history-log clean up utility must also work with the new naming convention.
Regards, Henrik
Den 25. aug. 2023 kl. 10.23 skrev J.C. Cleaver <cleaver at terabithia.org>:
?
On Thu, August 24, 2023 19:42, Jeremy Laidman wrote: Hi y'all
We've setup some new RHEL 7.9 Xymon servers running 4.3.30-1 from the Terabithia repo. Everything seems to work fine except that the event history list ("last 50 events") has links that use the epoch time format for the individual events, but the files in $XYMONHISTLOGS directories contain files in the old format such as Fri_Aug_25_09:59:03_2023. When I click on an event dot, I'm given a webpage with the message "Historical status log not available". If I rename/symlink the file to the epoch time format (eg 1692921543) then it displays just fine.
I had to double-check this one, but I can confirm that on Terabithia (w/o any further configuration) the links generated look something like: http://127.0.0.1:2180/xymon-cgi/historylog.sh?HOST=trampoline.local&SERVICE=...
However the files in the directory are: [root at trampoline histlogs]# find -name Fri_Aug_25_00:55:58_2023 ./trampoline_local/cpu/Fri_Aug_25_00:55:58_2023 ./trampoline_local/disk/Fri_Aug_25_00:55:58_2023 ./trampoline_local/inode/Fri_Aug_25_00:55:58_2023 ./trampoline_local/procs/Fri_Aug_25_00:55:58_2023 ./trampoline_local/ports/Fri_Aug_25_00:55:58_2023 ./trampoline_local/msgs/Fri_Aug_25_00:55:58_2023 ./trampoline_local/files/Fri_Aug_25_00:55:58_2023 ./trampoline_local/memory/Fri_Aug_25_00:55:58_2023
And they are being pulled up correctly.
After adding --epochtimestamps to the xymond_history CMD: [history] ENVFILE /etc/xymon/xymonserver.cfg NEEDS xymond CMD xymond_channel --channel=stachg xymond_history --epochtimestamps LOGFILE $XYMONSERVERLOGS/history.log PIDFILE $XYMONRUNDIR/xymond_history.pid
We are now storing new records in epoch:
[root at trampoline histlogs]# ls -lt trampoline_local/memory/ total 20 -rw-r--r--. 1 xymon xymon 374 Aug 25 02:12 1692954767 -rw-r--r--. 1 xymon xymon 157 Aug 25 02:12 1692954733 -rw-r--r--. 1 xymon xymon 374 Aug 25 00:55 Fri_Aug_25_00:55:58_2023 -rw-r--r--. 1 xymon xymon 374 Aug 24 15:38 Thu_Aug_24_15:38:30_2023 -rw-r--r--. 1 xymon xymon 374 Aug 24 12:37 Thu_Aug_24_12:37:39_2023
However the main event and event history pages still seem to pull both up fine.
I'm aware of a patch in the Terabithia packages that change the filename format. But reading through the patch (from the SRPM), it looks like it's supposed to handle both old and new formats just fine. One of the things the patch provides is the ability to override the filename format using either the "--epochtimestamps" switch or setting the "EPOCHIST" variable, to (I think) tell "xymond_history" which format to use when creating new history logfiles. Without these, the code should assume write format, but read either format.
Correct. The intent was to use epoch for all *links* now, but not to adjust the default output file pattern unless flagged to. (This default would change in 4.4.) And yes, it needs to be able to read both for sure.
So, a couple of things here:
With the patch in place, new history log files should be created using the new filename format. But they're being created with the old format.
With the patch in place, existing history log files with filenames in the old format should still be readable by the CGI scripts. But they history.sh file is failing when it can't find filenames in the new format.
Running strings on the Xymon binaries doesn't show the strings "epochtimestamps" or "EPOCHHIST". So I don't think the patch has been fully applied. And I don't think I can use the --epochtimestamps option to fix it.
The only binary that I think the string will end up in is xymond_history itself. The various lib/*.c sources are only referenced there:
[root at trampoline histlogs]# grep epochtimestamps /usr/libexec/xymon/xymond_history Binary file /usr/libexec/xymon/xymond_history matches
If that's not showing up, then the patch definitely isn't applied.
I'm a little perplexed at what could be going on here, though. It's possible there is a missed bug in here somehow, but this functionality was added quite a while back.
The biggest drawback for the histlog format (in my mind) was lack of TZ data in it. Is it possible the epoch is not being converted correctly back somehow I wonder?
-jc
So I'm not sure if I've done something fundamentally wrong here (probably), or (less likely) if the patch in the SRPM has not cleanly applied when the the RPM was built. Does anyone else have the Terabithia 4.3.30-1 package installed on RHELv7.9, who can tell me what format your history status files (under /var/lib/xymon/histlog/<hostname>/<testname>) are in? And can tell me if the history events display correctly?
J
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
On Fri, August 25, 2023 03:28, Henrik St??rner wrote:
Hi, I am sure JC has already thought of this. But remember that the history-log clean up utility must also work with the new naming convention.
Regards, Henrik
Alas, I believe this was a miss of mine.
Fortunately, I believe https://sourceforge.net/p/xymon/code/HEAD/tree/branches/4.x-master/xymond/tr... prevents overtrimming for an "invalid" file name, but this does need to be added in properly. Thanks.
Jeremy: Might you be able to try adjusting a file name as +/- one hour? I'm wondering if DST calculations might be throwing the lookup off here.
Regards, -jc
On Sat, 26 Aug 2023 at 17:53, J.C. Cleaver <cleaver at terabithia.org> wrote:
Jeremy: Might you be able to try adjusting a file name as +/- one hour? I'm wondering if DST calculations might be throwing the lookup off here.
OK, so mystery solved. This was a self-inflicted injury. Sorry to waste peoples' time.
It turns out that some of the binaries installed from the Terabithia package had been replaced (but not by me) by binaries of the same code release (4.3.30) but compiled from source without any of the patches provided in the Terabithia .src.rpm file.
While I haven't yet reverted the binaries to the original, I have confirmed that this would likely fix the issue I was having. If I execute the original CGI binary (which was renamed to *.bak) with suitable QUERY_STRING then I get the correct results instead of the error message. And in particular, if I set the TIMEBUF to be the penultimate entry in the hist/<hostname>.<servicename> file, and run the svcstatus.cgi.bak file under strace, then I can see that it's trying the new format, and then reverting to the old format, in a way that matches the old-style filename in histlogs, which have names relative to my local timezone.
$ SCRIPT_NAME= REQUEST_METHOD=GET QUERY_STRING="HOST=$HOST&SERVICE=$SERVICE&TIMEBUF=$TIMEBUF" strace -f /usr/share/xymon/bin/svcstatus.cgi.bak --no-svcid --historical 2>&1 | grep histlogs stat("/var/lib/xymon/histlogs/<hostname-redacted>/procs/1693158425", 0x7ffcade28c80) = -1 ENOENT (No such file or directory) stat("/var/lib/xymon/histlogs/<hostname-redacted>/procs/Mon_Aug_28_03:47:05_2023", {st_mode=S_IFREG|0644, st_size=48102, ...}) = 0 open("/var/lib/xymon/histlogs/<hostname-redacted>/procs/Mon_Aug_28_03:47:05_2023", O_RDONLY) = 3
(Side note: if I don't define SCRIPT_NAME when running historylog.sh, I get a segfault in svcstatus.c/parse_query() line 154 at the SBUF_MALLOC.)
I wasn't aware of the substituted binary. Apparently, the reason for this was to work around a seg fault in do_request() that occurred when debugging was enabled. The work-around was to double the memory allocated for "clienturi" when history.sh creates its links to svcstatus.sh (from 1024 to 2048). I'll investigate this a bit more and might have a patch to apply one day soon...
Again, sorry for the false alarm, and thanks for helping me to narrow down the search.
J
participants (3)
-
cleaver@terabithia.org
-
henrik@hswn.dk
-
jeremy@laidman.org