Hi Jeremy
As much as I like the idea of modifying the code and recompiling, we need to remember, this is a production system. I get frowned upon when I add --debug just to create a core dump. :-(
I doubt changing source code and recompiling is going to get a green light.
As for your question about XYMONNETSVC, it's not a variable that's defined anywhere in my config. etc]# grep XYMONNETSVC * returns nothing.
Regards Vernon
On 11 March 2015 at 09:52, Jeremy Laidman <jlaidman at rebel-it.com.au> wrote:
On 11 March 2015 at 11:37, Vernon Everett <everett.vernon at gmail.com> wrote:
And even with --no-cache, I am still getting these corrupted rrd files.
:-(
I tried again with --debug (and --no-cache) and it core dumps.
Here's the backtrace.
libc.so.1`vfprintf+0xec(6c3d0, 514c0, ffbfb3e8, 0, a0ba4, 33e1c)
dbgprintf+0xa4(514c0, 0, 51400, 6c3f0, bf, 2ab388)
dump_tcp_services+0x74(a0, 1c00, fef37940, 0, 51400, 51400)
So dump_tcp_services() calls dbgprintf() (both in lib/netservices.c), which in turn calls vfprintf() from libc, but with bad parameters. I've had a look through the code in dump_tcp_services() and I don't know enough C to recognize any problems. But it might be useful to know which call to dbgprintf() is causing the problem.
Does the log file for xymond_rrd show any debug output at all? If so, what's the last line that is shown?
It might be helpful if you can recompile xymond_rrd with dump_tcp_services() modified. Initially, I would simply try it with "return" added after the first call to dbgprintf(). That is, dump_tcp_services() will output "Service list dump" and return. This might stop the core dumps so that we can get debug output for other parts of the xymond_rrd processing.
If adding "return" at that point fixes this core dump, more diagnostic lines would be useful to determine what the problem is. For example, there's a global array called svcinfo that is iterated over, but if the array is empty, it might cause the core dump. So adding a line that checks whether the array is empty and displays the result would help to pin this down.
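To make the idea concrete, here is a rough, self-contained sketch of the two diagnostic steps (an early return after the first debug line, then an explicit empty-list check). It is not the actual Xymon source: the struct, dbgprintf_sketch(), dump_services_sketch() and the header_only flag are hypothetical stand-ins, and I'm assuming the service list is terminated by an entry with a NULL name.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one service entry. The real structure in
 * Xymon has different and additional fields. */
typedef struct {
    const char *svcname;   /* assumed: NULL name marks the end of the list */
    int port;
} svcinfo_sketch_t;

/* Hypothetical stand-in for a debug printf wrapper around vfprintf(). */
static void dbgprintf_sketch(const char *fmt, ...)
{
    va_list args;

    assert(fmt != NULL);   /* a NULL format is one way to crash in vfprintf() */
    va_start(args, fmt);
    vfprintf(stdout, fmt, args);
    va_end(args);
    fflush(stdout);        /* flush so the last line survives a later crash */
}

/* Returns the number of entries dumped, or -1 if the list pointer is NULL.
 * With header_only set, it prints the header and returns 0 immediately,
 * which models the temporary "return" after the first debug call. */
int dump_services_sketch(const svcinfo_sketch_t *list, int header_only)
{
    int i;

    dbgprintf_sketch("Service list dump\n");
    if (header_only) return 0;             /* step 1: bail out early */

    if (list == NULL) {                    /* step 2: report a missing list */
        dbgprintf_sketch(" service list pointer is NULL\n");
        return -1;
    }
    if (list[0].svcname == NULL) {         /* step 2: report an empty list */
        dbgprintf_sketch(" service list is empty\n");
        return 0;
    }

    for (i = 0; list[i].svcname != NULL; i++) {
        dbgprintf_sketch(" Name: %s, port: %d\n", list[i].svcname, list[i].port);
    }
    return i;
}
```

If the crash goes away with the early return, the extra checks narrow down whether the list itself is missing or empty, or whether one particular entry is bad. Flushing after every debug line means the last line in the log is the last call that completed.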
Note that "svcinfo" appears to be populated from the protocols.cfg file and/or XYMONNETSVCS. Is it possible that your protocols.cfg file is empty, or has some syntax error that causes it to be unparseable? The same for XYMONNETSVCS (in xymonserver.cfg)?
J
-- "Accept the challenges so that you can feel the exhilaration of victory"
- General George Patton