On Tue, August 26, 2014 11:04 am, John Thurston wrote:
> I'm having difficulty with my RRD handlers crashing and leaving gaps in my databases.
> I mentioned this back in April 2014 but received no responses: http://lists.xymon.com/pipermail/xymon/2014-April/039547.html
> Since then, I found a client that was occasionally sending me empty messages of the "data" type. When I stopped it from doing so, the crashes stopped. The crashes have now returned, and I cannot figure out how to find the cause.
> I _suspect_ it is another client sending me empty messages, but how do I find it now that I have several hundred clients sending "data" messages?

As an initial step, run xymond_rrd in --debug mode. You can send the process a USR2 signal (kill -USR2 <pid>) to toggle this setting without bouncing the process itself; be sure you're signalling xymond_rrd and not its xymond_channel parent.
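
If it helps, here is a rough sketch of picking the right pid. It assumes a Linux host with /proc, and that xymond_channel is started with xymond_rrd as one of its arguments (as in a typical tasks.cfg), so a plain search of the command line for "xymond_rrd" would match the parent as well. The function names are mine, not anything shipped with Xymon; the script only checks that argv[0] is xymond_rrd before signalling.

```python
import os
import signal


def is_rrd_worker(argv):
    """True only when the program itself (argv[0]) is xymond_rrd.

    A xymond_channel parent that merely names xymond_rrd as a later
    argument is rejected.
    """
    return bool(argv) and os.path.basename(argv[0]) == "xymond_rrd"


def read_argv(pid, proc_root="/proc"):
    """Return a process's argument list from procfs, or [] if unreadable."""
    try:
        with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as fh:
            raw = fh.read()
    except OSError:
        return []  # process exited, or we lack permission
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


def find_rrd_workers(proc_root="/proc"):
    """List the pids whose own executable name is xymond_rrd."""
    pids = [int(name) for name in os.listdir(proc_root) if name.isdigit()]
    return [pid for pid in pids if is_rrd_worker(read_argv(pid, proc_root))]


def toggle_debug(proc_root="/proc"):
    """Send SIGUSR2 to every xymond_rrd worker found (run as the xymon user or root)."""
    for pid in find_rrd_workers(proc_root):
        os.kill(pid, signal.SIGUSR2)
        print(f"sent SIGUSR2 to xymond_rrd pid {pid}")
```

Calling find_rrd_workers() first to eyeball the pids, and only then toggle_debug(), is probably the safer order.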
Ideally, that will help you see exactly what it was processing before it went down. If it's a full-on crash (a segfault or something similar), a backtrace from the core file would be helpful for debugging.
HTH, -jc