On 8/28/2015 12:45 PM, John Thurston wrote:
On 6/10/2015 9:01 AM, Scot Kreienkamp wrote:
Hi everyone,
I have a Xymon server running 4.3.21 that seems to be accumulating processes like these:
hobbit 28430 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
hobbit 28435 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
hobbit 28440 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
hobbit 28444 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
hobbit 28449 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
hobbit 28452 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>
It seemed related to drop messages . . .
Hey, I think I'm seeing the same thing on Solaris with 4.3.21
I've ended up here after a customer let me know that email alerts were not working as expected. After a few hours of digging around, I concluded that the alert daemon was failing miserably because it could not retrieve hostnames.
Have other people seen this behavior?
I have duplicated this behavior on another Xymon server on Solaris. It certainly looks like this behavior breaks the alert daemon. Fortunately, I "drop" hosts in batches, so I can restart Xymon at that time, but this is still pretty icky.
J.C., do you know if your patch made it into the code-base?
Has anyone else tested this patch? If so, on what operating systems?
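For anyone comparing notes on whether a server is affected, here is a rough sketch for counting the defunct processes shown in the listing above. It is not part of Xymon. The function names are made up for illustration, and it assumes `ps aux`-style columns (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND), as in the quoted output. The stock Solaris `ps` may need different options, so the parsing is kept separate from the call to `ps`.

```python
import subprocess


def count_defunct(ps_text, name="xymond_hostdata"):
    """Count zombie (state Z) entries for `name` in `ps aux`-style text."""
    count = 0
    for line in ps_text.splitlines():
        # Split into the 10 leading columns plus the full command string.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # header fragments or short lines
        stat, command = fields[7], fields[10]
        if stat.startswith("Z") and name in command:
            count += 1
    return count


def current_ps_output():
    # Assumption: this ps accepts BSD-style "aux" (true for Linux procps).
    result = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Two lines copied from the listing earlier in this thread.
SAMPLE = (
    "hobbit 28430 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>\n"
    "hobbit 28435 0.0 0.0 0 0 ? Z 12:50 0:00 [xymond_hostdata] <defunct>\n"
)

if __name__ == "__main__":
    print("defunct in sample:", count_defunct(SAMPLE))
```

Calling `count_defunct(current_ps_output())` before and after dropping a host would show whether the count grows with each drop message.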
-- Do things because you should, not just because you can.
John Thurston 907-465-8591 John.Thurston at alaska.gov Enterprise Technology Services Department of Administration State of Alaska