You might consider adding a process check for df on the affected servers. The hung df processes will actually pile up, so you could alert on that.
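
To illustrate the counting logic behind such a check, here is a minimal Python sketch. It is not Xymon's own process check, just a standalone illustration. The function names `count_processes` and `df_pileup` and the threshold of 3 are hypothetical, and it assumes a `ps` that accepts `-e -o comm=`.

```python
import os
import subprocess


def count_processes(ps_output, name):
    """Count lines of `ps -e -o comm=` output whose command name is `name`.

    Some platforms print a full path in the comm column, so only the
    basename is compared.
    """
    return sum(
        1
        for line in ps_output.splitlines()
        if os.path.basename(line.strip()) == name
    )


def df_pileup(max_allowed=3):
    """Return (count, alert) for running df processes.

    max_allowed is an arbitrary example threshold, not a recommended value.
    """
    ps_output = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    count = count_processes(ps_output, "df")
    return count, count > max_allowed
```

Because `ps` only reads the process table, it should keep working even while df itself is stuck on the unresponsive mount.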
=G=
From: Xymon <xymon-bounces at xymon.com> on behalf of Cédric BRINER <Cedric.BRINER at UniGE.ch>
Sent: Friday, June 19, 2015 4:24 AM
To: Mark Felder; xymon at xymon.com
Subject: Re: [Xymon] df hanging will cause xymon-client to hang.
The error happened because an NFS resource was not responding. I suppose that because xymon launches "df" to gather information and the NFS mount was hanging, the xymon-client is not able to detect that df hangs, and it does not send any data to the server. Worse, the xymon-client is not verbose at all: it does not report that this test (df) is taking too long to complete.
What I found very disturbing is that there are no logs at all saying that df is taking a long time to complete. Instead of finding a way to fix the xymon-client hang itself, could we somehow have xymon-client write a message to the log saying that df has not returned for a long time?
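
As a rough sketch of the kind of behaviour I mean (in Python, not the actual xymon-client code), the idea is to run the command with a timeout and log a warning when it does not return in time. The function name `run_with_timeout` and the 30-second default are hypothetical.

```python
import logging
import subprocess


def run_with_timeout(cmd, timeout_s=30, log=None):
    """Run cmd and return its stdout, or None if it does not finish in time.

    On timeout a warning is logged. The child is killed on a best-effort
    basis only: a process blocked on a hung NFS mount may sit in
    uninterruptible sleep and ignore the signal, so we deliberately do
    not wait for it to exit.
    """
    log = log or logging.getLogger(__name__)
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        log.warning(
            "%s did not return within %s seconds", " ".join(cmd), timeout_s
        )
        proc.kill()  # best effort, see docstring
        return None
    return out
```

Calling `run_with_timeout(["df", "-k"])` would then either return the df output or leave a line in the log, instead of blocking silently. Note that `subprocess.run(..., timeout=...)` is avoided here on purpose, because on POSIX it waits for the killed child to exit and could therefore block on the same hung process.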
cED
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon