On 26/02/13 8:26 PM, Adam Goryachev wrote:
On 26/02/13 19:47, Neil Simmonds wrote:
Hi all,
I’ve got a strange problem that I’m trying to diagnose and would appreciate any help you can give.
We have two recently set up AIX servers running the hobbit client. We have 62 other AIX servers with the same client running absolutely fine.
The problem is that the client data is getting cut off mid-stream. It’s always in the ps output. I’ve checked the MAX settings and they’re all OK; in fact, we have other clients sending larger data files than these that work fine. I’ve checked the data on the client and it’s complete, but if I look in /xymon/data/hostdata on the server the data is almost always truncated to 69518 bytes. Occasionally a full message (approx 93k) gets through.
There are no messages regarding truncated data in the server logs and the only message I can find on the client is the following,
2013-02-26 08:41:21 Write error while sending message to bbd at xymonserver:1984
2013-02-26 08:41:21 Whoops ! bb failed to send message - write error
I’ve googled this extensively and can’t find anything that seems relevant to our problem.
I get this from time to time, primarily when the xymon host has very limited bandwidth. It seems to me that Xymon will accept whatever data was received before the connection was broken/interrupted, and treat it as complete (as opposed to discarding it).
The problem is that there isn't a well defined "end of message" on a standard client report. The message starts with a "client HOSTNAME.OS CLASS" line, then consists of a bunch of sections, each starting with a "[section]" line followed by lines of text. When the client has finished sending its message it just does a shutdown on the write side of the socket and reads any returned data until EOF. That's it. The server probably doesn't care if the client even reads the data it sends back, and has no way of communicating with it anyway.
So if the client connection to the server is interrupted mid-stream, the server quite probably just handles it as a socket shutdown and accepts whatever has been received so far as the whole message.
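To illustrate why the server can't tell the difference, here is a minimal Python sketch. It is not Xymon's actual code: it uses a local socket pair to stand in for the client/server connection, and the function names, the host name in the sample report and the `cut_after` knob are all made up for illustration. A real interruption may show up as a reset or timeout rather than a clean close, so this only models the case where the server simply sees EOF.

```python
import socket

def send_report(sock, payload, cut_after=None):
    """Send a client report, then half-close the write side.

    cut_after is a hypothetical knob that simulates the connection
    ending after only that many bytes have been sent.
    """
    data = payload if cut_after is None else payload[:cut_after]
    sock.sendall(data)
    # The only "end of message" signal: shut down the write direction.
    sock.shutdown(socket.SHUT_WR)

def receive_report(sock):
    """Read until EOF and treat whatever arrived as the whole message."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # EOF: peer closed its write side (or went away)
            break
        chunks.append(chunk)
    return b"".join(chunks)

def demo(payload, cut_after=None):
    """Run one send/receive exchange over a local socket pair."""
    client, server = socket.socketpair()
    try:
        send_report(client, payload, cut_after)
        return receive_report(server)
    finally:
        client.close()
        server.close()

# Made-up sample report in the "client HOSTNAME.OS CLASS" + [section] layout.
REPORT = (
    b"client aixhost01.aix aix\n"
    b"[date]\nTue Feb 26 08:41:21 2013\n"
    b"[ps]\n" + b"root 1 0 init\n" * 20
)

full = demo(REPORT)
partial = demo(REPORT, cut_after=60)
print(len(full), len(partial))
```

In both runs the receiver just sees bytes followed by EOF. Nothing in the stream says how long the message was supposed to be, so the short one looks as valid as the full one.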
If this is happening frequently/all the time, I would suspect firewall settings and/or MTU issues (if it is packet size related). Check that you are not blocking all ICMP and that path MTU discovery is working properly, that no firewall is timing out or blocking the connection for some reason, and that there is enough bandwidth for the messages.
A tcpdump capture at both client and server could be educational; you could load the captures into Wireshark for analysis.
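Before capturing packets, it may also be worth confirming that the truncated messages really do cluster at one size, since a fixed cut-off point suggests something systematic rather than random drops. A rough Python sketch, assuming you have a directory of saved client messages for the host (the directory layout and the function names here are my assumptions, not anything Xymon provides):

```python
from collections import Counter
from pathlib import Path

def size_histogram(directory):
    """Count how many regular files in `directory` have each byte size."""
    return Counter(
        p.stat().st_size for p in Path(directory).iterdir() if p.is_file()
    )

def suspicious_sizes(directory, min_repeats=3):
    """Return (size, count) pairs for sizes that occur at least min_repeats times.

    Client reports normally vary a little in length from run to run, so many
    files with an identical size hint at a fixed truncation point.
    """
    return [
        (size, count)
        for size, count in size_histogram(directory).most_common()
        if count >= min_repeats
    ]
```

For example, calling `suspicious_sizes(...)` on the directory holding the saved messages for one of the affected hosts would list any byte count that keeps recurring.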
PS, I wonder when we will get compression, and/or encryption for the status messages? Both would assist in making sure the complete message arrives un-altered...
Indeed. There are other ways of delivering/fetching messages - maybe worth exploring for more reliable transmission.
David.
Regards, Adam
-- Adam Goryachev Website Managers www.websitemanagers.com.au
-- David Baldwin - Senior Systems Administrator (Datacentres + Networks) Information and Communication Technology Services Australian Sports Commission http://ausport.gov.au Tel 02 62147830 Fax 02 62141830 PO Box 176 Belconnen ACT 2616 david.baldwin at ausport.gov.au Leverrier Street Bruce ACT 2617