Well... we think it's a big bug, where 'we' means me and Red Hat support. To be clear, I'm talking about Linux and not about the Solaris bug, and my kernel parameters are OK.
I moved from RHEL 4.5 with kernel 2.6.9-55 to RHEL 5.3 with kernel 2.6.18-128, with bonded (active-passive) gigabit Ethernet and the Xymon data stored on NFS in a Veritas cluster. The Xymon server handles 3000 hosts and about 17093 status messages. The problem is the timeout: the Hobbit status page goes all green, and the pages are sometimes slow to load or show "Status not available".
Speaking with Red Hat premium support, I sent them a trace of the error (about 40 MB gzipped), and according to them the cause is a bug in thread management: on RHEL 5 it is no longer possible to use the old POSIX threading implementation, only the Linux threading "version". I have certainly lost some of the details here, sorry, but I'm not a programmer. They completely rule out a problem with the NFS share: Xymon's throughput is a steady 30 KB/s or so, while network tests indicate that 50-78 MB/s is possible. However, I did have to modify the mount options to avoid many setattr calls.
As a workaround I have modified the sendmessage call in the lib folder so that it resends the message when the first attempt times out:

    if (res == BB_ETIMEOUT) {
        usleep(5);
        res = sendtomany((recipient ? recipient : bbdisp), xgetenv("BBDISPLAYS"), msg, respfd, respstr, fullresponse, timeout);
    }

This of course increases the busy time, but the "all systems green" problem has not come back. I'm running Xymon 4.2.0 with the allinonepatch; Xymon 4.2.2 doesn't seem to have any changes related to this problem, but I'll try it in the next few days.

Another issue: when shutting down Xymon I always need to clean everything up with ipcrm, because the segments are still present. There is nothing more in the logs, just the status board not being available.
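For anyone who wants to try the same idea, here is a minimal sketch of the retry workaround written as a bounded loop instead of a single extra attempt. It is not Xymon code: `send_with_retry`, `send_fn`, `SKETCH_OK`, `SKETCH_ETIMEOUT` and the `flaky_send` stub are hypothetical names made up for the illustration, and the doubling delay is my own assumption, not something Red Hat or the Xymon sources suggest. Note that `usleep()` takes microseconds, so `usleep(5)` in the patch above pauses for only 5 microseconds.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* needed for nanosleep() under strict C modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical result codes for this sketch only (not Xymon's BB_* values). */
enum { SKETCH_OK = 0, SKETCH_ETIMEOUT = 1 };

/* Hypothetical callback type: performs one send attempt, returns a result code. */
typedef int (*send_fn)(void *ctx);

/* Sleep for the given number of microseconds. */
static void sleep_us(long us)
{
    struct timespec ts;
    ts.tv_sec = us / 1000000L;
    ts.tv_nsec = (us % 1000000L) * 1000L;
    nanosleep(&ts, NULL);
}

/*
 * Call attempt_send up to max_attempts times. Only a timeout triggers a
 * retry; any other result is returned immediately. The pause between
 * attempts starts at first_delay_us and doubles after each timeout.
 */
static int send_with_retry(send_fn attempt_send, void *ctx,
                           int max_attempts, long first_delay_us)
{
    int res = SKETCH_ETIMEOUT;
    long delay_us = first_delay_us;
    int attempt;

    assert(attempt_send != NULL && max_attempts > 0 && first_delay_us >= 0);

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        res = attempt_send(ctx);
        if (res != SKETCH_ETIMEOUT)
            break;                 /* success or a non-timeout error */
        if (attempt < max_attempts) {
            sleep_us(delay_us);    /* back off before the next attempt */
            delay_us *= 2;
        }
    }
    return res;
}

/* Demo stub: times out a fixed number of times, then succeeds. */
struct flaky_sender {
    int timeouts_left;
    int calls;
};

static int flaky_send(void *ctx)
{
    struct flaky_sender *s = (struct flaky_sender *)ctx;
    s->calls++;
    if (s->timeouts_left > 0) {
        s->timeouts_left--;
        return SKETCH_ETIMEOUT;
    }
    return SKETCH_OK;
}
```

In a real patch the callback would wrap the existing sendtomany() call with its arguments, and the attempt limit keeps a dead display server from blocking the sender forever.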
If anyone has already run into this issue (it doesn't seem to appear in past posts), please give me a tip. Here are my kernel parameters:
------ Shared Memory Limits --------
max number of segments = 8192
max seg size (kbytes) = 67108864
max total shared memory (kbytes) = 17179869184
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 100
semaphore max value = 32767

------ Messages: Limits --------
max queues system wide = 16
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536
Thanks in advance.