On Fri, Jun 09, 2006 at 04:21:56PM -0500, Larry Barber wrote:
I loaded p1, and hobbitd_rrd is still dumping, the stack trace looks like:
#5 <signal handler called>
#6 0x00dfe3da in do_lookup () from /lib/ld-linux.so.2
#7 0x00dfd103 in _dl_lookup_symbol_internal () from /lib/ld-linux.so.2
#8 0x00e0140f in fixup () from /lib/ld-linux.so.2
#9 0x00e01330 in _dl_runtime_resolve () from /lib/ld-linux.so.2
#10 0x0804a91f in create_and_update_rrd (hostname=0xb755d037 "stellent_pre-prod_v-ip", fn=0x805f6e0 "tcp.http.https:,,pws.tc.sc.egov.usda.gov,siteminderagent,dmsforms,login_banner.fcc?TYPE=33554433&REALMOID=06-d38f4375-a8bd-4190-b6f9-3c77f0901647&GUID=&SMAUTHREASON=0&METHOD=GET&SMAGENTNAME=$SM$hIspF3"..., creparams=0x805e5c0, template=0x9cf6b20 "sec") at do_rrd.c:143
OK, the call trace looks sane so I think we can rule out simple memory corruption here.
The crash happens while hobbitd_rrd is trying to print an error message from the RRDtool library. It was creating a new RRD file to track the response time of an http test: it had just called the rrd_create() function, which returned an error, and Hobbit crashed while printing that error message.
The filename looks somewhat suspicious. It is generated from the URL being tested, and it is very long, beginning with "tcp.http.https:,,pws.tc.sc.egov.usda.gov,siteminderagent,dmsforms,login_banner.fcc?TYPE=". It belongs to an http test for the host "stellent_pre-prod_v-ip".
My guess is that this filename is just too long. It *could* overflow the buffer set aside for the RRD filename - in that case, the attached patch against 4.1.2p1 should help.
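To illustrate the kind of overflow suspected here, this is a minimal sketch of a bounded filename build in C. It is not the attached patch and not the actual code in do_rrd.c; the function name build_rrd_path, the RRD_FN_MAX limit and the "<dir>/<name>.rrd" layout are all assumptions made up for the example. The idea is only that snprintf() reports how long the result would have been, so an over-long name can be rejected instead of being written past the end of the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical buffer size for the example; not the value used by Hobbit. */
#define RRD_FN_MAX 64

/*
 * Build "<dir>/<name>.rrd" into out, which holds outsz bytes.
 * Returns 0 on success, or -1 if the full path would not fit
 * (in which case out is set to an empty string).
 */
int build_rrd_path(char *out, size_t outsz, const char *dir, const char *name)
{
    /* snprintf never writes more than outsz bytes and returns the
     * length the complete string would have had. */
    int n = snprintf(out, outsz, "%s/%s.rrd", dir, name);

    if (n < 0 || (size_t)n >= outsz) {
        /* Output error or truncation: refuse rather than use a cut-off name. */
        if (outsz > 0) out[0] = '\0';
        return -1;
    }
    return 0;
}
```

A caller would check the return value and skip the RRD update (and log the test name) when it is -1, rather than passing a truncated or overflowing filename on to the RRD library.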
It just started doing this today, I can't think of anything that I have done that could cause it.
I think you just added this http test for "stellent_pre-prod_v-ip".
Regards, Henrik