This is a shot in the dark - I had a lockup problem on a BB system where the kernel was compiled with a different compiler version than the one included with the system, and there were some IPC-related changes between the compiler versions.
-----Original Message-----
From: Stefan Loos [mailto:stefan_loos at hotmail.com]
Sent: Monday, July 11, 2005 7:12 AM
To: hobbit at hswn.dk
Subject: Re: [hobbit] Status Unavailable - again
Hi Henrik,
we have several home-grown scripts (mostly in Perl) which monitor Oracle instances and BEA WebLogic servers. There is one for hardware monitoring - HP (Intel) and Sun servers (prtdiag based) - and some are just the output of an HTTP request to the software running on those WebLogic servers. I have just one server for testing, it's an HP DL 360. We are running Red Hat Enterprise Server 3, but I've tried it with SuSE 9.3 too.
Regards,
Stefan
>From: henrik at hswn.dk (Henrik Stoerner)
>Reply-To: hobbit at hswn.dk
>To: hobbit at hswn.dk
>Subject: Re: [hobbit] Status Unavailable - again
>Date: Mon, 11 Jul 2005 13:08:59 +0200
>
>Hi Stefan,
>
>On Mon, Jul 11, 2005 at 09:36:42AM +0000, Stefan Loos wrote:
> > It would be great if you can put me in cc. If you want I can try to assist
> > you. I'm at a point where I don't know what to try anymore. I think it
> > isn't easy for Henrik to find this issue - I have no coredumps and nothing
> > in the logfile that could help.
>
>Yes, this is a really nasty problem. Vernon and I thought we had it
>nailed down by the end of last week, but there's more to it than
>what we found then.
>
>What kind of external scripts run on your clients, apart from the BB
>client? The current suspicion is that this is triggered by a status
>message that is handled badly by Hobbit, causing this lock-up. So I'm
>trying to see if there might be something in common between your setups.
>
>And what kind of system are you running Hobbit on? If Linux, which
>distribution? Another suspicion I have is that this might be a
>problem with the implementation of SysV IPC semaphores.
>
>Regards,
>Henrik
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
Hello Jeffery,
do you think we see this issue because the kernel was built with a different compiler than I was using for building the hobbit server?
@Henrik - do you think this could be the problem?
Regards, Stefan
In <BAY107-F116EA789812F70D2A4881884DC0 at phx.gbl> "Stefan Loos" <stefan_loos at hotmail.com> writes:
> do you think we see this issue because the kernel was built with a different compiler than I was using for building the hobbit server?
> @Henrik - do you think this could be the problem?
No, I don't think so. Compiler versions should not matter; in the end it's just binary code.
One thing I do have in mind as a potential source of this problem is the fact that newer Linux distributions tend to stuff all sorts of new "scalability" features into their kernels and libc libraries. This could mean that they come with versions of the kernel and/or libraries that have bugs which Hobbit happens to trigger - some of the features that Hobbit uses are not terribly common for applications, so there could be bugs that just haven't been discovered yet.
But for now, let's assume that the bug is in Hobbit (until we can prove otherwise).
Henrik