hobbitd_rrd is not looking good

bob＠phreakout.net

11 Nov 2005

8:58 a.m.

I have no real idea what I broke, but maybe someone can tell me. I made a change in my bb-services file as I was trying to define a different smtp service that expected a different return value than default smtp.
I called it something atypical and things started to turn red and when I clicked on the red faces, there were Internal Server Error messages.
About that same time, hobbitd and hobbitd_rrd turned red. I immediately changed the values back to where they were (removed the additional smtp definition as well as the reference in bb-hosts) and then restarted hobbit. The hobbitd_rrd does not seem to be coming back.

So right now, hobbitd_rrd is purple, and when you click to get more information, it says 'Program Crashed Fatal Signal Caught'. In the rrd-data.log, at around the time that this happened, there is an entry that says '2005-11-11 01:33:35 Worker process died with exit code 134, terminating'. I am not sure if that is related. I do not seem to have any COREFILES in my tmp dir, unless they were erased when I restarted hobbit... probably not, but I don't know.

I am also as green (not the hobbit green!) as possible in terms of running and even configuring hobbit, so go easy on me. If anyone has any idea of what I can do to bring my poor hobbitd_rrd back to a nice shade of green, please let me know! TIA

peace, Bob

henrik＠hswn.dk

11 Nov

1:10 p.m.

New subject: [hobbit] hobbitd_rrd is not looking good

On Fri, Nov 11, 2005 at 03:58:20AM -0500, Bob Ababurko wrote:

...

I have no real idea what I broke, but maybe someone can tell me. I made a change in my bb-services file as I was trying to define a different smtp service that expected a different return value than default smtp.
I called it something atypical and things started to turn red and when I clicked on the red faces, there were Internal Server Error messages.

Changing bb-services should not break things like that, so I'm pretty sure this is a coincidence. Or at least - you're not to blame for hobbitd crashing :-)

...

About that same time, hobbitd and hobbitd_rrd turned red. I immediately changed the values back to where they were (removed the additional smtp definition as well as the reference in bb-hosts) and then restarted hobbit. The hobbitd_rrd does not seem to be coming back.

So right now, hobbitd_rrd is purple and when you click to get more information, it says 'Program Crashed Fatal Signal Caught'

If it is purple, it is safe to remove it with the command

bb 127.0.0.1 "drop HOBBIT.SERVER.HOSTNAME hobbitd_rrd"

It ends up purple because hobbitd_rrd normally does not generate any status column. The only time it does is when it crashes; it's a kind of "Mayday" signal to make sure you notice that something bad has happened, and alert me to it.

You can always check the "ps" listing and see if there are any hobbitd_rrd processes running - a standard install will have two of them, plus two hobbitd_channel processes feeding them.

henrik@osiris:~$ ps -u hobbit
  PID TTY          TIME CMD
10756 ?        00:00:00 hobbitlaunch
10757 ?        00:02:00 hobbitd
10762 ?        00:00:07 hobbitd_channel
10763 ?        00:00:01 hobbitd_filestore
10764 ?        00:00:00 hobbitd_channel
10765 ?        00:00:01 hobbitd_channel
10776 ?        00:00:00 hobbitd_alert
10778 ?        00:00:00 hobbitd_history
11581 ?        00:00:07 hobbitd_channel
11582 ?        00:00:05 hobbitd_rrd
11583 ?        00:00:01 hobbitd_channel
11699 ?        00:00:01 hobbitd_channel
11700 ?        00:00:05 hobbitd_client
11584 ?        00:00:02 hobbitd_rrd
21402 ?        00:00:00 sh
21403 ?        00:00:00 vmstat
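As a rough sketch of that check, assuming the ps output format shown above (command name in the last column) and that Hobbit runs as the user "hobbit", something like the following counts the hobbitd_rrd processes. The function name count_rrd is made up for illustration and is not part of Hobbit.

```shell
# Count processes whose command name is exactly "hobbitd_rrd".
# Reads ps-style lines on stdin; assumes the last column is the command name.
count_rrd() {
    awk '$NF == "hobbitd_rrd" { n++ } END { print n + 0 }'
}

# Usage (assumption: the Hobbit daemons run as user "hobbit"):
#   ps -u hobbit | count_rrd
```

On a standard install this should print 2; a 0 would suggest the module is not running.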

...

rrd-data.log at around the time that this happened there is an entry that says '2005-11-11 01:33:35 Worker process died with exit code 134, terminating'. I am not sure if that is related. I do not seem to have any COREFILES in my tmp dir, unless they were erased when I restarted hobbit... probably not, but I don't know.

They are not erased automatically, so they ought to be there. Could you run find ~hobbit -name "core*" and see if anything shows up?
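One caveat with that pattern: some systems name core files after the program (for example program.core), which "core*" alone would not match. A minimal sketch that covers both naming styles follows; the function name find_cores is hypothetical, and the search directory is whatever you pass in.

```shell
# List candidate core files under a directory.
# Matches the traditional "core" / "core.*" names as well as "<program>.core".
# Note: 'core*' can also match unrelated files whose names start with "core".
find_cores() {
    find "$1" -type f \( -name 'core*' -o -name '*.core' \) -print
}

# Usage (assumption: the Hobbit user's home directory is ~hobbit):
#   find_cores ~hobbit
```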

Regards, Henrik

bob＠phreakout.net

6:56 p.m.

New subject: [hobbit] hobbitd_rrd is not looking good

To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk

Ok, maybe I got mixed up about the name of the expected core file. I DO have a file called hobbitd_rrd.core. It looks like it was created at the time of 'the incident', so I think it is what I am looking for. I was actually looking for something called COREFILE; I must have misread. Now I cannot seem to find the web page that showed what to do to review a core file in tmp. Does anyone know what I should be doing to read these files?

Now, is taking hobbitd_rrd out of the monitoring checks what I want?
Don't I want/need it in there for a complete hobbit (fixed, of course)? I want my hobbit to be a healthy and fully functional hobbit. I guess I am curious what went wrong and how to fix it. Maybe this has something to do with the COREFILE, which I need to figure out how to read so I can figure out why it crashed. Am I right here? Sounds logical.

I do have two hobbitd_rrd processes running, but I only checked for two after removing hobbitd_rrd from being checked. I actually do not remember seeing two last night, but it is distinctly possible.

-Bob

henrik＠hswn.dk

10:33 p.m.

New subject: [hobbit] hobbitd_rrd is not looking good

On Fri, Nov 11, 2005 at 01:56:14PM -0500, Bob Ababurko wrote:

...

Ok, maybe I got mixed up about the name of the expected core file. I DO have a file called hobbitd_rrd.core. It looks like it was created at the time of 'the incident', so I think it is what I am looking for. I was actually looking for something called COREFILE.

Traditionally, those files are called just "core". But it is not uncommon - especially on some Linux distributions - to redefine this so that core files have a filename that connects them to the program they were generated by.

...

Now I cannot seem to find the web page that showed what to do to review a core file in tmp. Does anyone know what I should be doing to read these files?

http://www.hswn.dk/hobbit/help/known-issues.html#bugreport

...

Now, is taking hobbitd_rrd out of the monitoring checks what I want?

Yes. You do want to run the hobbitd_rrd module, but there really is no monitoring of this module - except when it has crashed. It doesn't send in any "I am OK" message when it runs normally. So once you've noticed that it has crashed, you can remove the purple hobbitd_rrd status.

...

Don't I want/need it in there for a complete hobbit (fixed, of course)? I want my hobbit to be a healthy and fully functional hobbit. I guess I am curious what went wrong and how to fix it. Maybe this has something to do with the COREFILE, which I need to figure out how to read so I can figure out why it crashed. Am I right here? Sounds logical.

The reason why it crashed is some programming error. What you can do to help me fix it is to pinpoint exactly where it did crash; the link above will help you get the information I need for the debugging.

Regards, Henrik

bob＠phreakout.net

12 Nov

4:35 a.m.

New subject: [hobbit] hobbitd_rrd is not looking good(Bug Report)

Henrik

I hope that this will be of use. The version of hobbit is 4.1.2. This is the output that you require:

/var/www/htdocs/hobbit/server/tmp $ gdb ../bin/hobbitd_rrd ./hobbitd_rrd.core

GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "sparc64-unknown-openbsd3.8"...
Core was generated by `hobbitd_rrd'.
Program terminated with signal 6, Aborted.
Reading symbols from /usr/local/lib/librrd.so.0.0...done.
Loaded symbols for /usr/local/lib/librrd.so.0.0
Reading symbols from /usr/local/lib/libpng.so.4.2...done.
Loaded symbols for /usr/local/lib/libpng.so.4.2
Reading symbols from /usr/local/lib/libpcre.so.0.1...done.
Loaded symbols for /usr/local/lib/libpcre.so.0.1
Reading symbols from /usr/lib/libc.so.38.2...done.
Loaded symbols for /usr/lib/libc.so.38.2
Reading symbols from /usr/lib/libz.so.4.1...done.
Loaded symbols for /usr/lib/libz.so.4.1
Reading symbols from /usr/local/lib/libgd.so.18.3...done.
Loaded symbols for /usr/local/lib/libgd.so.18.3
Reading symbols from /usr/local/lib/libjpeg.so.62.0...done.
Loaded symbols for /usr/local/lib/libjpeg.so.62.0
Reading symbols from /usr/local/lib/libttf.so.1.3...done.
Loaded symbols for /usr/local/lib/libttf.so.1.3
Reading symbols from /usr/lib/libm.so.2.0...done.
Loaded symbols for /usr/lib/libm.so.2.0
Reading symbols from /usr/libexec/ld.so...done.
Loaded symbols for /usr/libexec/ld.so
#0  0x000000004561e480 in abort () from /usr/lib/libc.so.38.2
(gdb) bt
#0  0x000000004561e480 in abort () from /usr/lib/libc.so.38.2
#1  0x0000000000110a20 in sigsegv_handler (signum=11) at sig.c:57
#2  0x000000004ff6a084 in ?? ()
#3  0x000000004ff6a084 in ?? ()
Previous frame identical to this frame (corrupt stack?)
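For what it is worth, the "exit code 134" from rrd-data.log lines up with the "signal 6, Aborted" that gdb reports. The sketch below shows the arithmetic. It assumes the logged value is either a shell-style exit code (128 + signal number) or a raw wait status (low 7 bits are the signal, bit 0x80 is the core-dump flag); both readings give the same signal for 134. The function name decode_status is made up for illustration.

```shell
# Decode a status value such as the 134 seen in rrd-data.log.
# Assumption: the value is 128 + signal (shell convention) or a raw wait
# status whose low 7 bits hold the signal and whose 0x80 bit flags a core dump.
decode_status() {
    status=$1
    sig=$((status & 127))
    if [ $((status & 128)) -ne 0 ]; then
        echo "signal $sig (high bit set)"
    else
        echo "signal $sig"
    fi
}

decode_status 134   # prints "signal 6 (high bit set)"
```

Signal 6 is SIGABRT, and frame #1 of the backtrace suggests the abort() was called from sigsegv_handler with signum=11, so the underlying fault appears to have been a segmentation fault.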

Regards, Bob
