bbtest rrd file isn't getting updated anymore
Yesterday afternoon my bbtest rrd file stopped getting updated with new data, even though the bbtest-net test is running fine and the latest status update is displayed under the bbtest column. All of the other rrd files for my hobbit server are updating just fine and do not have any issues.
I added --debug to the CMD line for the rrdstatus module in hobbitlaunch.cfg, but nothing bbtest-related is reported in rrd-status.log.
The bbtest data shows up fine when I watch the status channel, which makes sense since the data is making it to the webpage.
What can I check next? There seems to be a breakdown between the bbtest data that goes into the status channel and what rrdstatus is getting from the status channel, but I don't know how to drill into that area.
Thanks, Tom
On Thu, Sep 27, 2007 at 07:50:38AM -0400, Tom Georgoulias wrote:
> Yesterday afternoon my bbtest rrd file stopped getting updated
Could you check the timestamp and permissions on the bbtest.rrd file? And the rrd-status.log file.
Regards, Henrik
Henrik Stoerner wrote:
> On Thu, Sep 27, 2007 at 07:50:38AM -0400, Tom Georgoulias wrote:
>> Yesterday afternoon my bbtest rrd file stopped getting updated
> Could you check the timestamp and permissions on the bbtest.rrd file? And the rrd-status.log file.
Clipped the hostname from my paths:
[hobbit@radm000p server]$ ls -ld ~/data/rrd/<>/bbtest.rrd
-rw-r--r-- 1 hobbit hobbit 19548 Sep 27 09:43 /home/hobbit/data/rrd/<>/bbtest.rrd

Nothing bbtest-related in rrd-status.log:

[hobbit@<> server]$ grep bbtest ~/log/rrd-status.log
[hobbit@<> log]$ ls -ld rrd-status.log
-rw-rw-r-- 1 hobbit hobbit 9518262 Sep 27 11:38 rrd-status.log
[hobbit@<> log]$
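The timestamp and permission check requested above can also be scripted. The following is only a minimal sketch: the helper names `describe_file` and `is_stale` and the 600-second threshold (two 5-minute test cycles) are assumptions made for illustration, not part of Hobbit.

```python
import os
import stat
import time


def describe_file(path, now=None):
    """Return (permission string, owner uid, age in seconds) for path."""
    info = os.stat(path)
    now = time.time() if now is None else now
    return stat.filemode(info.st_mode), info.st_uid, now - info.st_mtime


def is_stale(path, max_age_seconds=600, now=None):
    """True if the file was last modified more than max_age_seconds ago.

    The 600-second default is an assumed threshold (two 5-minute cycles).
    """
    _, _, age = describe_file(path, now)
    return age > max_age_seconds
```

Calling `is_stale()` on the bbtest.rrd path would flag a file that has missed two consecutive update cycles, while `describe_file()` reports the same mode string that `ls -l` prints.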
Stopped and restarted hobbit, and now I have core files in my ~/server dir. Want one or something done to one?
-rw------- 1 hobbit hobbit 4628480 Sep 27 11:13 core.1646
-rw------- 1 hobbit hobbit 4628480 Sep 27 11:18 core.2711
-rw------- 1 hobbit hobbit 4534272 Sep 27 10:58 core.31625
-rw------- 1 hobbit hobbit 4628480 Sep 27 11:03 core.32027
-rw------- 1 hobbit hobbit 4632576 Sep 27 11:23 core.3812
-rw------- 1 hobbit hobbit 4534272 Sep 27 11:28 core.5832
-rw------- 1 hobbit hobbit 4628480 Sep 27 11:33 core.5847
-rw------- 1 hobbit hobbit 4628480 Sep 27 11:07 core.624
-rw------- 1 hobbit hobbit 4628480 Sep 27 11:38 core.6972
rrd-status and hobbitlaunch entries:
[hobbit@<> log]$ tail rrd-status.log
2007-09-27 10:54:23 Worker process died with exit code 139, terminating
2007-09-27 10:58:05 Worker process died with exit code 139, terminating
2007-09-27 11:03:01 Worker process died with exit code 139, terminating
2007-09-27 11:07:54 Worker process died with exit code 139, terminating
2007-09-27 11:13:02 Worker process died with exit code 139, terminating
2007-09-27 11:18:01 Worker process died with exit code 139, terminating
2007-09-27 11:23:09 Worker process died with exit code 139, terminating
2007-09-27 11:28:15 Worker process died with exit code 139, terminating
2007-09-27 11:33:11 Worker process died with exit code 139, terminating
2007-09-27 11:38:18 Worker process died with exit code 139, terminating

[hobbit@<> log]$ tail hobbitlaunch.log
2007-09-27 10:57:09 Setting up logfiles
2007-09-27 10:58:05 Task rrdstatus terminated, status 1
2007-09-27 11:03:01 Task rrdstatus terminated, status 1
2007-09-27 11:07:54 Task rrdstatus terminated, status 1
2007-09-27 11:13:02 Task rrdstatus terminated, status 1
2007-09-27 11:18:01 Task rrdstatus terminated, status 1
2007-09-27 11:23:09 Task rrdstatus terminated, status 1
2007-09-27 11:28:15 Task rrdstatus terminated, status 1
2007-09-27 11:33:11 Task rrdstatus terminated, status 1
2007-09-27 11:38:18 Task rrdstatus terminated, status 1
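A note on the "exit code 139" in rrd-status.log: by the usual shell convention, an exit code above 128 means the process was killed by signal number (code - 128), so 139 would be signal 11, SIGSEGV, which is consistent with the core files. The sketch below decodes such a code; it assumes the logged value follows that convention, and `decode_exit_code` is a hypothetical helper name, not a Hobbit function.

```python
import signal


def decode_exit_code(code):
    """Interpret a shell-style exit code.

    Assumption: codes above 128 mean "killed by signal (code - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {code}"


print(decode_exit_code(139))
```

On Linux this reports signal 11 (SIGSEGV), i.e. a segmentation fault in the worker process.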
On Thu, Sep 27, 2007 at 11:44:23AM -0400, Tom Georgoulias wrote:
> Stopped and restarted hobbit, and now I have core files in my ~/server dir. Want one or something done to one?
Seems like your rrdstatus task dies once every 5 minutes, which would match the times that your network tests run. Could you run one of the core files through gdb (see the "Reporting bugs" section in the Help menu)? Also, please grab the bbtest report with
bb 127.0.0.1 "hobbitdlog YOURHOBBITSERVER.bbtest" >bbtest.txt
and send me the bbtest.txt file (attached, please, to avoid it getting messed up by the mail program).
Regards, Henrik
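The "once every 5 minutes" observation can be checked against the rrd-status.log excerpt earlier in the thread. This is a small sketch that computes the gaps between consecutive crash timestamps; the timestamps are copied from that log, and `gaps_in_seconds` is a name chosen here for illustration.

```python
from datetime import datetime

# Crash timestamps copied from the rrd-status.log excerpt in this thread.
CRASH_TIMES = [
    "2007-09-27 10:54:23",
    "2007-09-27 10:58:05",
    "2007-09-27 11:03:01",
    "2007-09-27 11:07:54",
    "2007-09-27 11:13:02",
    "2007-09-27 11:18:01",
    "2007-09-27 11:23:09",
    "2007-09-27 11:28:15",
    "2007-09-27 11:33:11",
    "2007-09-27 11:38:18",
]


def gaps_in_seconds(stamps):
    """Return the number of seconds between each consecutive pair of timestamps."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


gaps = gaps_in_seconds(CRASH_TIMES)
print(gaps)
print(f"mean gap: {sum(gaps) / len(gaps):.0f} s")
```

Apart from the first gap, which spans the restart logged at 10:57:09, the intervals cluster around 300 seconds, matching a 5-minute network test cycle.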
Henrik Stoerner wrote:
> On Thu, Sep 27, 2007 at 11:44:23AM -0400, Tom Georgoulias wrote:
>> Stopped and restarted hobbit, and now I have core files in my ~/server dir. Want one or something done to one?
> Seems like your rrdstatus task dies once every 5 minutes, which would match the times that your network tests run. Could you run one of the core files through gdb (see the "Reporting bugs" section in the Help menu)? Also, please grab the bbtest report with
> bb 127.0.0.1 "hobbitdlog YOURHOBBITSERVER.bbtest" >bbtest.txt
> and send me the bbtest.txt file (attached, please, to avoid it getting messed up by the mail program).
I will send you this data off list, because there is some sensitive info exposed in the backtrace and the http test that is causing the failure.
When I remove that http test from bb-hosts, everything is back to normal. So I know what is breaking it.
Tom