This is getting really annoying.
I added another DOWNTIME setting for a different machine.
This morning I came in early to find the system had died again.
The only entries I'm finding in the logs are in history.log, which shows it was unable to access the historyfile for any of the systems. Restarting hobbit does the trick, and it always happens right around when an enable or disable is scheduled via downtime.
The error makes it seem like the files are just flat locked; I don't know if there's something in hobbit holding them all open or what.
Is there anything I can do to help troubleshoot this further? I mean, I've checked all the logs: half of them haven't had any entries added in weeks (no errors), others have some benign warnings (such as bb-xsnmp not being able to find the disabled directory), but nothing shows issues except the history side.
Thanks,
Bill Hart
Computer Support Supervisor
Burke Corporation
Bill,
Please send your bb-hosts entries for those hosts that have the downtime specified. Also, at the time this incident occurs (or when you discover it) and BEFORE you restart Hobbit, send the last several lines of the log file where you see the messages and an 'ls -l' of /var/log/hobbit (or wherever your log files live).
Send the recent messages from any log file that was updated on the current day (or the day the incident occurred).
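The collection step requested above could be scripted so that nothing is lost before the restart. This is only a rough sketch, not part of Hobbit: the function names `recent_log_tails` and `report` are made up for illustration, and the log directory is assumed to be /var/log/hobbit as mentioned above (adjust it to wherever your log files live).

```python
import datetime
import os
from collections import deque


def recent_log_tails(log_dir, n_lines=20, day=None):
    """Return {filename: last n_lines lines} for files last modified on `day`.

    `day` defaults to today. Hypothetical helper, not a Hobbit tool.
    """
    day = day or datetime.date.today()
    tails = {}
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue
        modified = datetime.date.fromtimestamp(os.path.getmtime(path))
        if modified != day:
            continue
        with open(path, errors="replace") as fh:
            # deque with maxlen keeps only the last n_lines lines read
            tails[name] = [line.rstrip("\n") for line in deque(fh, maxlen=n_lines)]
    return tails


def report(log_dir, n_lines=20):
    """Print a size/mtime listing (roughly like 'ls -l'), then the tails."""
    for name in sorted(os.listdir(log_dir)):
        st = os.stat(os.path.join(log_dir, name))
        stamp = datetime.datetime.fromtimestamp(st.st_mtime)
        print(f"{st.st_size:>10} {stamp:%Y-%m-%d %H:%M:%S} {name}")
    for name, lines in recent_log_tails(log_dir, n_lines).items():
        print(f"\n==> {name} <==")
        print("\n".join(lines))


if __name__ == "__main__":
    # Assumed location; change to your own log directory.
    report("/var/log/hobbit")
```

Run it as the hobbit user before restarting, and redirect the output to a file that can be attached to a reply.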
--
Rich Smrcina
VM Assist, Inc.
Phone: 414-491-6001
Ans Service: 360-715-2467
rich.smrcina at vmassist.com

Catch the WAVV! http://www.wavv.org
WAVV 2007 - Green Bay, WI - May 18-22, 2007
Rich,
Here's what we're looking at :
Hosts:

10.1.1.101 innatrack # DOWNTIME=*:0300:0430 telnet ftp
172.16.1.24 CMD_Bagger_PLC #DOWNTIME=*:0230:0630 http://172.16.1.24
History.log: it gives the "cannot open" errors when it's down and the "will not update" messages after restarting the service.

Checking, I just noticed this error at the same time as my last failures; however, it is not with the previous set.

2007-04-06 05:26:09 Tried to down BOARDBUSY: Invalid argument
. . .
2007-04-06 05:26:56 Cannot create histlog file '/home/hobbit/data/histlogs/MCC_Flex_I/O_Rack_2/http/Fri_Apr_6_05:26:56_2007' : No such file or directory
2007-04-06 05:26:56 Cannot open host logfile '/home/hobbit/data/hist/MCC_Flex_I/O_Rack_2' : No such file or directory
2007-04-06 05:26:56 Cannot open status historyfile '/home/hobbit/data/hist/Preblend_Flex_I/O.http' : No such file or directory
2007-04-06 05:26:56 Cannot create histlog file '/home/hobbit/data/histlogs/Preblend_Flex_I/O/http/Fri_Apr_6_05:26:56_2007' : No such file or directory
2007-04-06 05:26:56 Cannot open host logfile '/home/hobbit/data/hist/Preblend_Flex_I/O' : No such file or directory
2007-04-06 05:26:56 Cannot open status historyfile '/home/hobbit/data/hist/Preblend_Point_I/O.http' : No such file or directory
2007-04-06 05:26:56 Cannot create histlog file '/home/hobbit/data/histlogs/Preblend_Point_I/O/http/Fri_Apr_6_05:26:56_2007' : No such file or directory
2007-04-06 05:26:56 Cannot open host logfile '/home/hobbit/data/hist/Preblend_Point_I/O' : No such file or directory
2007-04-06 05:26:56 Cannot open status historyfile '/home/hobbit/data/hist/Relco_I/O_Rack.http' : No such file or directory
2007-04-06 05:26:56 Cannot create histlog file '/home/hobbit/data/histlogs/Relco_I/O_Rack/http/Fri_Apr_6_05:26:56_2007' : No such file or directory
2007-04-06 05:26:56 Cannot open host logfile '/home/hobbit/data/hist/Relco_I/O_Rack' : No such file or directory
2007-04-06 05:31:42 Will not update /home/hobbit/data/hist/voicemail.memory - color unchanged (green)
2007-04-06 05:31:46 Will not update /home/hobbit/data/hist/vscognos.uptime - color unchanged (green)
2007-04-06 05:32:04 Will not update /home/hobbit/data/hist/abra.uptime - color unchanged (green)
2007-04-06 05:32:45 Will not update /home/hobbit/data/hist/voicemail.procs - color unchanged (green)
2007-04-06 05:32:45 Will not update /home/hobbit/data/hist/voicemail.disk - color unchanged (green)
2007-04-06 05:33:06 Will not update /home/hobbit/data/hist/voicemail.msgs - color unchanged (green)
2007-04-06 05:33:10 Will not update /home/hobbit/data/hist/voicemail.netstat - color unchanged (green)
2007-04-06 05:33:12 Will not update /home/hobbit/data/hist/svetovit.uptime - color unchanged (green)
2007-04-06 05:33:40 Will not update /home/hobbit/data/hist/jdedeploy.uptime - color unchanged (green)
2007-04-06 05:33:57 Will not update /home/hobbit/data/hist/ems_burke.uptime - color unchanged (green)
2007-04-06 05:33:58 Will not update /home/hobbit/data/hist/kronos.uptime - color unchanged (green)
2007-04-06 05:34:06 Will not update /home/hobbit/data/hist/burke-utility.uptime - color unchanged (green)
2007-04-06 05:35:36 Will not update /home/hobbit/data/hist/vs4100.uptime - color unchanged (green)
2007-04-06 05:39:20 Will not update /home/hobbit/data/hist/voicemail.uptime - color unchanged (green)
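With this many entries it can help to group the "Cannot open" / "Cannot create" errors by type and list the paths involved. The following is a small sketch that assumes one log entry per line in the format shown above; the regular expression and the name `summarize_errors` are made up for illustration and are not part of Hobbit.

```python
import re
from collections import Counter

# Matches entries such as:
# 2007-04-06 05:26:56 Cannot open host logfile '/some/path' : No such file or directory
ERROR_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<what>Cannot (?:open|create) [^']+?) "
    r"'(?P<path>[^']+)' ?: (?P<reason>.+)$"
)


def summarize_errors(lines):
    """Count (error type, reason) pairs and collect the failing paths.

    Lines that do not look like a "Cannot open/create" error are skipped,
    so "Will not update" and BOARDBUSY entries are ignored.
    """
    counts = Counter()
    paths = []
    for line in lines:
        match = ERROR_RE.match(line.strip())
        if match:
            counts[(match["what"], match["reason"])] += 1
            paths.append(match["path"])
    return counts, paths


if __name__ == "__main__":
    # Assumed file name; point this at your own history log.
    with open("history.log", errors="replace") as fh:
        counts, paths = summarize_errors(fh)
    for (what, reason), n in counts.most_common():
        print(f"{n:>5}  {what}: {reason}")
    for path in sorted(set(paths)):
        print(path)
```

The sorted list of distinct paths at the end makes it easy to check by hand whether each file or directory actually exists on disk.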
Hobbitd.log is only updated after the restart, and it simply says "setup complete".
Page.log shows this:

2007-03-30 00:45:20 Tried to down BOARDBUSY: Invalid argument
2007-04-06 04:36:10 Tried to down BOARDBUSY: Invalid argument
2007-04-06 04:36:10 Worker process died with exit code 0, terminating
2007-04-06 05:26:09 Tried to down BOARDBUSY: Invalid argument

Rrd-data.log:

2007-04-06 03:12:37 Internal error: Duplicate match ignored
2007-04-06 04:36:10 Tried to down BOARDBUSY: Invalid argument
2007-04-06 05:26:09 Tried to down BOARDBUSY: Invalid argument
I did an ls -l before doing the restart and it was the same set of files with similar mod times as what's shown before the restart.
Bill Hart
Burke Corporation

-----Original Message-----
From: Rich Smrcina [mailto:rsmrcina at wi.rr.com]
Sent: Friday, April 06, 2007 9:12 AM
To: hobbit at hswn.dk
Subject: Re: [hobbit] More Downtime issues
On Fri, Apr 06, 2007 at 07:33:57AM -0500, Bill Hart wrote:
The only entries I'm finding in the logs are in history.log, which shows it was unable to access the historyfile for any of the systems. Restarting hobbit does the trick, and it always happens right around when an enable or disable is scheduled via downtime.
Have you tried running an fsck on the filesystem where Hobbit lives?
I know it might seem a bit far-fetched, but Hobbit does stress some filesystems a lot due to the large number of small files it creates; it's happened to me a couple of times.
Regards, Henrik