I grepped the logs for 2012-07-18, excluding rrd-*; this is all I have.
These lines appear because a DNS server I monitor is down. I disabled it when I discovered the campus had lost power.
bb-network.log: 2012-06-17 21:16:24 WARNING: Runtime 581 longer than time limit (300)
This would be the right time:
clientdata.log 2012-07-18 14:19:39 Tried to down BOARDBUSY: Invalid argument
history.log 2012-07-18 14:19:39 Tried to down BOARDBUSY: Invalid argument
2012-07-18 14:27:36 Will not update /home/hobbituser/data/hist/foohostname,imaginenetworksllc,com.bbd - color unchanged (green)
# last line repeated for every host that experienced this problem
hobbitd.log 2012-07-18 14:19:49 Setup complete
page.log 2012-07-18 14:19:39 Tried to down BOARDBUSY: Invalid argument
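For anyone who wants to repeat this kind of search, here is a minimal Python sketch of the same idea: scan every file in a log directory for a date string and skip the rrd-* files. The function name `grep_logs` and the directory passed to it are my own assumptions for illustration, not anything Hobbit ships.

```python
import fnmatch
from pathlib import Path


def grep_logs(log_dir, needle, exclude_glob="rrd-*"):
    """Return (file name, line) pairs for lines containing `needle`.

    Files whose names match `exclude_glob` are skipped.
    `grep_logs` is a hypothetical helper, not part of Hobbit.
    """
    matches = []
    for path in sorted(Path(log_dir).iterdir()):
        # Skip subdirectories and the excluded rrd-* logs.
        if not path.is_file() or fnmatch.fnmatch(path.name, exclude_glob):
            continue
        # errors="replace" keeps odd bytes in a log from aborting the scan.
        with path.open(errors="replace") as handle:
            for line in handle:
                if needle in line:
                    matches.append((path.name, line.rstrip("\n")))
    return matches
```

Calling `grep_logs(log_dir, "2012-07-18")` with `log_dir` set to wherever your server writes its logs would return one entry per matching line, in file-name order.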
Josh Luthman Office: 937-552-2340 Direct: 937-552-2343 1100 Wayne St Suite 1337 Troy, OH 45373
On Wed, Jul 18, 2012 at 3:44 PM, <cleaver at terabithia.org> wrote:
I haven't seen this in the last year or more on this server. I had sporadic issues on another service, but simply moving hardware (from a dedicated Atom to an ESXi platform) resolved them.
The page said it was red for 2-6 minutes. I knew the test runs every 5 minutes, so I would have expected a retest to clear it (the hosts were ping responsive from the shell).
What log are you referring to?
hobbitnet (or bbnet, I forget the process name in 4.2.3)'s output log. Also hobbitlaunch.log from around the time, just to see if something abnormally quit.
-jc
On Wed, Jul 18, 2012 at 3:23 PM, <cleaver at terabithia.org> wrote:
I have a front page with about a dozen hosts and then sub pages. Every CONN test on the front page failed. Each and every host on the subpages (well over a dozen) was just fine. After 6 minutes I restarted the hobbitd processes. They all came right back.
I am running 4.2.3 and using fping for the check:
- hobbitserver.cfg:FPING="/usr/sbin/fping"
Has anyone seen this?
Hmm. It's possible that hobbitnet (?) died or hung... Or that the pages weren't representative of the same run (e.g., bbgen could have died while generating them).
Questions: Do you recall the page timestamps being the same? If you clicked through to the tests when it was happening, did the (dynamic) test page match the (static) color in the grid? Has the problem started recently, is it repeating, and was there anything interesting in the logs at the time?
-jc