On Sat, Jul 21, 2007 at 09:34:11PM -0400, Scott Walters wrote:
Great to see the summary; these features look great. I'd like to request more RRDs and reports about the monitoring system and the servers/services monitored. For example:
I think the following could be "gauge" metrics:
- Number of devices monitored
- Number of services monitored
- Number of host.service in green state
- Number of host.service in yellow state
- Number of host.service in red state
- Number of host.service in XXX state
You mean like this:
Statistics:
Hosts : 4321
Pages : 286
Status messages : 22331
- Red : 907 ( 4.06 %)
- Red (non-propagating) : 809 ( 3.62 %)
- Yellow : 353 ( 1.58 %)
- Yellow (non-propagating) : 210 ( 0.94 %)
- Clear : 1970 ( 8.82 %)
- Green : 17052 (76.36 %)
- Purple : 452 ( 2.02 %)
- Blue : 578 ( 2.59 %)
The first three are from the current "bbgen --report" status message; I've added the breakdown of the colors now. Will put these into an RRD for tracking trends.
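As a rough illustration of how the breakdown above could be turned into per-color gauge values before they go into an RRD, here is a minimal Python sketch. It only parses text laid out like the sample shown above; the function names `parse_statistics` and `percentages` are made up for this example and are not part of Hobbit or bbgen.

```python
import re

# Matches lines such as "- Red (non-propagating) : 809 ( 3.62 %)" or "Hosts : 4321".
# Assumption: every value line has the shape "label : integer", optionally
# preceded by "- " and followed by a percentage, as in the sample above.
LINE_RE = re.compile(r"^-?\s*(?P<label>[^:]+?)\s*:\s*(?P<count>\d+)")


def parse_statistics(text):
    """Return {label: count} for every 'label : number' line in the text."""
    gauges = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            gauges[match.group("label")] = int(match.group("count"))
    return gauges


def percentages(gauges, total_label="Status messages", skip=("Hosts", "Pages")):
    """Express each color count as a percentage of the status message total."""
    total = gauges[total_label]
    return {
        label: round(100.0 * count / total, 2)
        for label, count in gauges.items()
        if label != total_label and label not in skip
    }
```

Each entry of the dictionary returned by `parse_statistics` would correspond to one gauge data source; the percentages can be recomputed from the counts at graphing time, so they probably do not need to be stored.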
I am thinking these could be done by creating counters within hobbit (since boot):
- Number of state changes
- Number of state changes per server
- Number of state changes per service
- Number of notifications sent
The state changes can be calculated from the history logs. This is preferable, I think, because that way it won't get reset if the Hobbit server is restarted.
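To make the idea concrete, here is a small Python sketch of counting state changes from logged events rather than from in-memory counters. The line format it reads (`host service epoch old_color new_color`, whitespace separated) is purely an assumption for illustration and is not the actual layout of the Hobbit history files; the function name `count_state_changes` is likewise hypothetical.

```python
from collections import Counter


def count_state_changes(lines):
    """Count state changes overall, per host, and per service.

    Assumed (hypothetical) line format, whitespace separated:
        host service epoch old_color new_color
    Lines that do not have exactly five fields are ignored.
    """
    total = 0
    per_host = Counter()
    per_service = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        host, service, _epoch, old_color, new_color = fields
        if old_color == new_color:
            continue  # a repeated status, not a state change
        total += 1
        per_host[host] += 1
        per_service[service] += 1
    return total, per_host, per_service
```

Because the counts are derived from what is on disk, rerunning this after a server restart gives the same totals, which is the advantage mentioned above.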
Notifications - it would make sense to have the alert module provide some statistics that we could put into a trend graph.
If you like, I could draft up some graphs and reports I'd like to see. My description above might be hard to visualize. I definitely think hobbit could benefit from internal counters, similar to how an OS keeps track of context switches and the like.
Please do. The graphs I've created about the Hobbit "internals" have been mostly for my own use as debugging / performance evaluation data. If we can provide some data that is interesting to management, that would be a good thing.
Regards,
Henrik