Hello All,
I've written my own "sar" test to put sar cpu data into the database. It's been working fine now for a while. I don't do any alerting, it's just graphing sar data.
Recently, though, a problem came up. Hobbit alerts on high cpu load, but cpu load doesn't take i/o wait into account. A box may show a low load and still have excessive i/o wait; for example, a 24 cpu box with a load of 12 on it may actually be at 99% utilization.
I've seen this on database servers where someone writes a bad SQL query, or an index is missing. So I want to alert on cpu utilization in addition to cpu load.
I want my sar test (its output is shown below) to check for idle time below 5% and alert on it. I've already modified the test to do this.
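To make the idle check concrete, here is a minimal sketch in Python. It is not my actual test script; the function names and the 'red'/'green' return values are only illustrative, and it assumes the stats appear as "name : value" pairs as in the output below.

```python
import re

IDLE_THRESHOLD = 5  # percent; alert when idle time drops below this

def parse_sar_fields(line):
    """Extract alphabetic 'name : value' pairs from one line of test output.

    Restricting names to letters keeps the timestamp (e.g. 10:35:20)
    from being picked up as a bogus field.
    """
    pairs = re.findall(r"([A-Za-z]+)\s*:\s*(\d+)", line)
    return {name: int(value) for name, value in pairs}

def status_color(fields, threshold=IDLE_THRESHOLD):
    """Return 'red' when idle is below the threshold, otherwise 'green'."""
    return "red" if fields["idle"] < threshold else "green"

ok = parse_sar_fields("Thu Jul 10 10:35:20 CDT 2008 usr : 3 sys : 1 wio : 1 idle : 95")
busy = parse_sar_fields("- CPU usage is above 95% user : 40 sys : 10 wio : 47 idle : 3")
print(status_color(ok), status_color(busy))
```

Checking idle directly means i/o wait counts toward utilization, which is the whole point of alerting on this instead of load.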
However, when I was originally writing the sar test to feed the rrd database, including any kind of status message stopped the data from going into the database.
What format should I use so that the data keeps going into the database even when there is a status message? Or am I wrong about this?
Example:
Here's the output of my sar test now:
Thu Jul 10 10:35:20 CDT 2008 usr : 3 sys : 1 wio : 1 idle : 95
**** GRAPH *****
What I want is something like this if idle is less than 5%:
Thu Jul 10 10:40:30 CDT 2008
- CPU usage is above 95% user : 40 sys : 10 wio : 47 idle : 3
*** GRAPH ****
So, if I add the "CPU usage is above 95%" text to the output, will the stats still be written to the database? Last time I tried adding status messages like the one above, nothing seemed to get into the database.
Thanks... James
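For reference, this is roughly how the message would be put together, as a hypothetical Python sketch (build_message is an illustrative name, not part of Hobbit). It keeps the stats line identical in both cases and puts the warning on its own line; whether Hobbit still graphs the data in that layout is exactly what I'm asking.

```python
def build_message(timestamp, fields, threshold=5):
    """Render the test output, adding a warning line only when idle is low.

    The stats line is formatted the same way in both cases, so only the
    presence of the extra warning line differs between the two outputs.
    """
    stats = " ".join(f"{name} : {value}" for name, value in fields.items())
    lines = [timestamp]
    if fields["idle"] < threshold:
        lines.append(f"- CPU usage is above {100 - threshold}%")
    lines.append(stats)
    return "\n".join(lines)

print(build_message("Thu Jul 10 10:40:30 CDT 2008",
                    {"user": 40, "sys": 10, "wio": 47, "idle": 3}))
```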