Hello All,
I've written my own "sar" test to put sar cpu data into the database. It's been working fine now for a while. I don't do any alerting, it's just graphing sar data.
However, a recent problem came up. Hobbit alerts on CPU load, but load doesn't take I/O wait into account. The CPU may show a low load yet have excessive I/O wait. For example, a 24-CPU box with a load of 12 may actually be at 99% utilization.
I've seen this on database servers where someone writes a bad sql query to access the database, or an index file isn't there. So I want to alert on cpu utilization in addition to cpu load.
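To make the gap between load and utilization concrete, here is a minimal sketch. It assumes utilization is simply everything that is not idle (usr + sys + wio), and the function names are hypothetical, not part of Hobbit or sar.

```python
def load_per_cpu(load_average: float, cpu_count: int) -> float:
    """Load average divided by CPU count; values near 1.0 suggest a busy box."""
    return load_average / cpu_count


def utilization_percent(idle_percent: float) -> float:
    """Assumption: everything that is not idle (usr + sys + wio) counts as used."""
    return 100.0 - idle_percent


# The 24-CPU box from the example: the load looks moderate...
print(load_per_cpu(12, 24))      # 0.5
# ...while sar could still be reporting almost no idle time.
print(utilization_percent(1))    # 99.0
```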
I want my sar test (output below) to alert when idle time drops below 5%. I've already modified the test to do this.
However, when I was originally writing the sar test to feed the RRD database, including any kind of status message stopped data from being written to the database.
What format keeps the data going into the database even when a status message is present? Or am I wrong about this?
Example:
Here's the output of my sar test now:
Thu Jul 10 10:35:20 CDT 2008
usr : 3
sys : 1
wio : 1
idle : 95
**** GRAPH *****
What I want is something like this if idle is less than 5%:
Thu Jul 10 10:40:30 CDT 2008
- CPU usage is above 95%
user : 40
sys : 10
wio : 47
idle : 3
*** GRAPH ****
So, if I add "CPU usage is above 95%" to the output, will the stats still be written to the database? Last time I tried adding status messages like the one above, nothing seemed to go into the database.
Thanks....James
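A minimal sketch of the kind of test described above, assuming the usual `status host.test color text` message form. The names `build_status` and `IDLE_THRESHOLD` and the host `dbhost` are hypothetical, and actually sending the message to the Hobbit server is left out. The point is only that the "name : value" lines keep the same names and layout whatever the color, and that the extra status line contains no ':' or '=' of its own.

```python
IDLE_THRESHOLD = 5  # percent idle; below this the test goes red (value from the post)


def build_status(host: str, test: str, stats: dict, timestamp: str) -> str:
    """Return a status message whose data lines look the same for every color."""
    red = stats["idle"] < IDLE_THRESHOLD
    color = "red" if red else "green"
    lines = [f"status {host}.{test} {color} {timestamp}"]
    if red:
        # Free-text line only: no ':' or '=' that could be read as a data pair.
        lines.append(f"- CPU usage is above {100 - IDLE_THRESHOLD}%")
    # One "name : value" pair per line, same names whether green or red.
    lines.extend(f"{name} : {value}" for name, value in stats.items())
    return "\n".join(lines)


sample = {"usr": 40, "sys": 10, "wio": 47, "idle": 3}
print(build_status("dbhost", "sar", sample, "Thu Jul 10 10:40:30 CDT 2008"))
```

Whether the server keeps graphing a red status still depends on how the NCV handling is configured, so treat this as a starting point for testing rather than a confirmed fix.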
If the test turns red, it stops logging data even with no status message.
Henrik, if I have my own test using NCV, and the test goes red, is there a reason it would stop putting the current NCV data into the RRD database?
Any help would be appreciated. Or is there a way that hobbit can alert on cpu usage instead of cpu load (or both)?
Thanks....James
From: James Wade [mailto:jkwade at futurefrontiers.com] Sent: Thursday, July 10, 2008 10:45 AM To: hobbit at hswn.dk Subject: [hobbit] Help writing test -- sar
I've done this before, although these days I tend to use the custom extra-script solution, since it is more powerful. However, I've never had a status message or a particular color stop graphing from working. The only obvious idea at the moment: do you in any way change the line structure of the reported values, such that the lines would no longer parse correctly after adding the status message?
-Alan
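One way to check the line-structure question is to run both sample reports through a simple "name : value" matcher and compare the names that come out. The sketch below is only an illustration: it is not Hobbit's actual NCV parser, `parse_pairs` is a hypothetical helper, and it assumes each pair sits on its own line. With the two samples from the original post, the first reports `usr` while the second reports `user`. Whether that rename is what stopped the RRD updates is an assumption worth testing.

```python
import re

# Simplified "name : value" matcher; an illustration, not Hobbit's NCV parser.
PAIR = re.compile(r"^\s*([A-Za-z][\w ]*?)\s*[:=]\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_pairs(report: str) -> dict:
    """Collect pairs from lines that consist of exactly one name/value pair."""
    pairs = {}
    for line in report.splitlines():
        match = PAIR.match(line)
        if match:
            pairs[match.group(1)] = float(match.group(2))
    return pairs


green = "Thu Jul 10 10:35:20 CDT 2008\nusr : 3\nsys : 1\nwio : 1\nidle : 95"
red = ("Thu Jul 10 10:40:30 CDT 2008\n- CPU usage is above 95%\n"
       "user : 40\nsys : 10\nwio : 47\nidle : 3")

print(sorted(parse_pairs(green)))  # ['idle', 'sys', 'usr', 'wio']
print(sorted(parse_pairs(red)))    # ['idle', 'sys', 'user', 'wio']
```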
On Thursday 10 July 2008 17:44:52 James Wade wrote:
Hello All,
I've written my own "sar" test to put sar cpu data into the database. It's been working fine now for a while. I don't do any alerting, it's just graphing sar data.
But, the vmstat graphs already graph cpu utilisation, and AFAIK Henrik has added alerting based on the vmstat data for cpu utilisation in the current development version.
Regards, Buchan
participants (3)
- asparks@doublesparks.net
- bgmilne@staff.telkomsa.net
- jkwade@futurefrontiers.com