Hi, I've written a quick addon to collect the temperature of each HDD in my system, and report the data back to Xymon like this:
Fri Aug 24 02:33:11 EST 2012 Temp GOOD (27 degrees)
sda : 27 sdb : 27
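For illustration, a message of this shape could be assembled roughly as follows. This is only a sketch, not the addon itself: the function name `build_temp_status`, the `warn_at` threshold, the `HIGH` state and the choice to put the hottest disk in the summary line are all assumptions, and the timezone is left out of the timestamp.

```python
from datetime import datetime


def build_temp_status(temps, warn_at=45, now=None):
    """Build a status text with one 'name : value' line per disk.

    temps   -- non-empty dict mapping disk name to temperature
    warn_at -- hypothetical threshold, not taken from the original addon
    """
    now = now or datetime.now()
    hottest = max(temps.values())
    state = "GOOD" if hottest < warn_at else "HIGH"
    header = f"{now:%a %b %d %H:%M:%S %Y} Temp {state} ({hottest} degrees)"
    # One line per disk, so the number of lines follows the number of disks.
    lines = [f"{disk} : {value}" for disk, value in sorted(temps.items())]
    return "\n".join([header] + lines)


if __name__ == "__main__":
    print(build_temp_status({"sda": 27, "sdb": 27}))
```

Because the per-disk lines are generated from whatever disks are found, the message itself copes with any number of disks; the difficulty discussed in this thread is on the graphing side.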
Also, hobbit is putting this data into an RRD file called temp.rrd:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rrd SYSTEM "http://oss.oetiker.ch/rrdtool/rrdtool.dtd">
<!-- Round Robin Database Dump -->
<rrd>
<version>0003</version>
<step>300</step> <!-- Seconds -->
<lastupdate>1345738019</lastupdate> <!-- 2012-08-24 02:06:59 EST -->
<ds>
<name> sda </name>
<type> GAUGE </type>
<minimal_heartbeat>600</minimal_heartbeat>
<min>NaN</min>
<max>NaN</max>
<!-- PDP Status -->
<last_ds>37</last_ds>
<value>4.4030000000e+03</value>
<unknown_sec> 0 </unknown_sec>
</ds>
<ds>
<name> sdb </name>
<type> GAUGE </type>
<minimal_heartbeat>600</minimal_heartbeat>
<min>NaN</min>
<max>NaN</max>
<!-- PDP Status -->
<last_ds>39</last_ds>
<value>4.6410000000e+03</value>
<unknown_sec> 0 </unknown_sec>
</ds>
So it looks (to me) as though all that is working.
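One way to double-check which data sources ended up in the file is to parse the dump rather than read it by eye. The sketch below assumes a complete, well-formed dump; `SAMPLE_DUMP` is a shortened stand-in for the excerpt above, and `list_data_sources` is a name made up for this example.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the dump excerpt (assumption).
SAMPLE_DUMP = """<rrd>
  <version>0003</version>
  <step>300</step>
  <ds>
    <name> sda </name>
    <type> GAUGE </type>
    <last_ds>37</last_ds>
  </ds>
  <ds>
    <name> sdb </name>
    <type> GAUGE </type>
    <last_ds>39</last_ds>
  </ds>
</rrd>"""


def list_data_sources(dump_xml):
    """Map each data source name in the dump to its last stored value."""
    root = ET.fromstring(dump_xml)
    result = {}
    for ds in root.findall("ds"):
        # The dump pads names with spaces, so strip them.
        name = ds.findtext("name").strip()
        result[name] = ds.findtext("last_ds").strip()
    return result


if __name__ == "__main__":
    print(list_data_sources(SAMPLE_DUMP))
```

For the sample this yields the two data sources `sda` and `sdb`, which matches what the status message reports.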
I'm now trying to get the pretty graph to show up. Here is my current definition, which doesn't work:
[temp]
        TITLE HDD Temperature
        YAXIS Degrees
        DEF:hda=temp.rrd:hda:AVERAGE
        DEF:hdb=temp.rrd:hdb:AVERAGE
        DEF:hdc=temp.rrd:hdc:AVERAGE
        DEF:hdd=temp.rrd:hdd:AVERAGE
        DEF:hde=temp.rrd:hde:AVERAGE
        DEF:sda=temp.rrd:sda:AVERAGE
        DEF:sdb=temp.rrd:sdb:AVERAGE
        DEF:sdc=temp.rrd:sdc:AVERAGE
        DEF:sdd=temp.rrd:sdd:AVERAGE
        DEF:sde=temp.rrd:sde:AVERAGE
        LINE2:hda#FF0000:hda
        LINE2:hdb#0000FF:hdb
        LINE2:hdc#00FF00:hdc
        LINE2:hdd#FFFF00:hdd
        LINE2:hde#FF00FF:hde
        COMMENT:\n
        GPRINT:hda:LAST:hda \: %5.1lf%s (cur)
        GPRINT:hda:MAX: \: %5.1lf%s (max)
        GPRINT:hda:MIN: \: %5.1lf%s (min)
        GPRINT:hda:AVERAGE: \: %5.1lf%s (avg)\n
        GPRINT:hdb:LAST:hdb \: %5.1lf%s (cur)
        GPRINT:hdb:MAX: \: %5.1lf%s (max)
        GPRINT:hdb:MIN: \: %5.1lf%s (min)
        GPRINT:hdb:AVERAGE: \: %5.1lf%s (avg)\n
        GPRINT:hdc:LAST:hdc \: %5.1lf%s (cur)
        GPRINT:hdc:MAX: \: %5.1lf%s (max)
        GPRINT:hdc:MIN: \: %5.1lf%s (min)
        GPRINT:hdc:AVERAGE: \: %5.1lf%s (avg)\n
        GPRINT:hdd:LAST:hdd \: %5.1lf%s (cur)
        GPRINT:hdd:MAX: \: %5.1lf%s (max)
        GPRINT:hdd:MIN: \: %5.1lf%s (min)
        GPRINT:hdd:AVERAGE: \: %5.1lf%s (avg)\n
        GPRINT:hde:LAST:hde \: %5.1lf%s (cur)
        GPRINT:hde:MAX: \: %5.1lf%s (max)
        GPRINT:hde:MIN: \: %5.1lf%s (min)
        GPRINT:hde:AVERAGE: \: %5.1lf%s (avg)\n
I think the problem is that not every machine will have the same number of disks, nor will they have the same type of disk. Can anyone suggest how to re-write this so that it will work?
Thanks, Adam
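A quick way to see the mismatch is to compare the data sources named in the DEF lines with those actually present in temp.rrd. As far as I know, rrdtool rejects a DEF that names a data source the file does not contain, which would explain why a fixed list of disk names fails on this host. The helper name `missing_data_sources` and the simple regular expression are assumptions made for this sketch.

```python
import re


def missing_data_sources(graph_lines, available):
    """Return DS names used in DEF lines that are not in `available`."""
    missing = []
    for line in graph_lines:
        # DEF:<vname>=<rrdfile>:<ds-name>:<CF>
        m = re.match(r"DEF:\w+=[^:]+:(\w+):\w+$", line.strip())
        if m and m.group(1) not in available:
            missing.append(m.group(1))
    return missing


if __name__ == "__main__":
    disks = ("hda", "hdb", "hdc", "hdd", "hde",
             "sda", "sdb", "sdc", "sdd", "sde")
    defs = [f"DEF:{d}=temp.rrd:{d}:AVERAGE" for d in disks]
    # Only sda and sdb exist in the dumped temp.rrd.
    print(missing_data_sources(defs, {"sda", "sdb"}))
```

With only `sda` and `sdb` in the file, eight of the ten DEF lines point at data sources that do not exist.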
To handle a variable number of disks with varying disk names, you need to end up with multiple RRD files, named (e.g.) temp,hda.rrd, temp,hdb.rrd, and so on.
To have xymond_rrd generate separate RRD files, you need to use SPLITNCV in xymonserver.cfg instead of NCV, like:
SPLITNCV_temp="*:gauge"
You'll need to restart Xymon, or at least kill off all of the xymond_rrd processes; after that you should start getting multiple RRD files. Then you can use something like this in graphs.cfg:
[temp]
        TITLE HDD Temperature
        YAXIS Degrees
        FNPATTERN temp,(.*).rrd
        DEF:disk@RRDIDX@=@RRDFN@:lambda:AVERAGE
        LINE2:disk@RRDIDX@#@COLOR@:@RRDPARAM@
        COMMENT:\n
        GPRINT:disk@RRDIDX@:LAST: \: %5.1lf%s (cur)
        GPRINT:disk@RRDIDX@:MAX: \: %5.1lf%s (max)
        GPRINT:disk@RRDIDX@:MIN: \: %5.1lf%s (min)
        GPRINT:disk@RRDIDX@:AVERAGE: \: %5.1lf%s (avg)\n
J
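To make the placeholders concrete, here is a rough simulation of how such a template might expand when two per-disk files exist. It is not Xymon's implementation: the colour palette, the zero-based index, the ordering of the output and the function name `expand` are all assumptions, and any padding Xymon applies to the legend text is ignored.

```python
import re

FNPATTERN = r"temp,(.*).rrd"
TEMPLATE = [
    "DEF:disk@RRDIDX@=@RRDFN@:lambda:AVERAGE",
    "LINE2:disk@RRDIDX@#@COLOR@:@RRDPARAM@",
]
COLORS = ["0000FF", "FF0000", "00CC00"]  # hypothetical palette


def expand(template, filenames, pattern=FNPATTERN):
    """Repeat each placeholder line once per RRD file matching the pattern."""
    matches = []
    for fn in sorted(filenames):
        m = re.match(pattern, fn)
        if m:
            # group(1) is the part captured by the parentheses, e.g. 'sda'
            matches.append((fn, m.group(1)))
    out = []
    for line in template:
        if "@RRD" not in line and "@COLOR@" not in line:
            out.append(line)  # plain lines are emitted once
            continue
        for idx, (fn, param) in enumerate(matches):
            out.append(line.replace("@RRDIDX@", str(idx))
                           .replace("@RRDFN@", fn)
                           .replace("@RRDPARAM@", param)
                           .replace("@COLOR@", COLORS[idx % len(COLORS)]))
    return out


if __name__ == "__main__":
    for row in expand(TEMPLATE, ["temp,sdb.rrd", "temp,sda.rrd", "disk,root.rrd"]):
        print(row)
```

The point of the design is that the definition never names a disk: whatever files match the pattern on a given host get their own DEF and LINE2, so hosts with different disks share one definition.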
On 28 August 2012 08:47, Adam Goryachev <adam at websitemanagers.com.au> wrote:
[...]
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
Hello,
One approach is not to monitor the temperature of each individual disk, but to monitor the temperature of the group of disks on a single host. The rationale is that one is normally interested in the range of temperature values, combined with an alert if at least one temperature is out of range.
I've used this approach to monitor 'the temperature' of a switch. The number of temperature sensors per switch varies from 1 to 29 (!). The returned values are the minimum, average and maximum temperature, so there are always three values per switch.
HTH, Wim Nelis.
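The aggregation described above could look something like this. It is a sketch under assumptions, not the script used for the switches: the names `summarise` and `ncv_lines` and the rounding of the average to one decimal are choices made for the example.

```python
def summarise(temps):
    """Reduce any number of readings to minimum, average and maximum."""
    values = list(temps)
    if not values:
        raise ValueError("at least one temperature reading is required")
    average = round(sum(values) / len(values), 1)
    return {"min": min(values), "avg": average, "max": max(values)}


def ncv_lines(summary):
    """Format the summary as 'name : value' lines, always three of them."""
    return [f"{key} : {summary[key]}" for key in ("min", "avg", "max")]


if __name__ == "__main__":
    for line in ncv_lines(summarise([30, 35, 40])):
        print(line)
```

Since the output always has the same three names, a single RRD file with a fixed graph definition is enough, whatever the number of sensors.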
On 08/28/2012 03:39 PM, W.J.M. Nelis wrote:
An approach is not to monitor the temperatures of the individual disks, but to monitor the temperature of the group of disks on a single host. The rationale of this approach is that normally one is interested in the range of temperature values, combined with an alert if at least one temperature is out of range.
I've used this approach to monitor 'the temperature' of a switch. The number of temperature sensors per switch varies from 1 to 29 (!). The returned value is the minimum, average and maximum temperature, thus always three values per switch.
I wanted to monitor each individual disk for the following reasons:
- To see if one drive is failing, maybe it will have a much higher temp than the others
- Since each drive is in a different position, it is nice to see the effect that this has on the individual drive temp
I can see why it would be pointless to get 29 different temp readings from a single switch: I can't imagine that the temp would vary much between all of those, and a switch doesn't have individual parts that can be replaced the way individual drives in a server can.
Regards, Adam
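The first reason, spotting one drive that runs much hotter than its neighbours, only works when per-disk values are kept. A minimal sketch of such a comparison follows; the function name `hot_outliers` and the 8-degree margin are invented for illustration and are not part of any addon discussed here.

```python
from statistics import median


def hot_outliers(temps, margin=8):
    """Return disks running more than `margin` degrees above the median."""
    mid = median(temps.values())
    return sorted(disk for disk, t in temps.items() if t - mid > margin)


if __name__ == "__main__":
    print(hot_outliers({"sda": 37, "sdb": 39, "sdc": 52}))
```

With only a minimum, average and maximum per host, the same check could say that some disk is hot but not which one.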
Hello,
I'm using multiple RRD files (one per disk), with the serial number as the disk reference. The graph uses FNPATTERN to graph 'all files for a host'.
See my notes here http://www.tumfatig.net/20120426/monitor-synology-disk-temperature-from-snmp...
The graph definition is:
[snmp_disktemp]
        FNPATTERN ^snmp_disktemp.(.+).rrd
        TITLE Disk Temperature
        YAXIS Celcius
        -l 0
        -E
        DEF:temp@RRDIDX@=@RRDFN@:temp:AVERAGE
        LINE:temp@RRDIDX@#@COLOR@:@RRDPARAM@
        GPRINT:temp@RRDIDX@:LAST:%3.0lfC (cur)
        GPRINT:temp@RRDIDX@:MIN:%3.0lfC (min)
        GPRINT:temp@RRDIDX@:AVERAGE:%3.0lfC (avg)
        GPRINT:temp@RRDIDX@:MAX:%3.0lfC (max)\l
Regards, Jo
On 28 August 2012 at 00:47, Adam Goryachev wrote:
[...]
participants (4): adam@websitemanagers.com.au, jlaidman@rebel-it.com.au, joel@carnat.net, Wim.Nelis@nlr.nl