ganglia-style graph aggregation with hobbit
Hello,
I'd like to completely replace the ganglia system we have here with hobbit.
Most of the needed features are there, except the ability to produce aggregated graphs with multiple hosts, as in http://monitor.millennium.berkeley.edu/?c=PSI%20Cluster&m=&r=hour&s=descendi...
I suppose the following things will need to be done:
- find a neat way to group the hosts (all the hosts in a group or a page for instance)
- collect all the values from the rrd for all the hosts in a group and add them (when different from 0/NaN)
- graph the result
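The "collect and add" step above can be sketched in a few lines. This is only an illustration under assumptions: the per-host values are taken to be already extracted from the rrd files into plain lists, and the function name `aggregate` and the host names are made up for the example.

```python
import math


def aggregate(series_by_host):
    """Sum per-timestamp values across hosts, skipping NaN (unknown) samples.

    series_by_host maps a host name to a list of floats, one per time step.
    """
    if not series_by_host:
        return []
    # Only walk the time steps that every host has.
    length = min(len(values) for values in series_by_host.values())
    totals = []
    for i in range(length):
        known = [v[i] for v in series_by_host.values() if not math.isnan(v[i])]
        # If no host reported a value at this step, keep the point unknown.
        totals.append(sum(known) if known else math.nan)
    return totals


if __name__ == "__main__":
    nan = float("nan")
    print(aggregate({"web1": [1.0, nan, nan], "web2": [2.0, 3.0, nan]}))
```

Keeping a step unknown when no host reported anything avoids drawing a misleading zero on the aggregated graph.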
What would be the best way to do it? I'd rather not reinvent the wheel, so if some of the pieces already exist, I'd prefer to use them...
Cheers, Gildas
On Wed, Sep 20, 2006 at 04:51:25PM +0100, Gildas Le Nadan wrote:
I'd like to completely replace the ganglia system we have here with hobbit.
Most of the needed features are there, except the ability to produce aggregated graphs with multiple hosts, as in http://monitor.millennium.berkeley.edu/?c=PSI%20Cluster&m=&r=hour&s=descendi...
I think it can be done just by putting together the right graph definitions. RRDtool which is used to generate the graphs has all of the necessary functions to build such aggregate graphs, and Hobbit already stores all of the information you're tracking. So it should "just" be a case of putting together the right input for the RRD graph module.
I have done it on an ad-hoc basis by hand-coding some extra graph definitions in the hobbitgraph.cfg file, but this is not suitable for the case where you have lots of hosts - for that you need something a bit more flexible that lets you select a group of hosts, and generate a graph with the type of aggregation you want.
What would be the best way to do it? I'd rather not reinvent the wheel, so if some of the pieces already exist, I'd prefer to use them...
Check the hobbit_hostgraph.cgi module in Hobbit 4.2, and the "multi" definitions in hobbitgraph.cfg. The current hobbitgraph tool lets you generate a graph for multiple hosts, but just overlaid on top of each other. I think it might be possible to just modify the "multi" definitions in hobbitgraph.cfg to produce an aggregate graph also.
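As a rough illustration of "putting together the right input" for RRDtool, the sketch below builds the DEF/CDEF/LINE arguments for a single summed line across several hosts. It is not part of Hobbit: the function name, the variable names (v0, z0, total), the colour and the file paths are made up for the example, and the data source name "la" is the one used in the la-multi definition quoted later in this thread. The UN/IF idiom replaces unknown samples with 0 before adding.

```python
def build_sum_graph_args(rrd_files, ds_name="la", cf="AVERAGE"):
    """Return rrdtool graph arguments that draw the sum of one DS over many rrd files."""
    if not rrd_files:
        raise ValueError("need at least one rrd file")
    args = []
    names = []
    for idx, path in enumerate(rrd_files):
        raw = f"v{idx}"
        # One DEF per host: DEF:vname=rrdfile:ds-name:CF
        args.append(f"DEF:{raw}={path}:{ds_name}:{cf}")
        # Replace unknown samples with 0 so one silent host does not blank the sum.
        args.append(f"CDEF:z{idx}={raw},UN,0,{raw},IF")
        names.append(f"z{idx}")
    # RPN: push every value, then apply "+" (n - 1) times.
    rpn = ",".join(names) + ",+" * (len(names) - 1)
    args.append(f"CDEF:total={rpn}")
    args.append("LINE2:total#0000FF:Total")
    return args


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for arg in build_sum_graph_args(["web1/la.rrd", "web2/la.rrd"]):
        print(arg)
```

The resulting list would be appended to an rrdtool graph command line; paths containing a colon would need escaping, which this sketch does not handle.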
Regards, Henrik
Hum, I'm afraid I don't get how it works, and I can't make it work on a simple example: I'm trying to change la-multi in hobbitgraph.cfg so that the values are added up instead of being drawn on top of each other.
Are the entries in hobbitgraph.cfg used as a template to build the rrdgraph query? If so, how can I access the values from the previous RRD to add them to the ones in the current RRD (@RRDFN@)?
I tried adding the values with a VDEF:add=add,@RRDIDX@,+ but without success.
Any clue?
Cheers, Gildas
Did you ever work out a solution for this?
I'm starting to investigate a way to take data from many rrd files and graph the average of the data in all those rrd files as a single line. For example, I'd like to average the %CPU usage (la1) for 10 different webservers, and display it as a single overall average %CPU for a web farm.
Tom
Hi,
I am currently writing an extension for hobbit that works almost like bbcombotest, but lets you yield green, yellow or red (bbcombotest only has green and red). It will also sum and average NCV-like data out of aggregated statuses. I intend to use it for load-balanced pools and for limited resources, such as X25 network lines. I hope to release it by the end of the week. If it works the way I want, I'll then try to hack it into the hobbit core.
-- Charles Goyard - cgoyard at cvf.fr - (+33) 1 45 38 01 31
Tom Georgoulias wrote:
Did you ever work out a solution for this?
No, not yet. I tried several other things in the [*-multi] hobbitgraph.cfg definitions but with no luck so far (I am not an rrd expert).
Btw Henrik, I also think it would be a good idea if the multi graph menu in hobbit-hostgraphs.sh was generated automatically from the [*-multi] entries in hobbitgraph.cfg.
Things I tried so far:
- the :STACK option doesn't work, probably because it shouldn't be added for the first entry (I tried the example on the rrd page; setting a graph with a constant value as the first graph doesn't work)
- I tried a VDEF/CDEF with IF, so that if there is no entry we set it to 0 (because we have to treat the first entry correctly)
I was about to test the different possibilities using rrdgraph directly instead of hobbitgraph, so as to get more debug output when it fails.
Then, once I have a working solution, I'll see whether it is possible to implement it using the current hobbitgraph.cgi. If not, I'll try to patch it or ask Henrik for features.
(At least that's my plan)
I'm starting to investigate a way to take data from many rrd files and graph the average of the data in all those rrd files as a single line. For example, I'd like to average the %CPU usage (la1) for 10 different webservers, and display it as a single overall average %CPU for a web farm.
Tom
Yes, this is a fairly common problem I think :) There are plenty of other uses, such as adding up the bandwidth of different servers, and so on...
Cheers, Gildas
Hi all,
I'm trying to set things up so that I am e-mailed when someone acks or disables an alert, along with the reason they gave. The problem I am having is that all alerts are being sent to me. In hobbit-alerts.cfg I have put:
HOST=%.* MAIL me NOTICE
Do I need to change this? Also, can acks be made to send an e-mail to a specified person saying that the alert has been acked? If not, can this be considered a feature request please :)
Thanks, Jason.
Note that as far as I know, currently only enable/disables are sent via NOTICE, and there are no alerts for Acks.
-Charles
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
Ignore this, I resent it when it wouldn't appear on the list the first time and it seems that it took this long to appear...maybe a filter or something?
Jason.
-----Original Message-----
From: Jones, Jason (Altrincham)
Sent: 11 October 2006 10:46
To: hobbit at hswn.dk
Subject: [hobbit] NOTICE tag
Hi all,
I'm trying to get a setup where I am E-mailed when someone ack's or disables the alert with the reason they gave, the problem I am having is that all alerts are being sent to me, hobbit-alerts.cfg I have put:
HOST=%.* MAIL me NOTICE
Do I need to change this? Also can you get ack's to send an e-mail to a specified person that they have been ack'd? if not can this be considered a feature request please :)
Thanks, Jason.
Hi,
BEWARE: this patch has been tested on a machine with rrdtool 1.0.x. The values/behavior for rrdtool 1.2.x were taken from the online rrd documentation so they are hopefully correct. If someone was able to test it for me on a server with rrdtool 1.2, I would be very grateful!
Cheers, Gildas
--
The following patch add support for a @STACKIT@ keyword in the graph definitions in hobbitgraph.cfg, allowing data to be stacked.
The STACK behavior changed between rrdtool 1.0.x and 1.2.x, hence the ifdef:
- in 1.0.x, you replace the graph type (AREA|LINE) for the graph you want to stack with the STACK keyword
- in 1.2.x, you add the STACK keyword at the end of the definition
Please note that in both cases the first entry mustn't contain the keyword STACK at all, so we need a different treatment for the first rrdidx
examples of valid hobbitgraph.cfg entries:
rrdtool 1.0.x
[la-multi]
        TITLE Multi-host CPU Load
        YAXIS Load
        FNPATTERN la.rrd
        DEF:avg@RRDIDX@=@RRDFN@:la:AVERAGE
        CDEF:la@RRDIDX@=avg@RRDIDX@,100,/
        @STACKIT@:la@RRDIDX@#@COLOR@:@RRDPARAM@
        -u 1.0
        GPRINT:la@RRDIDX@:LAST: \: %5.1lf (cur)
        GPRINT:la@RRDIDX@:MAX: \: %5.1lf (max)
        GPRINT:la@RRDIDX@:MIN: \: %5.1lf (min)
        GPRINT:la@RRDIDX@:AVERAGE: \: %5.1lf (avg)\n
rrdtool 1.2.x
[la-multi]
        TITLE Multi-host CPU Load
        YAXIS Load
        FNPATTERN la.rrd
        DEF:avg@RRDIDX@=@RRDFN@:la:AVERAGE
        CDEF:la@RRDIDX@=avg@RRDIDX@,100,/
        AREA:la@RRDIDX@#@COLOR@:@RRDPARAM@:@STACKIT@
        -u 1.0
        GPRINT:la@RRDIDX@:LAST: \: %5.1lf (cur)
        GPRINT:la@RRDIDX@:MAX: \: %5.1lf (max)
        GPRINT:la@RRDIDX@:MIN: \: %5.1lf (min)
        GPRINT:la@RRDIDX@:AVERAGE: \: %5.1lf (avg)\n
--- hobbit-4.2.0/web/hobbitgraph.c 2006-08-09 21:10:13.000000000 +0100
+++ hobbit-4.2.0.ganglia/web/hobbitgraph.c 2006-10-12 10:24:09.788773551 +0100
@@ -392,6 +392,42 @@
                 }
                 inp += 10;
             }
+            else if (strncmp(inp, "@STACKIT@", 9) == 0) {
+                /* the STACK behavior changed between rrdtool 1.0.x
+                 * and 1.2.x, hence the ifdef:
+                 * - in 1.0.x, you replace the graph type (AREA|LINE)
+                 *   for the graph you want to stack with the STACK
+                 *   keyword
+                 * - in 1.2.x, you add the STACK keyword at the end
+                 *   of the definition
+                 *
+                 * Please note that in both cases the first entry
+                 * mustn't contain the keyword STACK at all, so
+                 * we need a different treatment for the first rrdidx
+                 *
+                 * examples of hobbitgraph.cfg entries:
+                 *
+                 * - rrdtool 1.0.x
+                 * @STACKIT@:la@RRDIDX@#@COLOR@:@RRDPARAM@
+                 *
+                 * - rrdtool 1.2.x
+                 * AREA::la@RRDIDX@#@COLOR@:@RRDPARAM@:@STACKIT@
+                 */
+                char numstr[10];
+                if (rrdidx == 0) {
+#ifdef RRDTOOL12
+                    sprintf(numstr, "");
+#else
+                    sprintf(numstr, "AREA");
+#endif
+                }
+                else {
+                    sprintf(numstr, "STACK");
+                }
+                strcpy(outp, numstr);
+                outp += strlen(outp);
+                inp += 9;
+            }
             else if (strncmp(inp, "@RRDIDX@", 8) == 0) {
                 char numstr[10];
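To make the effect of the patch easier to see, here is a small Python model of the substitution it performs. It is a simplification and not the hobbitgraph.c code: only @STACKIT@, @RRDIDX@, @COLOR@ and @RRDPARAM@ are handled, the `rrdtool12` flag stands in for the RRDTOOL12 ifdef, and the colours and legends are invented for the example.

```python
def expand_stackit(rrdidx, rrdtool12):
    """Mirror the patch: first entry is never STACK, later entries always are."""
    if rrdidx == 0:
        # 1.2.x appends nothing to the first entry; 1.0.x uses AREA as the graph type.
        return "" if rrdtool12 else "AREA"
    return "STACK"


def expand_line(template, rrdidx, color, legend, rrdtool12):
    """Expand one graph-definition line for host number rrdidx (simplified model)."""
    return (
        template.replace("@STACKIT@", expand_stackit(rrdidx, rrdtool12))
        .replace("@RRDIDX@", str(rrdidx))
        .replace("@COLOR@", color)
        .replace("@RRDPARAM@", legend)
    )


if __name__ == "__main__":
    old_style = "@STACKIT@:la@RRDIDX@#@COLOR@:@RRDPARAM@"
    new_style = "AREA:la@RRDIDX@#@COLOR@:@RRDPARAM@:@STACKIT@"
    for idx, (color, host) in enumerate([("FF0000", "web1"), ("00FF00", "web2")]):
        print(expand_line(old_style, idx, color, host, rrdtool12=False))
        print(expand_line(new_style, idx, color, host, rrdtool12=True))
```

Note that in this model the first 1.2.x entry ends with a trailing colon, since @STACKIT@ expands to an empty string there.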
On Thursday 12 October 2006 11:33, Gildas Le Nadan wrote:
Hi,
BEWARE: this patch has been tested on a machine with rrdtool 1.0.x. The values/behavior for rrdtool 1.2.x were taken from the online rrd documentation so they are hopefully correct. If someone was able to test it for me on a server with rrdtool 1.2, I would be very grateful!
Seems to work as expected, with :
$ ldd /usr/lib/hobbit/server/bin/hobbitgraph.cgi |grep rrd
librrd.so.2 => /usr/lib/librrd.so.2 (0x007c2000)
$ rpm -qf /usr/lib/librrd.so.2
librrdtool2-1.2.11-2.rhel4es
Only thing is, it would be nice to have both the multi-host graphs and the aggregated ones available. But, that is more of a hobbit-only issue (along with multiple graphs on the page for one custom/extension test, etc. etc.).
Regards, Buchan
-- Buchan Milne ISP Systems Specialist - Monitoring/Authentication Team Leader B.Eng,RHCE(803004789010797),LPIC-2(LPI000074592)
Thanks very much for your help!
Only thing is, it would be nice to have both the multi-host graphs and the aggregated ones available. But, that is more of a hobbit-only issue (along with multiple graphs on the page for one custom/extension test, etc. etc.).
Well, this is exactly what I intend to do in the next step :)
This is why I think we need the multi-host graph list produced by hobbit-hostgraphs.cgi to be automatically generated from the [*-multi] entries in hobbitgraph.cfg. That would then make it possible to add [aggrla-multi] entries, for instance.
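As a sketch of what "generated automatically from the [*-multi] entries" could look like, the snippet below scans the text of a hobbitgraph.cfg for section headers ending in -multi. It is only an illustration: the function name is made up, and how the real CGI would consume such a list is not covered here.

```python
import re

# Matches a section header such as "[la-multi]" and captures "la".
MULTI_SECTION = re.compile(r"^\[([^\]]+)-multi\]\s*$")


def multi_graph_names(cfg_text):
    """Return the base names of all [*-multi] sections, in file order."""
    names = []
    for line in cfg_text.splitlines():
        match = MULTI_SECTION.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    sample = "[la]\n\tTITLE CPU Load\n[la-multi]\n\tTITLE Multi-host CPU Load\n[aggrla-multi]\n"
    print(multi_graph_names(sample))
```

With a list like this, adding an [aggrla-multi] definition to the file would be enough for it to show up in the menu.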
Thanks for your help, much appreciated, Gildas
I'm a little late on testing this, so hopefully this patch is still the latest version.
I have tested the patch with Hobbit 4.2.0 + all-in-one patch and rrdtool 1.2.15. It seems to work fine, although I am not seeing data presented in the way that I had expected it to look. I want to use the RRD stack to store values taken from the rrd for each host in a given group, then get an average of those stored values which is a single data point that represents the host group as a whole.
I've been experimenting with a version of the la1-multi definition, but I haven't gotten anything to work yet and I'm rather certain that my syntax is off in a few places. I thought I'd email it out anyway, in case I can get some pointers or have someone let me know that it won't work.
TITLE Farm CPU Utilization
YAXIS % Used
FNPATTERN vmstat.rrd
-u 100
-r
DEF:cpu_idl@RRDIDX@=@RRDFN@:cpu_idl:AVERAGE
CDEF:hostcpu@RRDIDX@=100,cpu_idl@RRDIDX@,-
# need a way to push each hostcpu@RRDIDX@ onto the stack
CDEF:cpuavgs=hostcpu@RRDIDX@,AVERAGE
# then get an average of all the values on the stack
# using something like COUNT as the num of items in the stack
CDEF:pbusy=cpuavgs1,cpuavgs2,cpuavgs3,COUNT,AVG
# graph the final data point
LINE2:pbusy#ccccff:%CPU
GPRINT:pbusy:LAST: \: %5.1lf (cur)
GPRINT:pbusy:MAX: \: %5.1lf (max)
GPRINT:pbusy:MIN: \: %5.1lf (min)
GPRINT:pbusy:AVERAGE: \: %5.1lf (avg)\n
-- Tom Georgoulias Systems Engineer McClatchy Interactive
Hello,
As far as I understand, you want to add up all the values for the hosts in your group and display a single graph instead of a stack of multiple graphs.
The @STACKIT@ stanza was not designed for that, but is designed to graph the relative contribution of each host in the resulting graph.
For your need, I see two different options:
1- You create an external script that stores the value in a separate rrd and graphs it.
2- You do the calculations each time you do the rendering, using rrdtool 1.2. There seem to be ways to do that in rrdtool 1.2, but it seems that you'll have to use the IF operator to populate your value for the first entry. I've seen examples somewhere but I don't remember where.
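A possible shape for option 1, as a sketch only: parse the text output of rrdtool fetch for each host, average the known values per timestamp, and produce timestamp:value strings that could be handed to rrdtool update on a separate, pre-created rrd. The function names and the sample data are invented; running rrdtool itself and creating the target rrd are left out.

```python
import math


def parse_fetch(text):
    """Parse `rrdtool fetch` text output into {timestamp: value of the first DS}."""
    rows = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # header line with DS names, or blank line
        stamp, _, rest = line.partition(":")
        fields = rest.split()
        if not stamp.strip().isdigit() or not fields:
            continue
        rows[int(stamp)] = float(fields[0])  # float() accepts "nan" and "NaN"
    return rows


def update_args(per_host_rows):
    """Build `timestamp:value` strings holding the average of the known host values."""
    stamps = sorted(set().union(*[rows.keys() for rows in per_host_rows]))
    out = []
    for stamp in stamps:
        known = [r[stamp] for r in per_host_rows
                 if stamp in r and not math.isnan(r[stamp])]
        if known:  # skip steps where no host has data
            out.append(f"{stamp}:{sum(known) / len(known):.4f}")
    return out


if __name__ == "__main__":
    # Illustrative fetch output, not real measurements.
    web1 = parse_fetch("   la\n\n1160640000: 1.2000000000e+02\n1160640300: nan\n")
    web2 = parse_fetch("   la\n\n1160640000: 8.0000000000e+01\n1160640300: 4.0000000000e+01\n")
    print(update_args([web1, web2]))
```

The trade-off with this option is that the aggregate is computed once and stored, so the group membership is frozen into the stored data.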
In case 2, I strongly recommend that you do all the tests manually (i.e. with rrdtool directly rather than by modifying hobbitgraph.cfg), as it is far easier to figure out what the problem is that way. Once you have a working solution, you can figure out how to adapt it to hobbit.
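For option 2, one way to experiment outside hobbit is to build the RPN expression and check it on a single time step before passing it to rrdtool. The sketch below does that with a tiny evaluator that understands only UN, IF, +, - and /. Both function names are made up, the evaluator is not rrdtool, and returning NaN on division by zero is a choice made in this sketch rather than a statement about what rrdtool does.

```python
import math


def average_rpn(names):
    """RPN for the mean of the known values among `names` (unknowns are skipped)."""
    if not names:
        raise ValueError("need at least one variable name")
    plus = ",+" * (len(names) - 1)
    total = ",".join(f"{n},UN,0,{n},IF" for n in names) + plus
    count = ",".join(f"{n},UN,0,1,IF" for n in names) + plus
    return f"{total},{count},/"


def eval_rpn(expr, values):
    """Evaluate one time step of a small RPN subset: UN, IF, +, -, /."""
    stack = []
    for tok in expr.split(","):
        if tok == "UN":
            stack.append(1.0 if math.isnan(stack.pop()) else 0.0)
        elif tok == "IF":
            c, b, a = stack.pop(), stack.pop(), stack.pop()
            stack.append(b if a != 0 else c)  # A,B,C,IF -> B if A is true, else C
        elif tok in ("+", "-", "/"):
            y, x = stack.pop(), stack.pop()
            if tok == "+":
                stack.append(x + y)
            elif tok == "-":
                stack.append(x - y)
            else:
                stack.append(x / y if y != 0 else math.nan)
        elif tok in values:
            stack.append(values[tok])
        else:
            stack.append(float(tok))
    return stack.pop()


if __name__ == "__main__":
    expr = average_rpn(["hostcpu0", "hostcpu1", "hostcpu2"])
    print("CDEF:pbusy=" + expr)
    print(eval_rpn(expr, {"hostcpu0": 10.0, "hostcpu1": math.nan, "hostcpu2": 30.0}))
```

The generated string grows with the number of hosts, which is one reason a template with a single @RRDIDX@ placeholder per line cannot express it directly.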
Cheers, Gildas
participants (7)
- bgmilne@staff.telkomsa.net
- cgoyard@cvf.fr
- gn1@sanger.ac.uk
- henrik@hswn.dk
- JasonAS_Jones@mentor.com
- jonescr@cisco.com
- tomg@mcclatchyinteractive.com