On Fri, Jan 19, 2007 at 10:55:05AM +0100, Charles Goyard wrote:
Gildas Le Nadan wrote :
The TRACKMAX feature is really interesting as the max values get "diluted" in the week/month/year views.
Alas, it seems to only work for NCV values at the moment (and I wish I could use it for cpu/memory/network/disk).
Charles, Henrik, do you think this is feasible?
I agree it would be nice.
It would. Only problem is that it would increase the size of the RRD files (perhaps not a big issue), and it would only have effect on new RRD files that are created, not any existing ones. So to add this to an existing setup, you would have to dump all of your RRD files, then create new ones, and import the data from the dump. Not sure how much work would be involved in automating this.
However, there have also been some requests to increase the granularity of the stored data, e.g. to keep 7 days' worth of 5-minute averages as opposed to the current 2 days. And likewise for the other RRA's.
(For those not familiar with how RRDtool works, the RRA's define how each measurement gets averaged over time. Hobbit uses 4 RRA's in each RRD file - one RRA tracks data (almost) without averaging it, using the 5-minute interval; the next averages across 6 measurements - 30 minutes; the third averages across 24 measurements - 2 hours; the last averages across 288 measurements - 1 day. For each of these 4 groups of data-averages, the RRD file contains 576 values. So the first RRA covers 576*5 minutes=2 days, the second covers 12 days, the third 48 days and the last 576 days. When Hobbit generates a graph from the RRD data, RRDtool will automatically grab the data for the graph from the best group of data which has data for the period requested.)
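To make the coverage arithmetic concrete, here is a minimal sketch in C. The struct and function names are made up for illustration and are not part of Hobbit; the 0.5 in the printed line is an assumed xfiles factor, shown only to illustrate rrdtool's RRA:CF:xff:steps:rows layout.

```c
#include <stdio.h>

#define BASE_STEP_SECONDS 300L   /* the 5-minute interval */

/* One RRA: how many 5-minute measurements are averaged into each stored
 * value (steps), and how many values are kept (rows). Hypothetical type. */
struct rra_sketch { long steps; long rows; };

/* The four RRAs described above: 1, 6, 24 and 288 steps, 576 rows each. */
static const struct rra_sketch current_rras[4] = {
    { 1, 576 }, { 6, 576 }, { 24, 576 }, { 288, 576 }
};

/* Time span covered by one RRA, in whole days. */
static long rra_coverage_days(struct rra_sketch rra)
{
    return rra.steps * rra.rows * BASE_STEP_SECONDS / 86400L;
}

/* Print one line per RRA in rrdtool's RRA:CF:xff:steps:rows layout
 * (the 0.5 xff value is an assumption). */
static void print_current_rras(void)
{
    for (int i = 0; i < 4; i++) {
        printf("RRA:AVERAGE:0.5:%ld:%ld covers %ld days\n",
               current_rras[i].steps, current_rras[i].rows,
               rra_coverage_days(current_rras[i]));
    }
}
```

Calling print_current_rras() lists the four archives with 2, 12, 48 and 576 days of coverage, matching the figures above.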
So if we're going to add MAX/MIN tracking to the RRD-files, we might as well do it at the same time that we change the granularity. The numbers I've been thinking of are to keep
- 30 days of 5-minute averages
- 90 days of 15-minute averages
- 360 days of 1-hour averages
- 1080 days of 3-hour averages
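As a rough sketch of what these numbers would mean as RRA definitions: each entry translates to a steps value (how many 5-minute measurements go into one stored value) and a rows value (how many values are kept). The helper names below are hypothetical, and the 0.5 xfiles factor is an assumption; only the RRA:CF:xff:steps:rows layout and the AVERAGE/MIN/MAX consolidation functions come from rrdtool itself.

```c
#include <stdio.h>

#define BASE_STEP_MINUTES 5L   /* the 5-minute update interval */

/* One line of the proposed retention list. Hypothetical type. */
struct retention_sketch { long days; long avg_minutes; };

static const struct retention_sketch proposed[4] = {
    { 30, 5 }, { 90, 15 }, { 360, 60 }, { 1080, 180 }
};

/* Number of 5-minute measurements consolidated into one stored value. */
static long rra_steps(struct retention_sketch r)
{
    return r.avg_minutes / BASE_STEP_MINUTES;
}

/* Number of stored values needed to cover r.days. */
static long rra_rows(struct retention_sketch r)
{
    return r.days * 24L * 60L / r.avg_minutes;
}

/* Format one RRA definition; cf is "AVERAGE", "MIN" or "MAX".
 * The 0.5 xfiles factor is an assumption. Returns snprintf's result. */
static int format_rra(char *buf, size_t buflen, const char *cf,
                      struct retention_sketch r)
{
    return snprintf(buf, buflen, "RRA:%s:0.5:%ld:%ld",
                    cf, rra_steps(r), rra_rows(r));
}
```

All four entries work out to 8640 rows each, with 1, 3, 12 and 36 steps. Tracking MIN and MAX as well would mean emitting each definition three times, once per consolidation function.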
That alone would cause the size of the RRD files to increase 15 times. Adding MIN+MAX tracking would mean tripling the size. An "average" host in my setup uses ~400 KB of diskspace for RRD-files, so increasing that 15x3 times means it would grow to ~16 MB per host. It's a significant increase (I'd have to get more disk space for my production systems to handle that), but disks are getting bigger and cheaper - and I'd still be storing data for 4000 hosts on less than 60 GB.
A simple RRD file would grow from ~19 KB to ~855 KB.
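The growth factors can be checked with a few lines of C. This is only a back-of-the-envelope sketch: it assumes file size scales with the number of stored rows and ignores the fixed RRD header, and the names are made up for illustration. The proposed scheme keeps 8640 values per RRA (30 days at 288 five-minute values per day, and the same count for the three coarser RRAs), against 576 today.

```c
/* Rows stored today (4 RRAs x 576) and in the proposed scheme (4 x 8640). */
#define CURRENT_ROWS  (4L * 576L)
#define PROPOSED_ROWS (4L * 8640L)

/* Growth from keeping more rows, ignoring the fixed RRD header. */
static long granularity_factor(void)
{
    return PROPOSED_ROWS / CURRENT_ROWS;
}

/* Estimated size in KB after the change, with n_cf consolidation
 * functions per RRA (1 = AVERAGE only, 3 = AVERAGE + MIN + MAX). */
static long estimated_kb(long current_kb, long n_cf)
{
    return current_kb * granularity_factor() * n_cf;
}
```

granularity_factor() gives 15, and estimated_kb(19, 3) gives 855, i.e. the 19 KB file above growing by 15x3.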
I don't think there would be much of a performance hit from this. The RRD update spends most of its time opening and locking the file, whereas the actual data-update doesn't take long.
I have a question for Henrik prior to implementing it. The RRAs are added individually in each /rrd/*.c backend, and they get calculated for every status message coming in.
Actually, they don't. All of the rrd/*.c files use logic like this:
    static char *la_params[] = { "rrdcreate", rrdfn, "DS:la:GAUGE:600:0:U", rra1, rra2, rra3, rra4, NULL };
    static char *la_tpl = NULL;

    if (la_tpl == NULL) la_tpl = setup_template(la_params);
The keyword here is the declaration of the variables as "static".
The "la_params[]" is a static table, so it is initialized at compile-time with the static values. "la_tpl" is the RRD "template", which is basically a list of the dataset names in the order in which Hobbit feeds in the data values. This template is only calculated the first time this type of RRD file is updated - that's what the "if (la_tpl == NULL) ... " does.
Inside the create_and_update_rrd() function, the "la_params" with the RRA's are only used when creating a new RRD.
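A self-contained sketch of that "build once, reuse afterwards" pattern follows. sketch_setup_template() is a hypothetical stand-in and not Hobbit's actual setup_template(): it simply joins the names of the "DS:name:..." entries with ':'. The file name, the single RRA string and the template_builds counter are likewise assumptions, added only so the caching can be observed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Counts how often the template is built, only to make the caching visible. */
static int template_builds = 0;

/* Hypothetical stand-in for setup_template(): joins the names of all
 * "DS:name:..." entries with ':' (e.g. "la" or "in:out"). */
static char *sketch_setup_template(char *params[])
{
    size_t len = 0;
    char *result = NULL;

    template_builds++;
    for (int i = 0; params[i] != NULL; i++) {
        if (strncmp(params[i], "DS:", 3) != 0) continue;
        const char *name = params[i] + 3;
        const char *end = strchr(name, ':');
        if (end == NULL) continue;
        size_t nlen = (size_t)(end - name);
        char *grown = realloc(result, len + nlen + 2); /* ':' + name + NUL */
        if (grown == NULL) { free(result); return NULL; }
        result = grown;
        if (len > 0) result[len++] = ':';
        memcpy(result + len, name, nlen);
        len += nlen;
        result[len] = '\0';
    }
    return result;
}

/* Mirrors the pattern in the message: static storage keeps la_tpl alive
 * between calls, so the template is built on the first call only. */
static const char *sketch_la_template(void)
{
    static char rrdfn[] = "la.rrd";              /* assumed file name */
    static char *la_params[] = { "rrdcreate", rrdfn, "DS:la:GAUGE:600:0:U",
                                 "RRA:AVERAGE:0.5:1:576", NULL };
    static char *la_tpl = NULL;

    if (la_tpl == NULL) la_tpl = sketch_setup_template(la_params);
    return la_tpl;
}
```

Calling sketch_la_template() any number of times returns the same "la" string and leaves template_builds at 1, which is the behaviour the "static" declarations buy.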
Regards, Henrik