[hobbit] some larrd issues on hobbit 4.0.3 rc1
These vmstat rrds are from back on larrd 42; just after the change to accumulate cpu wait. So I'm trying the vmstat recreate to see if the definitions I've got are severely non-standard (I'd almost bet on it)
If so, I'll see what I need to hack to keep my current data . . .
On the disk space rrds -- this is a lot of wasted activity for us; we have about 8 filesystems we care about, and my production R3 DB server currently has 95 filesystems that have been 100% full since creation -- and we add another 10 every 13 months (150 GB -- SAP just *eats* disk).
If I can't suppress the creation, I'd at least like to suppress the display of the graphs -- they're meaningless noise in our shop.
Not a showstopper for going live Monday, though -- If I can figure out the vmstat issue.
Tom Kauffman NIBCO
-----Original Message-----
From: Henrik Stoerner [mailto:henrik at hswn.dk]
Sent: Friday, April 29, 2005 4:41 PM
To: hobbit at hswn.dk
Subject: Re: [hobbit] some larrd issues on hobbit 4.0.3 rc1
On Fri, Apr 29, 2005 at 01:27:49PM -0500, Kauffman, Tom wrote:
OK -- I chickened out and haven't thrown the Big Red Switch yet -- but I have hobbit running (display functions only) on my failover system and most everything looks good -- except (there's that ugly word) my vmstat graphs for my AIX systems.
Something is quite wrong, and I'm not sure what to look at.
Here's what the vmstat bottom feeder ships out:
aix 1 3 2342710 511 0 1 1 2249 14879 0 1964 11870 4215 15 11 31 42
so cpu_usr is 15, cpu_sys is 11, cpu_idl is 31, and cpu_wait is 42.
But the vmstat graph is giving me a system of 0.0, user 1670.0, and idle of 2347945.2.
I've looked over the AIX setup for vmstat, and it should parse this output from the bottomfeeder OK.
Is this RRD file one that you have copied over from the BB/LARRD setup? If yes, do the graphs make more sense if you delete ~hobbit/data/rrd/HOSTNAME/vmstat.rrd and have Hobbit re-create the file (it does that automatically)?
I'd like to have a look at that vmstat.rrd file to see if the problem is one of different data-set definitions in the Hobbit vs. LARRD version. I know the vmstat definitions are different on some systems, but I cannot remember if I changed them for AIX also.
Also -- I only get the vmstat graph for AIX - I'm missing vmstat0, vmstat2, vmstat3, and vmstat8. Where do I enable the graphing for these?
The data for them are being tracked, but by default only one vmstat graph is shown. Add "LARRD:*,vmstat:vmstat|vmstat0|vmstat2|vmstat3|vmstat8" to the entries in the bb-hosts where you want these.
And while I'm at it -- I need to disable the tracking and displaying of disk filesystem usage data -- virtually all my filesystems contain Oracle tablespaces and they are 100% full at the OS level shortly after creation -- so I can't see any reason to track them.
There's no way to turn off tracking of the data for disk reports (unless you configure the client not to send them, of course).
Henrik
On Sat, Apr 30, 2005 at 06:22:56PM -0500, Kauffman, Tom wrote:
These vmstat rrds are from back on larrd 42; just after the change to accumulate cpu wait. So I'm trying the vmstat recreate to see if the definitions I've got are severely non-standard (I'd almost bet on it)
Use "rrdtool dump FILENAME.rrd" to dump the old data into a text (XML) file. At the top of this file you'll find the data-set definitions that LARRD has set up in this RRD file; these come from the "aix" definition in the old LARRD vmstat-larrd.pl script. So there should be (in sequence): cpu_r, cpu_b, mem_avm, mem_free, mem_re, mem_pi, mem_po, mem_fr, sr, mem_cy, cpu_int, cpu_syc, cpu_csw, cpu_usr, cpu_sys, cpu_idl, cpu_wait - at least, that's what Hobbit would generate, and therefore it assumes this layout when updating the RRD file.
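To illustrate the expected layout, here is a minimal Python sketch that maps the sample AIX line from earlier in this thread onto the dataset names listed above. It assumes the client columns arrive in exactly that sequence; the function name parse_vmstat is made up for this example, and this is not how Hobbit itself parses the data.

```python
# Dataset names in the sequence Hobbit is said to expect for AIX.
AIX_FIELDS = [
    "cpu_r", "cpu_b", "mem_avm", "mem_free", "mem_re", "mem_pi",
    "mem_po", "mem_fr", "sr", "mem_cy", "cpu_int", "cpu_syc",
    "cpu_csw", "cpu_usr", "cpu_sys", "cpu_idl", "cpu_wait",
]

def parse_vmstat(line):
    """Map one 'aix N N N ...' line onto the dataset names (hypothetical helper)."""
    parts = line.split()
    numbers = parts[1:]  # parts[0] is the OS tag, e.g. "aix"
    if len(numbers) != len(AIX_FIELDS):
        raise ValueError("expected %d values, got %d" % (len(AIX_FIELDS), len(numbers)))
    return dict(zip(AIX_FIELDS, (int(n) for n in numbers)))

sample = "aix 1 3 2342710 511 0 1 1 2249 14879 0 1964 11870 4215 15 11 31 42"
values = parse_vmstat(sample)
print(values["cpu_usr"], values["cpu_sys"], values["cpu_idl"], values["cpu_wait"])
```

If the names printed by "rrdtool dump" appear in a different order than AIX_FIELDS, that difference is the likely cause of the odd graph values.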
Since the files are being updated by Hobbit, but the data collected is wrong, my guess is that you have these in a different sequence than Hobbit expects.
There are two ways of tackling that problem.
One way is to change the Hobbit layout to match your current RRD files. This layout is defined in the hobbit-4.0.3rc1/hobbitd/larrd/do_vmstat.c file - just look for "aix" and you'll see it. Only problem with this is that you'll need to repeat this change whenever you upgrade Hobbit.
The other way is to modify the dumped RRD-file, then use "rrdtool restore" to convert the modified XML-file back to an RRD file.
You need to change the sequence of the dataset definitions at the beginning of the file, and also change each of the data "rows" that make up the bulk of the file. These look like this:
<!-- 2005-05-01 02:00:00 CEST / 1114905600 --><row><v> 1.5896990741e-01 </v><v>2.1686840278e+02 </v><v> 9.5610891204e+01 </v><v> 3.5725331019e+02</v><v> 1.0420138889e-01 </v><v>8.3974537037e-01 </v><v> 3.3892245370e+00</v><v> 3.3494723380e+02 </v><v>9.9369259259e+01 </v><v> 1.0934771532e+05</v><v> 3.8053798435e+05 </v><v> 8.1690244444e+03 </v><v> 2.7122800926e+00 </v><v>1.2084837963e+00 </v><v> 2.1852577870e+05</v></row>
Each "<v> VALUE </v>" entry appears in the sequence in which the datasets are defined. So you must swap the values around to match the new layout.
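Swapping the values by hand is error-prone, so here is a rough Python sketch of reordering one dumped row. The helper name reorder_row and the three-column example are invented for illustration; you would pass in the real old and new dataset sequences, and the dataset definitions at the top of the XML file still need to be reordered separately.

```python
import re

# Matches one "<v> VALUE </v>" cell and captures the value text.
V_RE = re.compile(r"<v>\s*([^<\s]+)\s*</v>")

def reorder_row(row, old_order, new_order):
    """Return the row with its <v> cells rearranged from old_order to new_order."""
    if sorted(old_order) != sorted(new_order):
        raise ValueError("old and new layouts must contain the same dataset names")
    values = V_RE.findall(row)
    if len(values) != len(old_order):
        raise ValueError("row has %d values, layout has %d" % (len(values), len(old_order)))
    by_name = dict(zip(old_order, values))
    cells = "".join("<v> %s </v>" % by_name[name] for name in new_order)
    start = row.index("<row>")
    end = row.index("</row>")
    # Keep the timestamp comment before <row> and the closing tag untouched.
    return row[:start] + "<row>" + cells + row[end:]

# Tiny made-up example with three datasets.
row = "<!-- t --><row><v> 1.0 </v><v> 2.0 </v><v> 3.0 </v></row>"
fixed = reorder_row(row, ["a", "b", "c"], ["c", "a", "b"])
print(fixed)
```

Run something like this over every line containing "<row>" in a copy of the dump, and keep the original RRD file until the restored one has been checked.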
On the disk space rrds -- this is a lot of wasted activity for us; we have about 8 filesystems we care about, and my production R3 DB server currently has 95 filesystems that have been 100% full since creation -- and we add another 10 every 13 months (150 GB -- SAP just *eats* disk).
I see - perhaps something like the attached patch could be used. With this, you can set up two environment variables holding regexp patterns that each filesystem name is matched against before it gets graphed. NORRDDISKS is an "exclude" pattern: any filesystem name matching it does not get a graph. RRDDISKS is an "include" pattern: only filesystem names matching it get graphed. You can use neither of them (the current behaviour), one of them, or both.
E.g. if all of your SAP filesystems are mounted below "/sap", you would just put NORRDDISKS="^/sap" in hobbitserver.cfg, and they won't get graphed.
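The effect of the two patterns can be sketched in Python as below. This only illustrates the include/exclude idea and is not the patch itself: the function name should_graph is made up, checking the exclude pattern first when both are set is an assumption, and Python's re module is not identical to the regexp library Hobbit uses.

```python
import re

def should_graph(fsname, include=None, exclude=None):
    """Decide whether a filesystem gets a graph (illustrative only)."""
    # Exclude pattern (the NORRDDISKS role): a match suppresses the graph.
    if exclude and re.search(exclude, fsname):
        return False
    # Include pattern (the RRDDISKS role): if set, only matches are graphed.
    if include and not re.search(include, fsname):
        return False
    # Neither pattern set, or the name passed both checks.
    return True

for name in ["/sap/data1", "/usr", "/home"]:
    print(name, should_graph(name, exclude="^/sap"))
```

With only the exclude pattern "^/sap" set, /sap/data1 is skipped while /usr and /home are still graphed.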
This doesn't affect any of the RRD files that have already been created, so you must manually clean out the unwanted disk*.rrd files from the ~hobbit/data/rrd/HOSTNAME/ directory to get rid of the graphs you don't want.
Henrik
On Sun, May 01, 2005 at 09:11:42AM +0200, Henrik Stoerner wrote:
One way is to change the Hobbit layout to match your current RRD files. This layout is defined in the hobbit-4.0.3rc1/hobbitd/larrd/do_vmstat.c file - just look for "aix" and you'll see it.
If you do use this method, just shuffle the lines around in the "aix" definition - don't change the numbers, because they are used to parse the vmstat output the client sends. And keep the "-1, NULL" line last.
-- Henrik Storner