[hobbit] Some more clear RRD duplicating-causing samples
Buchan- you should really capture the incoming data segment while simultaneously watching the rrd-data.log file; that way you can 100% let Henrik know which data chunks are causing RRD issues. In my case, I only had to "watch" for 5 minutes to capture all the instances of duplicates happening.

---------------------------------------------------------
Kent C. Brodie - brodie at phys.mcw.edu
Department of Physiology
Medical College of Wisconsin
(414) 456-8590

-----Original Message-----
From: Buchan Milne [mailto:bgmilne at staff.telkomsa.net]
Sent: Thursday, August 03, 2006 9:46 AM
To: hobbit at hswn.dk
Subject: Re: [hobbit] Some more clear RRD duplicating-causing samples

On Thursday 03 August 2006 16:28, Brodie, Kent wrote:
Henrik, here you go: Two more netstat data chunks that cause the Duplication error (parsing issue?) in rrd/hobbit.
My simple observation is that the one thing these two hosts have in common is virtual network interfaces, so instead of just simple items like eth0, eth1 and so on, you have eth0, eth0:0, and so forth. Linux in this case, but that would apply to Solaris and so on as well. Just a hunch.
@@data#98185|1154614736.024527|192.168.224.202||dunn.hmgc.mcw.edu|ifstat
data dunn,hmgc,mcw,edu.ifstat linux eth0 Link encap:Ethernet HWaddr 00:09:3D:13:DC:AB inet addr:192.168.224.105 Bcast:192.168.224.255 Mask:255.255.255.0 inet6 addr: fe80::209:3dff:fe13:dcab/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:40318794 errors:0 dropped:0 overruns:0 frame:0 TX packets:15352953 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:2942042461 (2.7 GiB) TX bytes:21803054108 (20.3 GiB) Interrupt:185
eth0:0 Link encap:Ethernet HWaddr 00:09:3D:13:DC:AB inet addr:192.168.224.107 Bcast:192.168.224.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 Interrupt:185
eth0:1 Link encap:Ethernet HWaddr 00:09:3D:13:DC:AB inet addr:192.168.224.160 Bcast:192.168.224.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 Interrupt:185
lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:37737 errors:0 dropped:0 overruns:0 frame:0 TX packets:37737 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:6042388 (5.7 MiB) TX bytes:6042388 (5.7 MiB)
@@
@@data#98366|1154614859.203728|192.168.224.202||clifford.hmgc.mcw.edu|ifstat
data clifford,hmgc,mcw,edu.ifstat linux eth0 Link encap:Ethernet HWaddr 00:09:3D:13:C8:78 inet addr:192.168.224.104 Bcast:192.168.224.255 Mask:255.255.255.0 inet6 addr: fe80::209:3dff:fe13:c878/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:146220628 errors:0 dropped:0 overruns:0 frame:0 TX packets:42544430 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:14048692258 (13.0 GiB) TX bytes:57506442876 (53.5 GiB) Interrupt:185
eth0:0 Link encap:Ethernet HWaddr 00:09:3D:13:C8:78 inet addr:192.168.224.106 Bcast:192.168.224.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 Interrupt:185
eth0:1 Link encap:Ethernet HWaddr 00:09:3D:13:C8:78 inet addr:192.168.224.159 Bcast:192.168.224.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 Interrupt:185
lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:1189289 errors:0 dropped:0 overruns:0 frame:0 TX packets:1189289 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:347179798 (331.0 MiB) TX bytes:347179798 (331.0 MiB)
@@
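One way to look at the hunch above is to list which interface names in an ifstat chunk actually carry byte counters. The following is a minimal Python sketch, not the hobbitd_rrd parser: the function names `parse_ifstat` and `prefix_matches` are made up for illustration, and the sample is trimmed from the dunn.hmgc.mcw.edu chunk. It shows that the alias interfaces (eth0:0, eth0:1) report no RX/TX byte counts, and that any matching done on a name prefix would treat eth0, eth0:0 and eth0:1 as the same interface. Whether that is what triggers the duplicate error is not established in this thread.

```python
import re

# Trimmed from the dunn.hmgc.mcw.edu ifstat chunk; one interface per line.
SAMPLE = """\
data dunn,hmgc,mcw,edu.ifstat linux eth0 Link encap:Ethernet HWaddr 00:09:3D:13:DC:AB RX bytes:2942042461 (2.7 GiB) TX bytes:21803054108 (20.3 GiB) Interrupt:185
eth0:0 Link encap:Ethernet HWaddr 00:09:3D:13:DC:AB inet addr:192.168.224.107 Interrupt:185
eth0:1 Link encap:Ethernet HWaddr 00:09:3D:13:DC:AB inet addr:192.168.224.160 Interrupt:185
lo Link encap:Local Loopback RX bytes:6042388 (5.7 MiB) TX bytes:6042388 (5.7 MiB)
"""

# The interface name is the token immediately before "Link encap:".
IFACE_RE = re.compile(r"(\S+)\s+Link encap:")
BYTES_RE = re.compile(r"RX bytes:(\d+) .*?TX bytes:(\d+)")


def parse_ifstat(text):
    """Map each interface name to (rx_bytes, tx_bytes), or None if absent."""
    result = {}
    for line in text.splitlines():
        name = IFACE_RE.search(line)
        if name is None:
            continue
        counters = BYTES_RE.search(line)
        if counters is None:
            result[name.group(1)] = None  # alias: no counters reported
        else:
            result[name.group(1)] = (int(counters.group(1)),
                                     int(counters.group(2)))
    return result


def prefix_matches(names, prefix):
    """Names that a naive prefix comparison would lump together."""
    return [n for n in names if n.startswith(prefix)]


if __name__ == "__main__":
    parsed = parse_ifstat(SAMPLE)
    for iface, counters in parsed.items():
        print(iface, counters)
    print(prefix_matches(parsed, "eth0"))
```

Skipping interfaces whose counters are `None` would be one conservative way to handle aliases, since they share the physical device's statistics anyway.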
Hmm, now that I think about this more carefully, this issue could be affecting me as well, resulting in some gaps in various rrd files (which I first thought might be caused by running everything via bbproxy and a rather complex setup).

In our environment we typically use bonding/teaming (e.g. with eth0 connected to one switch, eth1 connected to another, with the two interfaces set up in passive failover, appearing as one "bond0" device). Then, to ensure access to the right networks, we have VLANs (e.g. bond0.4, bond0.54, bond0.106) on top of the bonded interface. In some cases (HA setups), there are virtual interfaces (e.g. bond0.4:0) on top of the bonded interface.

I've included sample ifstat contents (though not necessarily from a time when I've seen gaps in the rrds):

[ifstat]
bond0 Link encap:Ethernet HWaddr 00:16:35:7E:8D:59 inet6 addr: fe80::216:35ff:fe7e:8d59/64 Scope:Link UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 RX packets:1353358109 errors:0 dropped:0 overruns:0 frame:0 TX packets:1414161412 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:3466938915 (3.2 GiB) TX bytes:1361756811 (1.2 GiB)
bond0.4 Link encap:Ethernet HWaddr 00:16:35:7E:8D:59 inet addr:192.168.211.41 Bcast:192.168.211.255 Mask:255.255.255.0 inet6 addr: fe80::216:35ff:fe7e:8d59/64 Scope:Link UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 RX packets:1353172280 errors:0 dropped:0 overruns:0 frame:0 TX packets:1414125729 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:3747147076 (3.4 GiB) TX bytes:2930351274 (2.7 GiB)
bond0.4:0 Link encap:Ethernet HWaddr 00:16:35:7E:8D:59 inet addr:192.168.211.203 Bcast:192.168.211.255 Mask:255.255.255.0 UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
bond0.54 Link encap:Ethernet HWaddr 00:16:35:7E:8D:59 inet addr:192.168.12.18 Bcast:192.168.15.255 Mask:255.255.252.0 inet6 addr: fe80::216:35ff:fe7e:8d59/64 Scope:Link UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 RX packets:15562 errors:0 dropped:0 overruns:0 frame:0 TX packets:220 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:952935 (930.6 KiB) TX bytes:9408 (9.1 KiB)
bond0.106 Link encap:Ethernet HWaddr 00:16:35:7E:8D:59 inet addr:192.168.106.68 Bcast:192.168.106.255 Mask:255.255.255.0 inet6 addr: fe80::216:35ff:fe7e:8d59/64 Scope:Link UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 RX packets:173127 errors:0 dropped:0 overruns:0 frame:0 TX packets:38287 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:10379598 (9.8 MiB) TX bytes:3224568 (3.0 MiB)
eth0 Link encap:Ethernet HWaddr 00:16:35:7E:8D:59 inet6 addr: fe80::216:35ff:fe7e:8d59/64 Scope:Link UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1 RX packets:1353358109 errors:0 dropped:0 overruns:0 frame:0 TX packets:1414161412 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:3466938915 (3.2 GiB) TX bytes:1361756811 (1.2 GiB) Interrupt:201
eth1 Link encap:Ethernet HWaddr 00:16:35:7E:8D:59 inet6 addr: fe80::216:35ff:fe7e:8d59/64 Scope:Link UP BROADCAST NOARP SLAVE MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) Interrupt:209
lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:1668785 errors:0 dropped:0 overruns:0 frame:0 TX packets:1668785 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:720576454 (687.1 MiB) TX bytes:720576454 (687.1 MiB)

Regards,
Buchan
--
Buchan Milne
ISP Systems Specialist
B.Eng,RHCE(803004789010797),LPIC-2(LPI000074592)
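The layered naming described in this message (a base device such as bond0, a VLAN written as bond0.4, and an alias written as bond0.4:0) can be taken apart mechanically. The Python sketch below is only an illustration under that assumption: the regular expression and the `classify` helper are hypothetical, not anything from Hobbit. It splits a name into its base, VLAN and alias parts, which makes it easy to see how many distinct names share one underlying device.

```python
import re

# Assumed pattern: base device, optional ".<vlan>", optional ":<alias>".
NAME_RE = re.compile(
    r"^(?P<base>[A-Za-z]+\d+)(?:\.(?P<vlan>\d+))?(?::(?P<alias>\d+))?$"
)


def classify(name):
    """Split an interface name into base device, VLAN id and alias number."""
    m = NAME_RE.match(name)
    if m is None:
        # Names such as "lo" have no numeric suffix; keep them unchanged.
        return {"base": name, "vlan": None, "alias": None}
    vlan = m.group("vlan")
    alias = m.group("alias")
    return {
        "base": m.group("base"),
        "vlan": int(vlan) if vlan is not None else None,
        "alias": int(alias) if alias is not None else None,
    }


if __name__ == "__main__":
    for iface in ["bond0", "bond0.4", "bond0.4:0", "bond0.106", "eth0:1", "lo"]:
        print(iface, classify(iface))
```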
On Thursday 03 August 2006 18:12, Brodie, Kent wrote:
Buchan- you should really capture the incoming data segment while simultaneously watching the rrd-data.log file; that way you can 100% let Henrik know which data chunks are causing RRD issues.
@@data#579958|1154700491.686221|10.10.9.32||shemali.telkomsa.net|proccounts
data shemali,telkomsa,net.proccounts cron:1
@@
@@data#579959|1154700491.686591|10.10.9.32||shemali.telkomsa.net|portcounts
data shemali,telkomsa,net.portcounts ssh:0 ldapclient:7 hobbitclient:0
@@
@@data#579960|1154700491.686716|10.10.9.32||shemali.telkomsa.net|netstat
data shemali,telkomsa,net.netstat linux Ip: 934146872 total packets received 0 forwarded 0 incoming packets discarded 934079142 incoming packets delivered 1014866720 requests sent out
Icmp: 2857 ICMP messages received 0 input ICMP message failed. ICMP input histogram: destination unreachable: 79 echo requests: 2778 2856 ICMP messages sent 0 ICMP messages failed ICMP output histogram: destination unreachable: 78 echo replies: 2778
Tcp: 15978 active connections openings 12472 passive connection openings 0 failed connection attempts 236 connection resets received 14 connections established 933371889 segments received 1014788334 segments send out 662519 segments retransmited 0 bad segments received. 24 resets sent
Udp: 686592 packets received 78 packets to unknown port received. 0 packet receive errors 75535 packets sent
TcpExt: 1 invalid SYN cookies received 116 packets pruned from receive queue because of socket buffer overrun ArpFilter: 0 19067 TCP sockets finished time wait in fast timer 7114110 delayed acks sent 5809 delayed acks further delayed because of locked socket Quick ack mode was activated 708 times 66719 packets directly queued to recvmsg prequeue. 1788 packets directly received from backlog 3286186 packets directly received from prequeue 752163521 packets header predicted 9728 packets header predicted and directly queued to user TCPPureAcks: 3672858 TCPHPAcks: 746701951 TCPRenoRecovery: 0 TCPSackRecovery: 125111 TCPSACKReneging: 0 TCPFACKReorder: 0 TCPSACKReorder: 0 TCPRenoReorder: 0 TCPTSReorder: 1 TCPFullUndo: 1 TCPPartialUndo: 9 TCPDSACKUndo: 0 TCPLossUndo: 381 TCPLoss: 505690 TCPLostRetransmit: 494 TCPRenoFailures: 0 TCPSackFailures: 1472 TCPLossFailures: 0 TCPFastRetrans: 519174 TCPForwardRetrans: 132661 TCPSlowStartRetrans: 2590 TCPTimeouts: 2034 TCPRenoRecoveryFail: 0 TCPSackRecoveryFail: 4542 TCPSchedulerFailed: 5 TCPRcvCollapsed: 13111 TCPDSACKOldSent: 708 TCPDSACKOfoSent: 0 TCPDSACKRecv: 1103 TCPDSACKOfoRecv: 0 TCPAbortOnSyn: 0 TCPAbortOnData: 5 TCPAbortOnClose: 10 TCPAbortOnMemory: 0 TCPAbortOnTimeout: 2 TCPAbortOnLinger: 0 TCPAbortFailed: 0 TCPMemoryPressures: 0
@@
@@data#579961|1154700491.687361|10.10.9.32||shemali.telkomsa.net|ifstat
data shemali,telkomsa,net.ifstat linux bond0 Link encap:Ethernet HWaddr 00:12:79:D7:D7:1B inet6 addr: fe80::212:79ff:fed7:d71b/64 Scope:Link UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 RX packets:935050018 errors:0 dropped:0 overruns:0 frame:0 TX packets:1014881892 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:815414722 (777.6 MiB) TX bytes:3419943727 (3.1 GiB)
bond0.4 Link encap:Ethernet HWaddr 00:12:79:D7:D7:1B inet addr:192.168.211.40 Bcast:192.168.211.255 Mask:255.255.255.0 inet6 addr: fe80::212:79ff:fed7:d71b/64 Scope:Link UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 RX packets:934813237 errors:0 dropped:0 overruns:0 frame:0 TX packets:1014856688 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:1705163146 (1.5 GiB) TX bytes:3889683012 (3.6 GiB)
bond0.4:4 Link encap:Ethernet HWaddr 00:12:79:D7:D7:1B inet addr:192.168.211.204 Bcast:192.168.211.255 Mask:255.255.255.0 UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
bond0.54 Link encap:Ethernet HWaddr 00:12:79:D7:D7:1B inet addr:192.168.12.17 Bcast:192.168.15.255 Mask:255.255.252.0 inet6 addr: fe80::212:79ff:fed7:d71b/64 Scope:Link UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 RX packets:25654 errors:0 dropped:0 overruns:0 frame:0 TX packets:55 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2296065 (2.1 MiB) TX bytes:2478 (2.4 KiB)
bond0.106 Link encap:Ethernet HWaddr 00:12:79:D7:D7:1B inet addr:192.168.106.69 Bcast:192.168.106.255 Mask:255.255.255.0 inet6 addr: fe80::212:79ff:fed7:d71b/64 Scope:Link UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 RX packets:211968 errors:0 dropped:0 overruns:0 frame:0 TX packets:27252 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:11974449 (11.4 MiB) TX bytes:2811952 (2.6 MiB)
eth0 Link encap:Ethernet HWaddr 00:12:79:D7:D7:1B inet6 addr: fe80::212:79ff:fed7:d71b/64 Scope:Link UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1 RX packets:934187736 errors:0 dropped:0 overruns:0 frame:0 TX packets:1014881888 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:751314594 (716.5 MiB) TX bytes:3419943441 (3.1 GiB) Interrupt:201
eth1 Link encap:Ethernet HWaddr 00:12:79:D7:D7:1B inet6 addr: fe80::212:79ff:fed7:d71b/64 Scope:Link UP BROADCAST RUNNING NOARP SLAVE MULTICAST MTU:1500 Metric:1 RX packets:862282 errors:0 dropped:0 overruns:0 frame:0 TX packets:4 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:64100128 (61.1 MiB) TX bytes:286 (286.0 b) Interrupt:209
lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:216 errors:0 dropped:0 overruns:0 frame:0 TX packets:216 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:71610 (69.9 KiB) TX bytes:71610 (69.9 KiB)
@@
@@data#579962|1154700491.688080|10.10.9.32||shemali.telkomsa.net|vmstat
data shemali,telkomsa,net.vmstat linux 0 2 1200 788904 1772 774328 0 0 3009 2047 8106 22738 0 18 55 27
@@

Regards,
Buchan
--
Buchan Milne
ISP Systems Specialist
B.Eng,RHCE(803004789010797),LPIC-2(LPI000074592)
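When a capture like this is sent to the list, it helps to cut it into individual messages so that only the chunk matching a logged error needs to be posted. The following Python sketch makes some assumptions: each message starts with a header line beginning `@@data#` and ends with a line containing only `@@`, as in the ifstat samples earlier in this thread, and the header field meanings (sequence number, timestamp, sender, host, test) are inferred from those samples rather than taken from Hobbit documentation. The `split_messages` helper is hypothetical.

```python
# Two messages from the capture above, in header / body / "@@" layout.
CAPTURE = """\
@@data#579958|1154700491.686221|10.10.9.32||shemali.telkomsa.net|proccounts
data shemali,telkomsa,net.proccounts cron:1
@@
@@data#579959|1154700491.686591|10.10.9.32||shemali.telkomsa.net|portcounts
data shemali,telkomsa,net.portcounts ssh:0 ldapclient:7 hobbitclient:0
@@
"""


def split_messages(stream):
    """Return a list of (header_dict, body_text) pairs from a capture."""
    messages = []
    header = None
    body = []
    for line in stream.splitlines():
        if line.startswith("@@data#"):
            fields = line.split("|")
            if len(fields) < 6:
                header = None  # truncated header: skip this message
                continue
            header = {
                "seq": int(fields[0][len("@@data#"):]),
                "timestamp": float(fields[1]),
                "sender": fields[2],
                "host": fields[4],
                "test": fields[5],
            }
            body = []
        elif line.strip() == "@@":
            if header is not None:
                messages.append((header, "\n".join(body)))
            header = None
        elif header is not None:
            body.append(line)
    return messages


if __name__ == "__main__":
    for hdr, text in split_messages(CAPTURE):
        print(hdr["seq"], hdr["host"], hdr["test"], len(text))
```

Filtering the result on `hdr["test"] == "ifstat"` and on the host named in the log would narrow a five-minute capture down to the few chunks worth posting.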
On Fri, Aug 04, 2006 at 04:14:11PM +0200, Buchan Milne wrote:
On Thursday 03 August 2006 18:12, Brodie, Kent wrote:
Buchan- you should really capture the incoming data segment while simultaneously watching the rrd-data.log file; that way you can 100% let Henrik know which data chunks are causing RRD issues.
@@data#579960|1154700491.686716|10.10.9.32||shemali.telkomsa.net|netstat data shemali,telkomsa,net.netstat
I do not get any "duplicate match" errors when processing these. Could you double-check that you've got the latest hobbitd_rrd module running? Perhaps build it fresh from a current snapshot, or from a source tree with the latest all-in-one patch applied.

Regards,
Henrik
participants (3): bgmilne@staff.telkomsa.net, brodie@mcw.edu, henrik@hswn.dk