Feature request - thresholds for CPU utilisation (not load average)
Something I have been wondering about for a while is whether it would be possible to have thresholds on the CPU utilisation. While we have thresholds for load averages, in some cases these have to be relatively high (e.g. 2 to 4 times the number of CPUs) due to the impact of IO wait on load average (e.g., our SAN-attached NFS servers often have a load average of over 10, with a CPU utilisation of 50%, when reading over 10k blocks/sec). However, such a high threshold then makes it difficult to catch a process stuck in a CPU race (much less IO gets done, so IO wait is low, and the load average is almost exactly 1 x the number of CPUs).
The CPU utilisation is already reported (in the vmstat data), which is how I know the above about our NFS servers (vmstat/vmstat1 graph).
This would also remove the complication of thresholds differing between servers with different numbers of CPUs, and maybe work better for Windows clients (which don't seem to have a concept of load average).
(I don't mean thresholds for load average should be removed ... I would love to have thresholds for both load average and CPU utilisation).
Regards, Buchan
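To illustrate the request above, here is a minimal sketch in Python of what a utilisation threshold check on a vmstat sample could look like. It is not how Hobbit works today. The function names, the 80/95 thresholds, the Linux-style column names (`us`, `sy`, `id`, `wa`), and both sample lines are assumptions made up for the example; utilisation is taken here as user plus system time, so IO wait does not count towards it.

```python
# Hypothetical sketch: classify a vmstat sample by CPU utilisation.
# Column names follow Linux vmstat output; other platforms may differ.

HEADER = "r b swpd free buff cache si so bi bo in cs us sy id wa"

# Made-up sample resembling a busy NFS server: 50% CPU, heavy IO wait.
NFS_SAMPLE = "2 9 0 51200 1024 204800 0 0 10240 512 900 1500 20 30 10 40"

# Made-up sample resembling a runaway process: CPU saturated, no IO wait.
RUNAWAY_SAMPLE = "4 0 0 51200 1024 204800 0 0 0 0 300 200 98 2 0 0"


def cpu_utilisation(header: str, sample: str) -> float:
    """Return user + system CPU time in percent for one vmstat line."""
    row = dict(zip(header.split(), sample.split()))
    return float(row["us"]) + float(row["sy"])


def colour(value: float, warn: float, panic: float) -> str:
    """Map a value to a status colour using warn and panic thresholds."""
    if value >= panic:
        return "red"
    if value >= warn:
        return "yellow"
    return "green"


def check(sample: str, warn: float = 80.0, panic: float = 95.0) -> str:
    """Classify one vmstat sample; the default thresholds are arbitrary."""
    return colour(cpu_utilisation(HEADER, sample), warn, panic)
```

With these assumed thresholds, `check(NFS_SAMPLE)` stays green despite the heavy IO wait, while `check(RUNAWAY_SAMPLE)` goes red, independent of how many CPUs the server has.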
Funny you brought this up just now, because today I noticed that if you load the Windows client, either bbwin or bbnt, it lets you set alerts for CPU utilization, but both Big Brother and Hobbit only understand load average. So I keep getting alerts saying load is very high when the CPUs are at around 20-50%. A load of 20 on a Linux/Unix server would be very high, but Windows boxes don't really have the load average concept, just CPU utilization. So if you are monitoring utilization on Windows clients, you have to change the load thresholds to something like 70 90 to avoid getting red pages.
-----Original Message----- From: Buchan Milne [mailto:bgmilne at staff.telkomsa.net] Sent: Thursday, February 28, 2008 12:44 PM To: hobbit at hswn.dk Subject: [hobbit] Feature request - thresholds for CPU utilisation (not load average)
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
I'll second that.
I just found out we had a test system that has had an Oracle process using 99% of one CPU for the past (drumroll!) two months, and we didn't notice it!
Tom Kauffman NIBCO, Inc
CONFIDENTIALITY NOTICE: This email and any attachments are for the exclusive and confidential use of the intended recipient. If you are not the intended recipient, please do not read, distribute or take action in reliance upon this message. If you have received this in error, please notify us immediately by return email and promptly delete this message and its attachments from your computer system. We do not waive attorney-client or work product privilege by the transmission of this message.
Thirdsies!
-- Josh Luthman
participants (4): bgmilne@staff.telkomsa.net, josh@imaginenetworksllc.com, KauffmanT@nibco.com, tlewick@tradebotsystems.com