Something I have been wondering about for a while is whether it would be possible to have thresholds on CPU utilisation. While we have thresholds for load averages, in some cases these have to be set relatively high (e.g. 2 to 4 times the number of CPUs) because of the impact of IO wait on the load average (e.g. our SAN-attached NFS servers often have a load average of over 10, with a CPU utilisation of 50%, when reading over 10k blocks/sec). However, thresholds that high make it difficult to catch a process in a CPU race (i.e. a runaway process spinning on the CPU): much less IO gets done, IO wait is low, and the load average is almost exactly equal to the number of CPUs, well below the threshold.
The CPU utilisation is already reported (in the vmstat data), which is how I know the above about our NFS servers (vmstat/vmstat1 graph).
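
To illustrate the kind of check I have in mind, here is a rough sketch in Python. It assumes Linux-style vmstat fields (`us` for user and `sy` for system CPU time, as percentages); the function names and the 80/95 threshold values are made up for the example and are not existing settings.

```python
def cpu_utilisation(us, sy):
    """CPU time spent doing work (user + system), in percent.

    Idle and IO-wait time are deliberately left out, so a server that is
    mostly waiting on disk does not count as CPU-busy.
    """
    return us + sy


def cpu_status(us, sy, warn=80.0, crit=95.0):
    """Compare utilisation against two hypothetical thresholds."""
    util = cpu_utilisation(us, sy)
    if util >= crit:
        return "critical"
    if util >= warn:
        return "warning"
    return "ok"


# IO-heavy NFS server: high load average, but only 50% CPU utilisation.
print(cpu_status(us=20, sy=30))

# Runaway process: little IO, CPU almost fully used.
print(cpu_status(us=97, sy=2))
```

With these example values the NFS server stays below both thresholds, while the runaway process crosses the upper one.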
Thresholds on CPU utilisation would also remove the complication of thresholds differing between servers with different numbers of CPUs, and might work better for Windows clients (which don't seem to have a concept of load average).
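
A small sketch of the difference, again in Python. The factor of 2, the 90% limit and the 99% utilisation figure are assumptions chosen for illustration; the load value follows the "load average is about equal to the number of CPUs" case described above.

```python
def load_alert(load_avg, ncpus, factor=2.0):
    """Load-average threshold, which has to be scaled per server by CPU count."""
    return load_avg > factor * ncpus


def cpu_alert(util_pct, limit=90.0):
    """Utilisation threshold: the same percentage works for any CPU count."""
    return util_pct > limit


for ncpus in (2, 8):
    load = 1.0 * ncpus   # CPU-bound runaway: load roughly equals CPU count
    util = 99.0          # assumed utilisation in that situation
    print(ncpus, load_alert(load, ncpus), cpu_alert(util))
# prints:
# 2 False True
# 8 False True
```

The load check stays quiet on both machines and needs a different absolute limit on each, whereas the single utilisation limit fires on both.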
(I don't mean that the thresholds for load average should be removed; I would love to have thresholds for both load average and CPU utilisation.)
Regards, Buchan