Now this is what I'm talking about!
Thanks to Ralph, Asif, Larry, Frank and of course, Henrik.
The servers being tested are constantly being improved, but at the
moment they can take several seconds to respond. While a server is
busy, it chews an entire CPU, so when Hobbit's net tests run in
parallel and hit all 8 servers on the host at once, the host runs out
of CPU.
I've dropped the Hobbit concurrency down from 512 to 50, but that
hasn't had much of an effect.
The patch sounds like it'll do the trick.
Thanks!
-dave
On Feb 14, 2009, at 3:05 AM, Henrik Størner wrote:
On Fri, Feb 13, 2009 at 06:01:44PM +0100, Frank Gruellich wrote:
OTOH with so many servers we can't manage several monitoring groups while keeping in mind how many checks are in each group and which one still has "free slots" available. Having such randomization and spreading in time integrated into bbtest-net would be really great. (Thinking about it, spreading in time is maybe difficult because you never know how long all tests will take. Randomization of test order should be quite simple. But I don't know anything about hobbit internals.)
You're right that randomizing the sequence of tests is simple - the attached patch against 4.3.0 should do that nicely.
Completely untested, but it should work :-) It won't be difficult to port over to the 4.2.3 version if needed.
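For illustration only (this is not the attached patch, and Hobbit itself is not written in Python): a minimal sketch of what randomizing the test order could look like, assuming the tests are held in a plain list. The function name `shuffle_tests` and the `host:service` strings are hypothetical.

```python
import random

def shuffle_tests(tests, rng=None):
    """Return a new list with the tests in random order (Fisher-Yates shuffle)."""
    rng = rng or random.Random()
    shuffled = list(tests)  # copy so the caller's list is left untouched
    for i in range(len(shuffled) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive at both ends
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled

# Hypothetical queue: 8 servers on one host, plus tests on other hosts.
tests = [f"hostA:server{n}" for n in range(8)] + [f"host{h}:http" for h in "BCDE"]
queue = shuffle_tests(tests, random.Random(42))
```

Shuffling only makes it less likely that all tests for one host start at the same moment; it does not guarantee any spacing between them.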
Spreading things out over a longer time requires much more of a re-design. That may happen - I have some ideas about doing a major re-design of how network tests are done - but it will be a while before that evolves into any code.
Regards, Henrik
<randomize-tests.patch>
-- Dave Paper cerberus at ginch.org
"Hello, I must be going." --Groucho