I started replying to Charles' mail a couple of days ago, then let it sit for a few days before coming back to it. This does cover a lot of different areas so I'm rambling a bit, but bear with me for continuing on the tangent that Charles set out on.
On Fri, Jan 13, 2006 at 02:59:06PM -0700, Charles Jones wrote:
I believe that a well-setup hobbit monitor is superior to Nagios and other tools I have tested and been forced to use over the years. But the fact that a lot of the application-specific monitoring (mysql, oracle, postgres, etc), as well as traffic monitoring (MRTG) is handled by third-party scripts that you have to meld into your server probably scares away a lot of people, especially management types who have security folks whispering in their ear to never trust third-party modules and especially not code written by "joe-user from some website" (a manager actually said that to me once). As of yet Hobbit does not even have a fully functional client (no logfile parsing), so we have to use either the bb-client or the bb-msgs script....more third party plugins.
True. It's one of the things that need to be fixed soon.
I understand your management concerns - I've heard the same stuff over the years. I am fortunate to have a superior who saw the potential for BB and then Hobbit, because Unicenter/TNG couldn't do what we needed - and stuck with that decision for three years until everyone could see that Hobbit is a very good solution to our monitoring needs.
One thing that I've learned over the years is that to get your management interested in Hobbit, you must show them something that they can see is useful. Geeks like myself often get lost in the wonders of technology - "look, it can match the login JSP against this regular expression so we can see if all of the EJB resources are OK!" ... forget it when talking to bosses. What they want is graphs, reports, and the knowledge that whenever something happens that might affect their bonus, one of the techies will be alerted and take action.
Big Brother was a "geek thing" when I started using it. OK, it was better than Unicenter because you could check if the websites we hosted were OK without any special monitoring console installed. But it didn't *really* catch on until I added LARRD to get the response-time graphs, and started generating reports that showed the monthly availability. That's when management found out that "hey, this looks good" and gave the OK for me to work on it. I still nurse them a bit on occasion - just last week, they complained that a particular kind of reporting to the customer was difficult, so I came up with a tool that pulled all of the data from Hobbit and presented it so that it could be cut-and-pasted into a Word document. A simple one-hour job, but it gives immense credit.
So - listen to your PHBs and try to figure out what it is that they *really* want for a killer feature. If it turns out Hobbit cannot do that out of the box, speak up - it's happened more than once that an essential improvement required only a few lines of code in the right spot. Showing people how quickly Open Source tools can adapt to *your* needs is pretty powerful.
I'm not sure where I'm going with this, I guess what I'm saying is I would like to see Hobbit come with built-in support for monitoring common applications and services (besides the basics). It's already partway there as Hobbit can natively check things like mysql, but what about postgres, oracle?
Would be nice, yes. It can do the Oracle TNS listener, but detailed Oracle checks are missing. Or PostgreSQL, for that matter.
I'm a bit cautious about making the "core" Hobbit tools know everything about anything out there. It would be a maintenance nightmare since there would be lots of stuff that I have no way of testing myself. The add-on mechanism is a bit more work for the admin, but I think people need monitoring for lots of very diverse stuff, and trying to cover all of it would just result in lots of not-very-good solutions. So what I prefer to do is to make Hobbit flexible enough that writing add-ons is easy, and therefore the people who *know* about what's interesting to look at in a DB2 database installation (or whatever it is they want to monitor) can put in the missing pieces and come up with a good monitoring solution.
It's also a lot easier to weed out the bugs in an add-on module, so if it turns out to be generally useful, I have something that has been debugged which I can merge into Hobbit as part of the core toolset.
This brings us to the issue of a repository for Hobbit add-ons:
One thing I could really use some help with is setting up a web repository like deadcat for Hobbit things. Sourceforge could host it, I suppose, but it would need someone to set it up and manage submissions - I don't think Sourceforge has anything automated like deadcat.
Henrik is a busy guy I am sure, and he probably doesn't get much compensation for all the fine work he does on Hobbit, nor does he ask for any (I did buy him one of his wishlist items, I hope others do as well). As far as I know, Henrik has nobody helping him, except for seeing him mention someone was working on a new Hobbit client. Maybe what we need is more people to roll up their sleeves and write some modules that are compatible with hobbit with little or no tweaking.
Thanks :-) Yes, I am fairly busy - got a job to attend to on occasion - so you are right that I could use some help with add-ons for Hobbit.
A colleague of mine *is* working on the Win32 client, so that is well underway. He's got some data flowing now, but there are still some pieces missing.
Sadly I'm no C/C++ guru, but I am pretty good with Perl :-)
I'm just the opposite :-) But that shouldn't hold you back - as long as you "only" write add-on modules there's a great deal of freedom in choosing your tools. Even Hobbit server modules - those that get their input directly from the hobbitd daemon - can be written in any language, since the interface is just reading from standard input. Add-on tests obviously just use the "bb" commandline tool to send their results.
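To make the add-on side concrete, here is a rough sketch of what a test could look like in a scripting language (Python here, but Perl would look much the same). It assumes the usual BB-style message, "status HOST.TEST COLOR text", and that the client environment provides BB (path to the "bb" tool) and BBDISP (the server to report to). The host name, the "pgsql" test name and the helper names are made up for illustration; check the bb man page for the details of your installation.

```python
import os
import subprocess
import time


def build_status(host, test, color, text, now=None):
    """Build a BB-style status message: 'status HOST.TEST COLOR <date>' plus text."""
    # Assumed BB convention: dots in the hostname are sent as commas,
    # because the dot separates the host from the test name.
    machine = host.replace(".", ",")
    stamp = time.strftime("%a %b %d %H:%M:%S %Y", time.localtime(now))
    return f"status {machine}.{test} {color} {stamp}\n{text}\n"


def send_status(message):
    """Hand the message to the 'bb' tool as: bb RECIPIENT MESSAGE."""
    # BB and BBDISP are assumed to be set by the client environment;
    # the fallbacks below are placeholders only.
    bb = os.environ.get("BB", "bb")
    display = os.environ.get("BBDISP", "127.0.0.1")
    subprocess.run([bb, display, message], check=True)


if __name__ == "__main__":
    # Hypothetical host and test name, printed instead of sent.
    print(build_status("db1.example.com", "pgsql", "green",
                       "PostgreSQL accepts connections"))
```

A real test would do its check first, pick the color from the result, and then pass the message to send_status().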
I do appreciate anyone helping with improving Hobbit. But I am also concerned about becoming a bottleneck for getting things published. That's why I would like to have this repository set up so there's an easy way of publishing add-ons, without having to wait for me. If some of you want to gang up and do something together, I can quickly set up a dedicated mailing list for you - if that makes it easier.
(There are over 300 addresses on the hobbit mailing list now ... a year ago, I was proud when it passed subscriber #100. And almost 1500 downloads of the latest version from sourceforge - it's getting big).
The past 18 months have been pretty intense - Hobbit has developed very quickly. I think that will continue for another year or so; there is some stuff I see as needing work right now:
- the client package needs logfile monitoring badly.
- the alert/acknowledge mechanism needs improving, so it can handle things like escalating alerts and different groups acknowledging an alert (this is in fact already being worked on).
- the graph displays (which graphs go on which pages) need an overhaul. The current system is a bit of a mess, and not flexible enough.
- I want to be able to trigger status-changes (and hence alerts) from data that currently only goes into the graphs. E.g. instead of the CPU alert triggering if the load average goes above 5 (which is pretty meaningless nowadays), I'd like it to trigger a warning if the %system time exceeds 20, or the %idle goes below 10. Or a "conn" alert if pingtime exceeds 250 ms. (I have a pretty solid idea about how this can be implemented - and it's elegant enough that it would also work for data from custom graphs).
- And I'd like to make the webpages 100% dynamic and ditch the statically generated overview pages. That could mean that Hobbit would require something like PHP for the display part, or that I need to learn how XML/XSLT etc. work.
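To illustrate the item about triggering status changes from graph data, the idea boils down to something like the sketch below. This is not how Hobbit does it (nothing is implemented yet), just the thresholds from the example written as rules. The metric names, the Rule structure and the choice of yellow versus red are all assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    test: str                         # status column the rule feeds, e.g. "cpu"
    metric: str                       # hypothetical name of the graphed value
    breached: Callable[[float], bool]
    color: str                        # color to raise when the rule is breached
    reason: str


# Thresholds taken from the example in the list; the colors are assumed.
RULES = [
    Rule("cpu", "cpu_system_pct", lambda v: v > 20, "yellow", "%system time above 20"),
    Rule("cpu", "cpu_idle_pct", lambda v: v < 10, "yellow", "%idle below 10"),
    Rule("conn", "ping_ms", lambda v: v > 250, "red", "ping time above 250 ms"),
]

SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def evaluate(sample, rules=RULES):
    """Return (colors, reasons) per test for one sample of graph data."""
    colors = {rule.test: "green" for rule in rules}
    reasons = {rule.test: [] for rule in rules}
    for rule in rules:
        value = sample.get(rule.metric)
        # Metrics missing from the sample are skipped rather than treated as errors.
        if value is None or not rule.breached(value):
            continue
        reasons[rule.test].append(rule.reason)
        # Keep the most severe color seen so far for this test.
        if SEVERITY[rule.color] > SEVERITY[colors[rule.test]]:
            colors[rule.test] = rule.color
    return colors, reasons
```

Since the rules only refer to metric names, the same evaluation would work for values coming from custom graphs.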
So I won't get bored right away, but I do hope that development could slow down just a bit and there would be more activity just to broaden the range of systems/applications/whatever that Hobbit can monitor.
Henrik