On Thu, Jun 29, 2006 at 05:21:06PM -0400, Paul Moore wrote:
I'm having a problem with the hobbit-clients.cfg matching. The documentation states the below, but I'm seeing it match all lines.
Rules are evaluated from the top of this file and down, and the first
matching rule is used. So you should put the specific rules first, and
the generic rules last.
This is actually misleading, as you've found out. It only applies to some of the client checks, not all of them since it wouldn't make sense for all. E.g. you do want to check all of the PROC settings for a host, not just the first one.
I'll remove that comment from the default setup.
Here is the layout of my cfg file.
HOST=pdcd20-ic1 PROC sm_server 9 9 RED
HOST=pdcd20-ic1 PROC sm_server 5 5 RED
What exactly is the point in having both of these rules ? I cannot tell if you want 5 or 9 sm_server processes.
HOST=%(c[ar]y|omz|rch|[pn]dc)(d20-ic[1-3]|d15-ic[3-5]|d[0-9]*-icb).* PROC sm_server 2 2 RED
This one also matches the pdcd20-ic1 host. So now there are three rules.
HOST=%(c[ar]y|omz|rch|[pn]dc)d[0-9]*-ic.* PROC sm_server 1 1 RED
And this one also matches pdcd20-ic1. That makes 4 rules total.
Process sm_server color red: Count=2, min=9, max=9
Process sm_server color red: Count=2, min=5, max=5
Process sm_server color green: Count=2, min=2, max=2
Process sm_server color red: Count=2, min=1, max=1
-=-=- As you can see it matches 4 different times.
Yep ...
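The four matches can be reproduced outside Hobbit. The following is only a sketch: it uses Python's re module instead of PCRE (the two agree on the simple syntax used in these patterns), and host_matches is a hypothetical helper that assumes a leading "%" marks an unanchored regular expression and anything else is an exact hostname.

```python
import re

def host_matches(selector: str, hostname: str) -> bool:
    """Hypothetical helper: '%' prefix means regex, otherwise exact name."""
    if selector.startswith("%"):
        # Assumption: the regex is searched unanchored, as PCRE does by default.
        return re.search(selector[1:], hostname) is not None
    return selector == hostname

# The four HOST selectors from the configuration quoted above.
selectors = [
    "pdcd20-ic1",
    "pdcd20-ic1",
    "%(c[ar]y|omz|rch|[pn]dc)(d20-ic[1-3]|d15-ic[3-5]|d[0-9]*-icb).*",
    "%(c[ar]y|omz|rch|[pn]dc)d[0-9]*-ic.*",
]

matching = [s for s in selectors if host_matches(s, "pdcd20-ic1")]
print(len(matching))
```

All four selectors match pdcd20-ic1 in this model, which is consistent with the four lines in the status output.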
Regards, Henrik
What exactly is the point in having both of these rules ? I cannot tell if you want 5 or 9 sm_server processes.
--For testing purposes to see if my regexps were wrong or the docs were wrong :)
We wrote the regexps thinking it would bail out when it hit the first match, and were surprised when it matched two of the regexps (the specific one and the default catch-all).
I was thinking about digging into the code to see if it would be possible to have it exit once it hits the first match. Do you see this as being a major endeavor? It's either that, write out over 40 individual host lines, or write some excludes.
Thanks in advance. Sean
-----Original Message----- From: Henrik Stoerner [mailto:henrik at hswn.dk] Sent: Wednesday, July 05, 2006 4:51 PM To: hobbit at hswn.dk Subject: Re: [hobbit] hobbit-clients.cfg
On Wed, Jul 05, 2006 at 05:18:07PM -0400, Sean Hennessey wrote:
What exactly is the point in having both of these rules ? I cannot tell if you want 5 or 9 sm_server processes.
--For testing purposes to see if my regexps were wrong or the docs were wrong :)
Fair enough. Although for regexp testing, I'd recommend using the "pcretest" utility that comes with the PCRE library. It's the best way of doing that.
I was thinking about digging into the code to see if it would be possible to have it exit once it hits the first match. Do you see this as being a major endeavor? It's either that, write out over 40 individual host lines, or write some exclude's.
It's not really possible, because logically it doesn't make sense. Consider this: You have a bunch of "java" processes you want to check - there must be at least 10. And then there's a specific "java -Dfoobar" process that must exist, but only once. So:
PROC java 10
PROC %java.*-Dfoobar 1 1
If you exit after the first match, then the second rule is ignored, and you won't get the check for that specific java process. So maybe we can reverse the lines:
PROC %java.*-Dfoobar 1 1
PROC java 10
Now you get the right check for the "java ... -Dfoobar" process, but that one won't be counted along with the other "java" processes - and that doesn't seem right either.
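The java argument can be made concrete with a small model. This is a sketch, not Hobbit's code: proc_matches assumes a plain PROC name is a substring test on the command line and a "%" name is a regex, the ps lines are made up, and "first match" is modelled as each ps line being credited only to the first rule it matches.

```python
import re

def proc_matches(pattern: str, cmdline: str) -> bool:
    """Assumed semantics: '%' prefix means regex, otherwise substring."""
    if pattern.startswith("%"):
        return re.search(pattern[1:], cmdline) is not None
    return pattern in cmdline

def count_all(rules, ps_lines):
    # Every rule sees every ps line (the behaviour described in this thread).
    return {p: sum(proc_matches(p, line) for line in ps_lines) for p in rules}

def count_first_match(rules, ps_lines):
    # Hypothetical alternative: a ps line only counts for the first matching rule.
    counts = {p: 0 for p in rules}
    for line in ps_lines:
        for p in rules:
            if proc_matches(p, line):
                counts[p] += 1
                break
    return counts

# Made-up ps output: ten java processes, one of them started with -Dfoobar.
ps_lines = [f"java -jar app{i}.jar" for i in range(9)]
ps_lines.append("java -Dfoobar -jar special.jar")

print(count_all(["java", "%java.*-Dfoobar"], ps_lines))
print(count_first_match(["java", "%java.*-Dfoobar"], ps_lines))
print(count_first_match(["%java.*-Dfoobar", "java"], ps_lines))
```

With counting across all rules, the generic rule sees 10 processes and the specific rule sees 1. With first-match in the original order the specific rule never gets a count, and in the reversed order the generic rule sees only 9 and falls below its minimum of 10.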
So I still believe the current way that Hobbit does it is "right". In your case, how about
HOST=pdcd20-ic1 PROC sm_server 9 9 RED
HOST=%(c[ar]y|omz|rch|[pn]dc)(d20-ic[1-3]|d15-ic[3-5]|d[0-9]*-icb).* PROC sm_server 2 2 RED EXHOST=pdcd20-ic1
HOST=%(c[ar]y|omz|rch|[pn]dc)d[0-9]*-ic.* PROC sm_server 1 1 RED EXHOST=pdcd20-ic1
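To see which rules remain in play for a given host with the EXHOST exclusions in place, the selection can be modelled as below. This is a sketch under assumptions: name_matches and applicable are hypothetical helpers, "%" is taken to mean an unanchored regex and anything else an exact name, and Python's re stands in for PCRE.

```python
import re

def name_matches(selector: str, hostname: str) -> bool:
    """Assumed semantics: '%' prefix means regex, otherwise exact name."""
    if selector.startswith("%"):
        return re.search(selector[1:], hostname) is not None
    return selector == hostname

# (HOST selector, EXHOST selector or None, min, max) for the three rules above.
rules = [
    ("pdcd20-ic1", None, 9, 9),
    ("%(c[ar]y|omz|rch|[pn]dc)(d20-ic[1-3]|d15-ic[3-5]|d[0-9]*-icb).*",
     "pdcd20-ic1", 2, 2),
    ("%(c[ar]y|omz|rch|[pn]dc)d[0-9]*-ic.*", "pdcd20-ic1", 1, 1),
]

def applicable(hostname: str):
    """Return the (min, max) thresholds of every rule that applies to hostname."""
    return [
        (lo, hi)
        for host, exhost, lo, hi in rules
        if name_matches(host, hostname)
        and not (exhost is not None and name_matches(exhost, hostname))
    ]

print(applicable("pdcd20-ic1"))
print(applicable("pdcd07-ic1"))
print(applicable("pdcd20-ic2"))
```

In this model pdcd20-ic1 is left with only the 9/9 rule and pdcd07-ic1 with only the 1/1 catch-all. A host such as pdcd20-ic2 still picks up both regex rules, so the catch-all would presumably need a further exclusion if only one rule per host is wanted.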
Regards, Henrik
I guess I don't understand how you are building up your parse tree. I was thinking you built up some sort of structure for each host listed in hobbit-clients.cfg, listing all the tests for that particular host. Then you'd just run down that list looking for a match on the host, and when you found a match, run the tests specified for that host. From what you are saying above, it sounds like you actually build up a list of tests first: you compare the sent data to the parsed tests, and then you look at the host. What you are saying makes sense if it works this way. If it works the way I originally thought it did (check the host first, then the tests), I still think exiting after hitting the first matched host is the better way to go.
Say I have for example
HOST=pdcd20-ic1 PROC sm_server 1 1 RED
HOST=pdcd20-ic2 PROC sm_server 2 2 RED
pdcd20-ic1 sends its update and it has 5 sm_server processes. Can you tell me the flow that happens?
Does it take the hostname as sent (pdcd20-ic1) and then look that host up in the clients.cfg file to see it should have only 1 sm_server or does it look at all the PROC directives first and then look at the host?
Thanks in advance.
Great product by the way.
Sean
On Thu, Jul 06, 2006 at 10:52:42AM -0400, Sean Hennessey wrote:
Say I have for example
HOST=pdcd20-ic1 PROC sm_server 1 1 RED
HOST=pdcd20-ic2 PROC sm_server 2 2 RED
pdcd20-ic1 sends its update and it has 5 sm_server processes. Can you tell me the flow that happens?
Does it take the hostname as sent (pdcd20-ic1) and then look that host up in the clients.cfg file to see it should have only 1 sm_server or does it look at all the PROC directives first and then look at the host?
Mostly it's the second way you describe, but it's actually a two-step process.
hobbitd_client parses the hobbit-clients.cfg file, and stores it in an internal data-structure - really just a linked list with a parsed copy of the rules. I.e. there are the strings or regular expressions to match hostnames/testnames etc against, the thresholds and so on. At this point, there is no relation to any particular host, it's just a list of the rules in hobbit-clients.cfg.
When a client message arrives, the rule-list is scanned for PROC rules that match the particulars for this host, and each line in the "ps" output is tested against the process-name that the rule is about. While doing this, it merely counts how many matches are found in the "ps" output for each of the relevant PROC rules. So one line from the ps output can add a count to several PROC rules, not just one.
Finally, all of the relevant PROC rules are examined, and the actual process count is compared to the thresholds listed in the rule. Based on this, the "procs" status color is decided, and the "procs" status text is generated with info about the rules that failed or succeeded.
So in your example, it would first count the number of "sm_server" processes, and both of the PROC rules would get a count of 5 active processes matching the rule. So when the "procs" status is generated, you should actually see two rules listed as "red" - one saying that hobbit found 5 sm_server processes while expecting just 1, and the other saying it found 5 sm_server processes while expecting only 2.
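The steps described above (a parsed rule list, a counting pass over the "ps" output, then a threshold check) can be sketched roughly as follows. This is an illustration only, not hobbitd_client's actual code: ProcRule, is_match and procs_status are hypothetical names, the matching semantics ("%" means regex, a plain process name is a substring test) are assumptions, a failed rule is always shown as red, and the example assumes two sm_server rules that both apply to the reporting host ("examplehost" and the process path are made up).

```python
import re
from dataclasses import dataclass

@dataclass
class ProcRule:
    host: str       # exact hostname, or '%regex'
    proc: str       # substring of the command line, or '%regex'
    lo: int         # minimum process count
    hi: int         # maximum process count
    count: int = 0  # filled in while scanning the ps output

def is_match(pattern: str, text: str, exact: bool = False) -> bool:
    if pattern.startswith("%"):
        return re.search(pattern[1:], text) is not None
    return pattern == text if exact else pattern in text

def procs_status(rules, hostname, ps_lines):
    # Step 1 (parsing the config into `rules`) is assumed to have happened already.
    # Step 2: pick the rules for this host and count matching ps lines.
    relevant = [ProcRule(r.host, r.proc, r.lo, r.hi)
                for r in rules if is_match(r.host, hostname, exact=True)]
    for line in ps_lines:
        for rule in relevant:  # no break: one ps line may count for several rules
            if is_match(rule.proc, line):
                rule.count += 1
    # Step 3: compare each count with its thresholds and pick the overall color.
    results = [(r.proc, r.count, "green" if r.lo <= r.count <= r.hi else "red")
               for r in relevant]
    overall = "red" if any(color == "red" for _, _, color in results) else "green"
    return overall, results

rules = [ProcRule("examplehost", "sm_server", 1, 1),
         ProcRule("examplehost", "sm_server", 2, 2)]
overall, results = procs_status(rules, "examplehost", ["/opt/sm/bin/sm_server"] * 5)
print(overall, results)
```

Both rules end up with a count of 5 and both are reported red, since neither 1..1 nor 2..2 contains 5.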
So the problem with your proposal is that by the time the final step happens - when hobbit goes through all of the PROC rules to see which ones have the right count of active processes - any association with a specific process name has been lost. So there is no way for Hobbit to know that one rule comes before another and relates to the same process. In your example configuration, they both happen to look for an "sm_server" process; we *could* choose to special-case rules that look for the exact same process name. But what if you wrote the rules as
HOST=bla PROC %[st]m_server 1 1
HOST=bla PROC sm_server 2 2
and there is 1 sm_server process active? The two patterns are different, so how can hobbit tell that it should use the first one and ignore the second one? And what should it do if there is a tm_server process running? Use only the first rule and ignore the second?
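The overlap between these two patterns is easy to see when the counts are computed for both cases. As before this is a sketch with assumed semantics ("%" means regex, a plain name is a substring test), with Python's re in place of PCRE and bare process names standing in for real ps lines.

```python
import re

def proc_matches(pattern: str, cmdline: str) -> bool:
    """Assumed semantics: '%' prefix means regex, otherwise substring."""
    if pattern.startswith("%"):
        return re.search(pattern[1:], cmdline) is not None
    return pattern in cmdline

def counts(ps_lines):
    # The two PROC patterns from the HOST=bla example.
    rules = ["%[st]m_server", "sm_server"]
    return {p: sum(proc_matches(p, line) for line in ps_lines) for p in rules}

one = counts(["sm_server"])
both = counts(["sm_server", "tm_server"])
print(one)
print(both)
```

A single sm_server process counts once for each rule, so the 1/1 rule is satisfied while the 2/2 rule is not. Adding a tm_server process raises only the first rule's count to 2, so nothing in the counts themselves says which rule should win.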
Great product by the way.
After all :-))
Thanks, Henrik
participants (2)
- henrik@hswn.dk
- sean.hennessey1@verizonbusiness.com