Is there any hope of enhancing the DNS check capability beyond its current functionality? It would be nice if it could detect all the NS for the domain you're monitoring to compare the SOA serial of all the NS servers and go red if they're not in sync.
Mark
I think more DNS checks would be really useful for many, but I would say that we'd be going down a rabbit hole chasing this. The DNS check you've described is worthwhile to do for many people (myself included), but is only one of many that would need to be done to ensure that a name or domain is resolvable.
For example, should the same checks be done for the parent zone(s)? Should we check the WHOIS record for impending zone expiry date? Should we check that there is more than one NS record? Should we check that the NS records don't all point at the same IP addresses? For high-turnover (e.g. dynamic) zones, the master nameservers might only rarely be in sync, or the serial number might typically change before all of the SOA lookups are complete. What about when there's a stealth master that can't be queried? What about reporting on slave zones that are about to expire? Or zones that have semantic errors such as MX records that refer to CNAME records, or host records with underscores, or CNAME loops? Should we be checking DNSSEC signatures?
Hmm, that list turned into a bit of a rant, really. Sorry. You can probably guess that I think about this stuff a fair bit, and many of the things I've listed are more "niche" than others, but still.
For each possible test anyone might want to include, each installation might need different ways of reporting and/or recording statistics, and so it would get complex very quickly. Do you report a yellow if only 3 out of 4 NS servers are the same, or 7 out of 8? If the master's serial number somehow goes backwards, do we show seven servers wrong or is it just one? Do you assume that the master is in the MNAME field, or would you get the option to override? If two hosts have different values for the MNAME field, which do you consider master? Or in this case, do you care? Also, which host(s) would you report the status against? Do you have to create hosts.cfg entries for every NS, and then maintain that list by tracking the NS records as they change over time, or do you create a pseudo-host for each domain, or some of each?
Woops, there I go ranting again, sorry.
Such complexity and flexibility is better implemented outside Xymon, to keep the Xymon core as simple and easy to maintain as possible.
I think the best solution is for each installer to decide on their own detection and reporting requirements, and create or install ext scripts to suit each case. In fact, I'm surprised there aren't any on Xymonton.org already, but that's where I would expect such code to reside. I'd be happy to assist with developing ext scripts for enhanced DNS checks.
J
On 8 January 2014 07:56, Mark Felder <feld at feld.me> wrote:
Is there any hope of enhancing the DNS check capability beyond its current functionality? It would be nice if it could detect all the NS for the domain you're monitoring to compare the SOA serial of all the NS servers and go red if they're not in sync.
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
On Jan 7, 2014, at 19:23, Jeremy Laidman <jlaidman at rebel-it.com.au> wrote:
Mark
I think more DNS checks would be really useful for many, but I would say that we'd be going down a rabbit hole chasing this. The DNS check you've described is worthwhile to do for many people (myself included), but is only one of many that would need to be done to ensure that a name or domain is resolvable.
For example, should the same checks be done for the parent zone(s)?
Why are you monitoring something so far out of your control? I don't monitor the ROOT servers; I trust my friends at Verisign & co to handle that for me.
Should we check the WHOIS record for impending zone expiry date?
No, and doing so is ignorant; you can easily get banned from WHOIS lookups for abusing it. Use the registrar's APIs.
Should we check that there is more than one NS record? Should we check that the NS records don't all point at the same IP addresses?
The question here is "Are the publicly accessible NS servers in a consistent functional state?". The goal is not to validate the data.
For high-turnover (e.g. dynamic) zones, the master nameservers might only rarely be in sync,
If you expect it to rarely be in sync why would you try to monitor for that?
or the serial number might typically change before all of the SOA lookups are complete.
Of course you'd expect the race condition where the check happens while a change is happening. Waiting for another check is a reasonable way to avoid a false positive.
What about when there's a stealth master that can't be queried?
I'm not monitoring from an untrusted network; I own these NS servers and can certainly get to my stealth master from my monitoring infrastructure. Also, the theme is "Are the publicly accessible NS servers in a consistent functional state?"
What about reporting on slave zones that are about to expire?
I could see that as useful, but when the query starts failing it will go red. This would be really easy to do though...
Or zones that have semantic errors such as MX records that refer to CNAME records, or host records with underlines, or CNAME loops?
Again, we're not validating the data just making sure it can be served correctly which mostly amounts to no errors and the serials not being out of whack. This isn't the proper place for those kinds of checks.
Should we be checking DNSSEC signatures?
No. I wouldn't trust Xymon's implementation of that anyway; that's best handled by your OS's DNS stack. The check will fail if the signature is incorrect because the entire lookup will fail.
Hmm, that list turned into a bit of a rant, really. Sorry. You can probably guess that I think about this stuff a fair bit, and many of the things I've listed are more "niche" than others, but still.
I'd say most are niche :-/
For each possible test anyone might want to include, each installation might need different ways of reporting and/or recording statistics, and so it would get complex very quickly. Do you report a yellow if only 3 out of 4 NS servers are the same, or 7 out of 8?
If any NS are not at the same serial there should be concern. You have no control over which NS the client chooses. (side note: 7 NS is the max recommended by RFC 1912 anyway)
If the master's serial number somehow goes backwards, do we show seven servers wrong or is it just one?
Alert will happen because they're not in sync anyway. This is a problem for a human familiar with the environment to figure out once they've been informed.
Do you assume that the master is in the MNAME field, or would you get the option to override?
"Are the publicly accessible NS servers in a consistent functional state?"
If two hosts have different values for the MNAME field, which do you consider master? Or in this case, do you care?
How is this even happening? This is not a multi-master infrastructure. If the MNAME is different, the serial most certainly is as well, or you've picked up AXFR errors in the logs, etc.
Also, which host(s) would you report the status against? Do you have to create hosts.cfg entries for every NS, and then maintain that list by tracking the NS records as they change over time, or do you create a pseudo-host for each domain, or some of each?
I don't care. I'd probably end up doing "127.0.0.1 foo.com # noping fancydnscheck"
The error is telling me there's something wrong with the infrastructure and most likely will tell you which NS is the problem. I'm not interested in tying the event to a specific NS server hosts.cfg entry in Xymon because it's possible that there isn't one.
Yes, lots of good rebuttals there. I think I have to agree that most of my proposed checks are quite niche, but the one you've proposed is probably the least niche of the lot. So if a sizeable number of Xymonsters could make use of it without needing too much configuration (hence complex parsing code), then it's worthy of inclusion.
On 8 January 2014 13:44, Mark Felder <feld at feld.me> wrote:
The question here is "Are the publicly accessible NS servers in a consistent functional state?". The goal is not to validate the data.
So, I suppose the "object" you're trying to watch is the "NS" consistency state of the zone. So yes, you'd alert against the zone name such as what you've shown in your hosts.cfg example.
Although I can't speak for Henrik's design model, I do think that xymonnet is not geared up to monitor this type of object, and instead it expects its objects to all have either one IP address, or a name that resolves to an IP address.
So really, I think the answer to your original question is that the DNS check capability probably can not be easily enhanced to check something that doesn't look like a host.
But there's no reason I can think of that some zone check code couldn't be added into some other part of Xymon. But probably not as an extension to the existing DNS check.
Many of the internal checks seem to have been modelled on ext scripts.
J
Den 08-01-2014 06:07, Jeremy Laidman skrev:
On 8 January 2014 13:44, Mark Felder <feld at feld.me> wrote:
The question here is "Are the publicly accessible NS servers in a consistent functional state?". The goal is not to validate the data.
So, I suppose the "object" you're trying to watch is the "NS" consistency state of the zone. So yes, you'd alert against the zone name such as what you've shown in your hosts.cfg example.
Testing this would essentially do
dig example.com ns
<grab the list of dns servers>
dig @ns1 example.com soa
dig @ns2 example.com soa
<compare soa records to see if they are identical>
Xymon can do the DNS lookups, all that is needed is to cook up the necessary data analysis.
I think this should be a separate test from the normal "dns" column?
Regards, Henrik
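The steps Henrik outlines can be sketched with dig roughly as follows. This is a minimal illustration and not Xymon's own code: the function names and the ZONE environment variable are assumptions made up for the example, and it relies on the serial being the third field of dig's short SOA output.

```shell
#!/bin/sh
# Sketch only: compare SOA serials across a zone's nameservers with dig.

# Read "server serial" pairs on stdin; print the number of distinct serials.
count_distinct_serials() {
    awk 'NF >= 2 { print $2 }' | sort -u | wc -l | tr -d ' '
}

# Query every NS listed for the zone and print "server serial" pairs.
list_serials() {
    zone=$1
    for ns in $(dig +short NS "$zone"); do
        # In dig's short SOA output the serial is the third field.
        serial=$(dig +short SOA "$zone" "@$ns" | awk '{ print $3 }')
        echo "$ns ${serial:-unknown}"
    done
}

# ZONE is a hypothetical environment variable; nothing is queried without it.
if [ -n "${ZONE:-}" ]; then
    pairs=$(list_serials "$ZONE")
    count=$(echo "$pairs" | count_distinct_serials)
    # Green only if exactly one serial was seen and every server answered.
    if [ "$count" -eq 1 ] && ! echo "$pairs" | grep -q ' unknown$'; then
        echo "green: all SOA serials match"
    else
        echo "red: SOA serials differ or a lookup failed"
    fi
    echo "$pairs"
fi
```

Run it as, for example, ZONE=example.com sh soasync.sh (the file name is arbitrary). A zone with no NS answers at all yields a count of 0 and therefore red.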
On Wed, Jan 8, 2014, at 2:39, Henrik Størner wrote:
Den 08-01-2014 06:07, Jeremy Laidman skrev:
On 8 January 2014 13:44, Mark Felder <feld at feld.me> wrote:
The question here is "Are the publicly accessible NS servers in a consistent functional state?". The goal is not to validate the data.
So, I suppose the "object" you're trying to watch is the "NS" consistency state of the zone. So yes, you'd alert against the zone name such as what you've shown in your hosts.cfg example.
Testing this would essentially do
dig example.com ns
<grab the list of dns servers>
dig @ns1 example.com soa
dig @ns2 example.com soa
<compare soa records to see if they are identical>
Xymon can do the DNS lookups, all that is needed is to cook up the necessary data analysis.
I think this should be a separate test from the normal "dns" column?
That is basically all I need at this point. Anything else would require a far more clever tool in my opinion. A separate column makes sense as well. Anything to avoid the unnecessary addition of more monitoring software that will get neglected. Right now we monitor nearly everything we need in Xymon and the thought of having to deploy Nagios or something else that implements this functionality is far from ideal. The fewer systems I have to train people on the better.
Hello,
I have installed version 4.3.12 of Xymon after previously running 4.3.0, and everything works well except for one strange behaviour: when displaying a window with an RRD graph, it takes a long time, and in the Apache log I get the following message: Sendto failed: Destination address required
Any idea ?
Regards,
Gautier BEGIN
System Tools Team Lead CACEIS and APERAM accounts CSC Computer Sciences Luxembourg S.A. 12D Impasse Drosbach L-1882 Luxembourg
Global Outsourcing Service | p:+352 24 834 276 | m:+352 621 229 172 | gbegin at csc.com | www.csc.com
On 8 January 2014 23:57, Gautier Begin <gbegin at csc.com> wrote:
I have installed version 4.3.12 of Xymon after previously running 4.3.0, and everything works well except for one strange behaviour: when displaying a window with an RRD graph, it takes a long time, and in the Apache log I get the following message: *Sendto failed: Destination address required*
http://lists.xymon.com/archive/2012-November/035918.html
J
On 08/01/14 19:39, Henrik Størner wrote:
Den 08-01-2014 06:07, Jeremy Laidman skrev:
On 8 January 2014 13:44, Mark Felder <feld at feld.me> wrote:
The question here is "Are the publicly accessible NS servers in a consistent functional state?". The goal is not to validate the data.
So, I suppose the "object" you're trying to watch is the "NS" consistency state of the zone. So yes, you'd alert against the zone name such as what you've shown in your hosts.cfg example.
Testing this would essentially do
dig example.com ns
<grab the list of dns servers>
dig @ns1 example.com soa
dig @ns2 example.com soa
<compare soa records to see if they are identical>
Xymon can do the DNS lookups, all that is needed is to cook up the necessary data analysis.
I think this should be a separate test from the normal "dns" column?
IMHO, this really should be an ext test. In a couple of minutes, this might almost suffice with some small extra wrappers:
/usr/bin/dnsqr ns mydomain.com | grep ^answer | awk '{ print $5 }' | while read server; do /usr/bin/dnsq soa mydomain.com $server | grep ^answer: | awk '{ print $7 }'; done | uniq | wc -l
If the answer is 1, then green, anything else is red.
Of course, I've only tested this against my own domain on my NS, and it worked, but your results may vary. You should add a whole bunch of error checking to ensure that each lookup is successful, returns valid results, etc... The main reason for doing this was to simply see how hard it would be to complete the task using the djb tools as they are claimed to be easier to use for scripting. I didn't actually attempt this with the bind tools, but past experience suggests it would be more difficult to parse the correct answers/etc...
Sharing was just in case it is useful to others...
Regards, Adam
Adam Goryachev Website Managers www.websitemanagers.com.au
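The error checking Adam mentions could start with something like the sketch below, which rejects any lookup result that is not a plain numeric serial. The function names and the "server serial" input format are assumptions for illustration only; they are not part of the djb tools or of Xymon.

```shell
#!/bin/sh
# Sketch only: classify SOA lookup results before comparing serials.

# Print "ok" for a purely numeric serial, "fail" for anything else
# (an empty reply, a timeout message, or other unexpected text).
check_serial() {
    case "$1" in
        ''|*[!0-9]*) echo "fail" ;;
        *)           echo "ok" ;;
    esac
}

# Read "server serial" pairs on stdin; print the servers whose answer
# cannot be used, one per line.
bad_servers() {
    while read -r server serial; do
        [ "$(check_serial "$serial")" = "ok" ] || echo "$server"
    done
}
```

Any server printed by bad_servers would be a reason to go red (or at least yellow) regardless of whether the remaining serials agree.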
Den 10-01-2014 01:02, Adam Goryachev skrev:
On 08/01/14 19:39, Henrik Størner wrote:
I think this should be a separate test from the normal "dns" column?
IMHO, this really should be an ext test.
You mean, like this?
#!/bin/sh
# Usage: pass the domain to check as the first argument.
DOMAIN=$1; shift
# Nameservers listed for the domain (4th field of "host -tNS" output).
SERVERS=`host -tNS $DOMAIN | awk '{print $4}' | xargs echo`
if test -f /tmp/soarec-$DOMAIN.$$
then
   rm -f /tmp/soarec-$DOMAIN.$$
fi
for H in $SERVERS
do
   # Ask each nameserver for the SOA record; keep only successful answers.
   SOAREC=`host -tSOA $DOMAIN $H | grep SOA`
   if test $? -eq 0
   then
      echo "$H: $SOAREC" >> /tmp/soarec-$DOMAIN.$$
   fi
done
# Count the distinct SOA records, ignoring the "server:" prefix.
RECCOUNT=`cat /tmp/soarec-$DOMAIN.$$ | cut -d: -f2- | sort | uniq -c | wc -l`
if test $RECCOUNT -eq 1
then
   echo "green: All SOA records match"
else
   echo "red: Different SOA records on the nameservers"
fi
cat /tmp/soarec-$DOMAIN.$$
rm -f /tmp/soarec-$DOMAIN.$$
exit 0
Modifying this to actually send a status report is left as an exercise for the reader :-) While you're at it, make it use "xymongrep" to pick up the domains you want to test.
Regards, Henrik
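One possible shape for that exercise is sketched below. It assumes several things not stated in the thread: that the check script is saved somewhere and its path is passed in a hypothetical CHECK variable, that xymonlaunch has set XYMON, XYMSRV and XYMONHOME, that xymongrep prints matching hosts.cfg lines as "IP hostname # tags", and that the tag and column are both called "fancydnscheck" as in Mark's hosts.cfg example.

```shell
#!/bin/sh
# Sketch only: wrap a check script's output in a Xymon status message.

# Tag and column name, borrowed from the hosts.cfg example in the thread.
TAG="${TAG:-fancydnscheck}"

# Build the message text. Xymon expects the dots in a hostname to be
# written as commas in the "host.column" key of a status message.
build_status() {
    key=$(printf '%s' "$1" | tr '.' ',')
    printf 'status %s.%s %s %s\n\n%s\n' "$key" "$TAG" "$2" "$(date)" "$3"
}

# CHECK is a hypothetical path to the check script. Nothing runs unless it
# is set and the Xymon environment variables are present.
if [ -n "${CHECK:-}" ] && [ -n "${XYMON:-}" ] && [ -n "${XYMSRV:-}" ]; then
    "$XYMONHOME/bin/xymongrep" "$TAG" | while read -r ip host rest; do
        out=$("$CHECK" "$host")
        color=${out%%:*}    # text before the first colon: "green" or "red"
        "$XYMON" "$XYMSRV" "$(build_status "$host" "$color" "$out")"
    done
fi
```

Reporting against the zone's own hosts.cfg entry, rather than against each nameserver, follows Mark's preference for a single pseudo-host per domain.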
On Tue, Jan 7, 2014, at 23:07, Jeremy Laidman wrote:
Yes, lots of good rebuttals there. I think I have to agree that most of my proposed checks are quite niche, but the one you've proposed is probably the least niche of the lot. So if a sizeable number of Xymonsters could make use of it without needing too much configuration (hence complex parsing code), then it's worthy of inclusion.
On 8 January 2014 13:44, Mark Felder <feld at feld.me> wrote:
The question here is "Are the publicly accessible NS servers in a consistent functional state?". The goal is not to validate the data.
So, I suppose the "object" you're trying to watch is the "NS" consistency state of the zone. So yes, you'd alert against the zone name such as what you've shown in your hosts.cfg example.
Although I can't speak for Henrik's design model, I do think that xymonnet is not geared up to monitor this type of object, and instead it expects its objects to all have either one IP address, or a name that resolves to an IP address.
So really, I think the answer to your original question is that the DNS check capability probably can not be easily enhanced to check something that doesn't look like a host.
But there's no reason I can think of that some zone check code couldn't be added into some other part of Xymon. But probably not as an extension to the existing DNS check.
Many of the internal checks seem to have been modelled on ext scripts.
J
Thanks for the feedback on everything. It's nice to know I'm not talking to myself on this list :-) You've also made me consider a few scenarios I previously didn't carefully consider, but I have some other ideas on how to prevent those errors by enhancing the procedure used to change DNS entries.
participants (5)
- feld@feld.me
- gbegin@csc.com
- henrik@hswn.dk
- jlaidman@rebel-it.com.au
- mailinglists@websitemanagers.com.au