Chad
What does the director do? How does it communicate with the servers?
Does the director or the server create a log message when there's a problem? Xymon can detect and alarm on that.
Does the director connect to the servers on a specific TCP port? If that port is rejecting a connection, the Xymon server can test for that (every 5 minutes, but can be more often) and alarm on it.
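For illustration only, the kind of TCP connect test described above can be sketched in a few lines of Python. This is not Xymon's own implementation; the function name `tcp_port_open` and its default timeout are assumptions made for the example.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False
```

A refused or timed-out connection returns `False`, which is the condition a monitoring check would alarm on.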
When a server fails, does it stop listening on a particular TCP port? Or perhaps a process crashes and restarts, causing the connection to fail? Xymon can test for these and alarm when it detects a missing listening or established TCP socket, or a missing process.
It's also possible to write a script to have Xymon look at the process listing "ps" output, and look for a particular process's lifetime, and alert when it's less than 5 minutes.
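A minimal sketch of such a process-lifetime check follows, assuming a POSIX-style "ps" that supports the `pid`, `etime` and `comm` output columns. The helper names (`parse_etime`, `young_processes`, `read_ps`) and the process name `appsvc` are hypothetical; the 300-second threshold corresponds to the 5 minutes mentioned above.

```python
import subprocess


def parse_etime(etime: str) -> int:
    """Convert a ps elapsed time of the form [[dd-]hh:]mm:ss to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:          # pad missing hours with zero
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def young_processes(ps_output: str, name: str, max_age: int = 300):
    """Return (pid, age_in_seconds) for processes named `name` younger than `max_age`."""
    found = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)   # pid, etime, command name
        if len(fields) < 3:
            continue
        pid, etime, comm = fields
        if comm.strip() == name:
            age = parse_etime(etime)
            if age < max_age:
                found.append((int(pid), age))
    return found


def read_ps() -> str:
    """Run ps; the '=' after each column name suppresses the header line."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etime=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `young_processes(read_ps(), "appsvc")` would return a non-empty list when a process with that (hypothetical) name has recently restarted, which a script could then report as an alert.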
One thing to note is that Xymon's probes and processes typically check things every 5 minutes, so transient failures that come and go within a few seconds may not be detected by the standard probes and checks. The frequency of some of these probes can be increased to make it more likely that failures are caught, and a custom script can be written to check the state as often as you need. For transient faults, however, it's more reliable to look for artefacts of a failure (log errors and warnings, short process lifetime) than to periodically check for a successful state.
J
On 22 June 2017 at 22:59, Chad Rodriguez <CHrodriguez at petsmart.com> wrote:
Symptom: we open up the director and see application servers not communicating, while at the same time we can ping the servers by hostname and IP.
Respectfully,
Chad Rodriguez | Systems Administrator
19601 N. 27th Ave., Phoenix, AZ 85027
office: 623-587-2385 | fax: 623-580-6117
email – chrodriguez at petsmart.com
Upcoming Out-of-Office dates:
June 26th through July 4th
July 21st
From: Jeremy Laidman [mailto:jlaidman at rebel-it.com.au]
Sent: Thursday, June 22, 2017 3:23 AM
To: Chad Rodriguez <CHrodriguez at PetSmart.com>
Cc: xymon at xymon.com
Subject: Re: [Xymon] questions
Chad
Situations like what exactly? When a server is rebooted? Or when a server stops communicating? Can you explain what the symptoms are? What is a "director"? Sorry, I'm not familiar with the SolarWinds product.
Out of the box, Xymon can detect a few different types of communication issues (eg ping checks, TCP port responses) as well as monitoring logfiles for messages that indicate trouble. Furthermore, Xymon is highly extensible, so if you can write a script to perform a test for your problem, you can turn it into a message for Xymon to display, and optionally alarm via email or other means.
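As a rough sketch of how a custom test result can be turned into a message for Xymon, the Python below builds a "status" message and sends it to the Xymon server over TCP. It assumes the usual `status HOSTNAME.TESTNAME COLOR text` message form with dots in the hostname replaced by commas, and the default Xymon port 1984. The function names, the host `app01.example.com` and the test name `director` are hypothetical.

```python
import socket
from datetime import datetime

# Colours a test script would normally report
VALID_COLORS = {"green", "yellow", "red", "clear"}


def build_status(host: str, test: str, color: str, summary: str, now=None) -> str:
    """Build a Xymon status message for one host/test pair."""
    if color not in VALID_COLORS:
        raise ValueError(f"unsupported colour: {color}")
    machine = host.replace(".", ",")   # Xymon expects commas instead of dots
    stamp = (now or datetime.now()).strftime("%a %b %d %H:%M:%S %Y")
    return f"status {machine}.{test} {color} {stamp} {summary}\n"


def send_to_xymon(message: str, server: str, port: int = 1984, timeout: float = 10.0) -> None:
    """Send one message to the Xymon daemon (port 1984 is assumed here)."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(message.encode("utf-8"))
        conn.shutdown(socket.SHUT_WR)  # signal end of message
```

A script would call `build_status("app01.example.com", "director", "red", "no connection to director")` when its check fails and pass the result to `send_to_xymon`; alerting by email is then handled by the Xymon server's own alert configuration.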
Cheers
Jeremy
On 22 June 2017 at 07:07, Chad Rodriguez <CHrodriguez at petsmart.com> wrote:
We have no monitoring in place other than SolarWinds, which monitors heartbeats. Essentially, we have a few application servers that are randomly not communicating with the director, and we're having to reboot them. I'm wondering whether your application would alert on situations like this by email notification?
Respectfully,
Chad Rodriguez | Systems Administrator
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
Disregard, I thought your tool was specific to XenApp/Citrix-based application monitoring. Sorry for the bother, I'll look elsewhere for a solution.
Respectfully,
Chad Rodriguez | Systems Administrator