Separated disk alerts
Greetings
I don't know whether this is the right forum for this question, which is really more of a new feature request. However, I'm going to post it here and hope someone will point me in the right direction if this is the wrong place.
On to the issue at hand. At my company we use Xymon to monitor thousands of servers. Sometimes a disk fills up and generates an alert. Sometimes the alert won't get looked into for a couple of days, since our clients have specified that they want to remove data themselves (and not buy more storage). As several servers have 3 - 8 disks and/or partitions, we sometimes have trouble monitoring the other disks because one is already sending an alert.
Example: Server1 has 3 monitored partitions of a disk:
    /
    /srv/important/data
    /srv/database
with the same limits on all three:
    80% -> yellow alert
    90% -> red alert
If /srv/important/data reaches 82%, the client wants us to notify them and they will free up space. This normally takes around 3 - 5 working days. But they also want us to monitor /srv/database. Say that one day after /srv/important/data fills up, /srv/database reaches 84%. That will not trigger a new alert in the non-green status view, which is what we monitor, because the disk status of Server1 is already yellow.
The question/request is then this: is there a way to get the client to report each disk/partition as a separate alert, so that we can disable the alert for one disk while still receiving alerts for the other disks/partitions? To use the example, I want to be able to temporarily set /srv/important/data to disabled mode while still getting alerts from /srv/database.
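To make the effect concrete, here is a minimal Python sketch of the aggregation, assuming the disk column simply takes the worst color of all monitored filesystems. The function names are made up for illustration and the 80/90 limits come from the example above; this is a model of the behaviour, not Xymon code.

```python
# Illustrative only: models how a single status column hides a second problem.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def color_for(used_pct, warn=80, panic=90):
    """Color of one filesystem given its usage in percent."""
    if used_pct >= panic:
        return "red"
    if used_pct >= warn:
        return "yellow"
    return "green"


def column_color(usage):
    """Worst color over all filesystems, i.e. the single 'disk' status."""
    colors = (color_for(pct) for pct in usage.values())
    return max(colors, key=SEVERITY.get, default="green")


# Day 1: only /srv/important/data is over the yellow limit.
day1 = {"/": 40, "/srv/important/data": 82, "/srv/database": 70}
# Day 2: /srv/database crosses the limit too, but the column color is unchanged.
day2 = {"/": 40, "/srv/important/data": 82, "/srv/database": 84}

print(column_color(day1))  # yellow
print(column_color(day2))  # yellow
```

Both days give the same column color, so nothing new shows up in a view that only tracks color changes.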
I know this could be solved by writing my own script for the client, but management disapproved of that, as they want as few custom scripts to maintain as possible (we already have dozens of custom scripts).
All help and feedback is appreciated, thanks.
Kind regards Calle Lejdbrandt
This line of thinking can also be applied to services and processes, which is something I would also like to be able to do.
Say you monitor a server for serviceA, serviceB, and serviceC. If serviceC stops, but on purpose or because of a known issue, and you acknowledge the alert, will a later stop of serviceA or serviceB then also be ignored?
Thanks, John Upcoming PTO: None
John Rothlisberger IT Strategy, Infrastructure & Security - Technology Growth Platform TGP for Business Process Outsourcing Accenture 312.693.3136 office
-----Original Message----- From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of Aiquen Sent: Friday, February 15, 2013 2:15 PM To: xymon at xymon.com Subject: [Xymon] sepparated disk alerts
[...]
Xymon mailing list Xymon at xymon.com http://lists.xymon.com/mailman/listinfo/xymon
I have a similar problem: systems with OS partitions, database partitions and application partitions. I'd like to be able to send alerts to the appropriate groups when they fill up. What I have tried so far is a specialized alert script that examines the alert message and picks out the filesystems flagged yellow or red, so that emails go to the correct responders. It's not ideal, but it works. It is sorely in need of some kind of config file to map system:filesystem to email address.
Ralph Mitchell
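A config-driven version of that lookup could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the one-mapping-per-line config format, the `*` wildcard for "any host", the example addresses, and the shape of the flagged lines in the alert message (`&red /srv/database (95% used) ...`). It only resolves who should be told about which filesystem; sending anything is left out.

```python
import re

# Hypothetical config format, one mapping per line: host:filesystem address
EXAMPLE_CONFIG = """\
# host:filesystem              responder
server1:/srv/database          dba-team@example.com
server1:/srv/important/data    app-team@example.com
*:/                            os-team@example.com
"""

# Assumed shape of a flagged line: "&red /srv/database (95% used) ..."
FLAGGED = re.compile(r"^&(yellow|red)\s+(\S+)", re.MULTILINE)


def load_mapping(text):
    """Parse 'host:filesystem address' lines, skipping blanks and comments."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, address = line.split()
        host, filesystem = key.split(":", 1)
        mapping[(host, filesystem)] = address
    return mapping


def flagged_filesystems(message):
    """Return (color, filesystem) pairs for lines flagged yellow or red."""
    return [(m.group(1), m.group(2)) for m in FLAGGED.finditer(message)]


def recipients(host, message, mapping):
    """Map each flagged filesystem to (color, address); '*' matches any host."""
    result = {}
    for color, filesystem in flagged_filesystems(message):
        address = mapping.get((host, filesystem)) or mapping.get(("*", filesystem))
        if address:
            result[filesystem] = (color, address)
    return result


message = """\
&yellow /srv/important/data (82% used) has reached the WARNING level (80%)
&red /srv/database (95% used) has reached the PANIC level (90%)
"""
print(recipients("server1", message, load_mapping(EXAMPLE_CONFIG)))
```

Filesystems without a matching entry are silently skipped here; a real script would probably want a default responder instead.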
analysis.cfg
    HOST=myhost
        DISK /srv/database GROUP=A
        DISK /srv/important/data GROUP=B
alerts.cfg
    GROUP=A COLOR=red
        MAIL groupA
    GROUP=B COLOR=red
        MAIL groupB
This might be a start.
-- Asif Iqbal PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing?
Hi. Thanks for the suggestions. Oyvind's suggestion seems close to what I need. I only need to find a way to keep track of which nodes have had their limits raised. I am not allowed to change the alert levels without making sure that the levels get dropped down again after a certain time. This is laid down as a rule by the higher-ups in my company.
What Asif and Ralph suggest is also good, but it would breach the rule that "NO mails may be automatically sent from xymon anywhere", which has been laid down on me.
Thank you for the good suggestions, and sorry for saying that they don't work for me. I want to point out that they will probably work technically to solve the problem, but I am not allowed to implement them because of the set of rules I have to follow, which I mentioned in my first post. That is why I called this more of a new feature request than an actual issue. Pretty much the only solution acceptable to the higher-ups is a new client release with the feature to separate disk alerts per monitored disk, and then the ability to disable an alert or raise its limit within a time boundary, since we have to be able to guarantee that the alert will come back and remind us if no one looks into it for a certain amount of time. Maybe it is worth mentioning that the "disable until ok" feature is disabled in our Xymon, and that you cannot disable an alert for more than 28 days, to make sure that no alert ever gets neglected.
Again, thank you for your time. All suggestions are appreciated. I will try to work on something along the lines of Oyvind's suggestion.
Kind Regards Calle Lejdbrandt
On Tue, Feb 19, 2013 at 5:02 AM, Asif Iqbal <vadud3 at gmail.com> wrote:
[...]
On 20/02/13 15:00, Aiquen wrote:
[...]
Perhaps the feature request could be along the following lines...
It is currently possible to add a &<color> marker to a status message. Generally this is done at the beginning of some of the lines to indicate the status of that component; this can be seen in the procs column, for example.
It would be helpful to be able to disable an individual line instead of the entire procs status, with something like: somehost.procs.crond
Here the value "crond" is the first "word" after the &red. This would allow the procs column to have an overall status of green (or blue) until the disable expires or another line of the procs status changes to red.
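As a rough illustration of that idea, the Python sketch below recomputes the overall color of a status message while skipping lines whose key (host.column.first-word) is currently disabled. The `disabled` dictionary and the function name are hypothetical, nothing like them exists in Xymon today, and the sample status lines are invented.

```python
import re
import time

SEVERITY = {"green": 0, "yellow": 1, "red": 2}
# A status line that starts with "&<color>", followed by the component name.
MARKED = re.compile(r"^&(green|yellow|red)\s+(\S+)")


def effective_color(host, column, message, disabled, now=None):
    """Worst color over all marked lines, skipping disabled components.

    `disabled` maps a key such as "somehost.procs.crond" to the Unix time
    at which that disable expires (a hypothetical store, not a Xymon one).
    """
    now = time.time() if now is None else now
    worst = "green"
    for line in message.splitlines():
        match = MARKED.match(line)
        if not match:
            continue  # plain text line without a color marker
        color, component = match.groups()
        key = f"{host}.{column}.{component}"
        if disabled.get(key, 0) > now:
            continue  # still disabled: this line does not count
        if SEVERITY[color] > SEVERITY[worst]:
            worst = color
    return worst


message = "&red crond - not running\n&green sshd - 1 instance running\n"
disabled = {"somehost.procs.crond": 2000}  # disable expires at t=2000

print(effective_color("somehost", "procs", message, disabled, now=1000))  # green
print(effective_color("somehost", "procs", message, disabled, now=3000))  # red
```

Once the disable has expired the crond line counts again, and a different line turning red is never masked by it.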
In the meantime, you could implement this on the Hobbit server by listening to the client reports being sent in, parsing and processing the disk column as required, and then setting the color for that column as needed.
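A very small sketch of the processing step, in Python, might look like this. It assumes the disk data arrives as POSIX-style `df -P` output and uses the 80%/90% limits and the 28-day maximum disable period mentioned earlier in this thread; how the report reaches the script and how the resulting color is sent back are deliberately left out, and all function names are made up.

```python
from datetime import datetime, timedelta

MAX_DISABLE = timedelta(days=28)  # upper bound mentioned earlier in this thread
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def parse_df(df_text):
    """Return {mount point: percent used} from POSIX-style `df -P` output."""
    usage = {}
    for line in df_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue
        usage[fields[5]] = int(fields[4].rstrip("%"))
    return usage


def disable(disabled, mount, start, duration):
    """Record a time-limited disable, refusing anything over MAX_DISABLE."""
    if duration > MAX_DISABLE:
        raise ValueError("disable period exceeds 28 days")
    disabled[mount] = start + duration


def disk_color(df_text, disabled, now, warn=80, panic=90):
    """Worst color over all mount points that are not currently disabled."""
    worst = "green"
    for mount, pct in parse_df(df_text).items():
        if mount in disabled and disabled[mount] > now:
            continue  # disable still active for this mount point
        color = "red" if pct >= panic else "yellow" if pct >= warn else "green"
        if SEVERITY[color] > SEVERITY[worst]:
            worst = color
    return worst


DF = """\
Filesystem     1024-blocks     Used Available Capacity Mounted on
/dev/sda1         10000000  4000000   6000000      40% /
/dev/sdb1         10000000  8200000   1800000      82% /srv/important/data
/dev/sdc1         10000000  7000000   3000000      70% /srv/database
"""

start = datetime(2013, 2, 15)
disabled = {}
disable(disabled, "/srv/important/data", start, timedelta(days=5))

print(disk_color(DF, disabled, now=start + timedelta(days=1)))  # green
print(disk_color(DF, disabled, now=start + timedelta(days=6)))  # yellow
```

Because every disable carries an expiry time, the alert comes back on its own if nobody has freed up space by then.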
This is probably a pretty involved job, especially to generalise it to the point where it can be useful for any column, and for more than one reason, but it could definitely add a lot of value. procs, disk and ports are just a few of the columns that are heavily overloaded (i.e., one column that relates to many services or meanings). It is not helpful to have 100 columns per host, but it is helpful to be able to control enable/disable, alerts and similar per item within these columns.
Just my thoughts on this; perhaps they will give some ideas to someone inclined to write some code. At the end of the day, I'd suggest that the only way to get this feature added (unless Henrik wants it himself) is to write the code and submit it. That way it is easy to add the code (as long as it doesn't change things in an incompatible way).
Regards, Adam
-- Adam Goryachev Website Managers www.websitemanagers.com.au
Hi, is it possible to use report.sh for a specific group of hosts and a specific test? I would like to use it similarly to the old BB script sla-report.sh. Marco
participants (6)
- aiqueneldar@gmail.com
- john.r.rothlisberger@accenture.com
- mailinglists@websitemanagers.com.au
- marco.avvisano@regione.toscana.it
- ralphmitchell@gmail.com
- vadud3@gmail.com