Hey all, I have just built a new Xymon box and am wondering if there is a good white paper or guide out there on setting up Devmon?
-Gavin
Gavin Leonard
I've been following this DEVMON guide, works fine for data collection:
http://en.wikibooks.org/wiki/System_Monitoring_with_Xymon/Other_Docs/HOWTO/D...
It doesn't explain how to generate graphs for SNMP data, but it is enough to get you started (I'm still stuck on generating interface graphs).
Kevin
If you are using the latest Xymon, then setting up Devmon graphs is a breeze. Follow the doc you get with Devmon, but you won't need to do the RRD capture parts because those are already built into Xymon.
......Bruce
Bruce White Senior Enterprise Systems Engineer | Phone: 630-671-5169 | Fax: 630-893-1648 | bewhite at fellowes.com | http://www.fellowes.com/
-----Original Message----- From: K K [mailto:kkadow at gmail.com] Sent: Tuesday, November 10, 2009 4:15 PM To: hobbit at hswn.dk Subject: Re: [hobbit] Devmon setup
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
On Tue, Nov 10, 2009 at 14:15, K K <kkadow at gmail.com> wrote:
I've been following this DEVMON guide, works fine for data collection:
http://en.wikibooks.org/wiki/System_Monitoring_with_Xymon/Other_Docs/HOWTO/D...
It doesn't explain how to generate graphs for SNMP data, but it is enough to get you started (I'm still stuck on generating interface graphs).
I've added some documentation for generating graphs, have at it. I spent a long time trawling through mailing list archives before realizing the presence of docs/GRAPHING (docs/TEMPLATES doesn't mention it), and even then there were a few pitfalls along the way.
On Tuesday, 10 November 2009 23:15:19 K K wrote:
I've been following this DEVMON guide, works fine for data collection:
http://en.wikibooks.org/wiki/System_Monitoring_with_Xymon/Other_Docs/HOWTO/ Devmon_SNMP
I don't like this guide. It makes things more difficult than they need to be, and it is outdated. I have now added a warning section covering why other sources of documentation should be considered.
I enabled a wiki on the Devmon SF site a while back, and added some basic information there.
http://sourceforge.net/apps/mediawiki/devmon
Feel free to improve it. (I think it currently requires you to log in with a SourceForge account. If there is enough interest to warrant the effort of guarding against spam and malicious edits, I will allow anonymous edits.)
It doesn't explain how to generate graphs for SNMP data, but it is enough to get you started (I'm still stuck on generating interface graphs).
RRD collection should work out of the box on Xymon 4.2.3. To get graphs for the RRD files, just copy extras/devmon-graph.cfg into your Xymon hobbitgraph definitions (since I have 'directory /etc/xymon/hobbitgraph.d' in my hobbitgraph.cfg, I just copy the file into /etc/xymon/hobbitgraph.d).
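The copy step described above can be sketched as a small shell function. The file name extras/devmon-graph.cfg and the directory /etc/xymon/hobbitgraph.d come from the message; the function name and the Devmon install location in the usage note are assumptions for illustration only.

```shell
#!/bin/sh
# install_devmon_graphs SRC_CFG DEST_DIR   (hypothetical helper name)
# Copies Devmon's graph definitions into the directory that
# hobbitgraph.cfg pulls in through its 'directory' line.
install_devmon_graphs() {
    src_cfg=$1
    dest_dir=$2
    if [ ! -f "$src_cfg" ]; then
        echo "missing graph definitions: $src_cfg" >&2
        return 1
    fi
    if [ ! -d "$dest_dir" ]; then
        echo "missing graph directory: $dest_dir" >&2
        return 1
    fi
    cp "$src_cfg" "$dest_dir/"
}
```

It might be called as `install_devmon_graphs /path/to/devmon/extras/devmon-graph.cfg /etc/xymon/hobbitgraph.d`, where the first path depends on where Devmon was unpacked.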
I will try and update the wiki accordingly.
Regards, Buchan
Hello
Some time ago I wrote about Devmon stopping when a monitored device is not responding. I have now seen that it has nothing to do with unresponsive devices: Devmon stops working at irregular intervals. I set Devmon to verbose and looked at the Devmon log. There are simply no more messages once it stops working (see below). No error messages, nothing, neither in the Devmon log nor in the syslog.
If I do a "ps -ef" I see all devmon processes running:
[root at s068a300 devmon]# ps -ef |grep devmon
hobbit   10211     1  0 Nov09 ?        00:10:07 devmon[master]
hobbit   10214 10211  0 Nov09 ?        00:00:22 devmon
hobbit   10215 10211  0 Nov09 ?        00:00:21 devmon
hobbit   10217 10211  0 Nov09 ?        00:00:22 devmon
hobbit   10218 10211  0 Nov09 ?        00:01:52 devmon
hobbit   10219 10211  0 Nov09 ?        00:00:21 devmon
hobbit   10220 10211  0 Nov09 ?        00:01:51 devmon
hobbit   10221 10211  0 Nov09 ?        00:01:52 devmon
hobbit   10222 10211  0 Nov09 ?        00:00:00 devmon
hobbit   10223 10211  0 Nov09 ?        00:00:00 devmon
root     20447  3611  0 14:47 pts/1    00:00:00 grep devmon
Any idea how I can find out why Devmon stops working, and what the processes are doing when they are stuck? If I send a SIGTERM to the Devmon master process, it stops all the other processes, so it looks like it is responding to signals as it should.
BTW: does anyone have a Devmon startup/shutdown script that works on SuSE EL?
Thorsten Erdmann
Attachment: here are the last few lines of the Devmon log
[09-11-10 at 10:52:21] Performing test logic
[09-11-10 at 10:52:21] Done with test logic
[09-11-10 at 10:52:21] Sending messages to display server
[09-11-10 at 10:52:21] Done sending messages
[09-11-10 at 10:52:21] Sleeping for 59 seconds.
[09-11-10 at 10:53:20] Starting snmp queries
[09-11-10 at 10:53:20] Getting device status from hobbit at localhost:1984
[09-11-10 at 10:53:20] Querying u068usv020a1 for tests battery,powerin,power,diag,temperature,msgs
[09-11-10 at 10:53:20] Querying u068usv020a2 for tests battery,powerin,power,diag,temperature,msgs
[09-11-10 at 10:53:20] Querying u068usv020b1 for tests battery,powerin,power,diag,temperature,msgs
[09-11-10 at 10:53:20] Querying u068usv020b2 for tests battery,powerin,power,diag,temperature,msgs
[09-11-10 at 10:53:20] Querying u068usv110111 for tests power,temperature
[09-11-10 at 10:53:20] Querying u068usvnw1111 for tests power,temperature
[09-11-10 at 10:53:20] Querying u068usvnw1112 for tests power,temperature
[09-11-10 at 10:53:20] Querying u068usvnw1211 for tests power,temperature
[09-11-10 at 10:53:21] Performing test logic
[09-11-10 at 10:53:21] Done with test logic
[09-11-10 at 10:53:21] Sending messages to display server
[09-11-10 at 10:53:21] Done sending messages
[09-11-10 at 10:53:21] Sleeping for 59 seconds.
[09-11-10 at 10:54:20] Starting snmp queries
[09-11-10 at 10:54:20] Getting device status from hobbit at localhost:1984
[09-11-10 at 10:54:20] Querying u068usv020a1 for tests battery,powerin,power,diag,temperature,msgs
[09-11-10 at 10:54:21] Querying u068usv020a2 for tests battery,powerin,power,diag,temperature,msgs
[09-11-10 at 10:54:21] Querying u068usv020b1 for tests battery,powerin,power,diag,temperature,msgs
[09-11-10 at 10:54:21] Querying u068usv020b2 for tests battery,powerin,power,diag,temperature,msgs
[09-11-10 at 10:54:21] Querying u068usv110111 for tests power,temperature
[09-11-10 at 10:54:21] Querying u068usvnw1111 for tests power,temperature
[09-11-10 at 10:54:21] Querying u068usvnw1112 for tests power,temperature
[09-11-10 at 10:54:21] Querying u068usvnw1211 for tests power,temperature
If you are not the intended addressee, please inform us immediately that you have received this e-mail in error, and delete it. We thank you for your cooperation.
I've got the same problem. Just had to restart after having it working for about 48 hours.
I added devmon (0.3.1-beta1) to the mix only a few weeks ago and am running it on Ubuntu (desktop 8.10) along with Xymon 4.2.3 (running for about 6 months). On a side note, the RRD graphing works quite well for connects, cpu, if_load, and memory.
To kill it I run "sudo killall devmon", and it goes from purple to green again without my running anything else.
To get devmon running in the first place I added the following to hobbitlaunch.cfg. (I'm not sure this is the "proper" way to handle it, and it almost seems too easy, but it starts when I start Xymon.)
hobbitlaunch.cfg
...
[devmon]
    CMD $BBHOME/ext/devmon/devmon

[devmonreload]
    CMD $BBHOME/ext/devmon/devmon --readbbhosts
    INTERVAL 5m
...
I've seen others post that they have cron jobs daily or even more often to restart devmon but I wish that wasn't required.
Greg
From: thorsten.erdmann at daimler.com [mailto:thorsten.erdmann at daimler.com] Sent: Wednesday, November 11, 2009 8:58 AM To: hobbit at hswn.dk Subject: [hobbit] DEVMON stops working every now and then
I'm assuming that when Devmon quits working everything goes purple. If that is the case, someone posted this a while back as a workaround. It works perfectly, and I haven't had to touch Devmon since.
hobbit-alerts.cfg
# Restart Devmon on purple
HOST=NOC COLOR=purple SERVICE=dm
SCRIPT /usr/local/hobbit/server/ext/restart-devmon.sh 1234567890
<<
restart-devmon.sh
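#!/bin/sh
# Custom script to restart Devmon on purple

# Ask every running Devmon (perl) process to terminate.
ps -ax | grep devm | grep perl | awk '{print $1}' | xargs kill
# Give the processes time to exit, then repeat the kill for any stragglers.
sleep 60
ps -ax | grep devm | grep perl | awk '{print $1}' | xargs kill
sleep 10
# Start a fresh Devmon instance.
/usr/local/hobbit-devmon/devmon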
<<
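One weakness of the pipeline in the script above is that `xargs kill` still runs `kill` when nothing matched, which produces a usage error. The following is a minimal sketch of the same PID selection done in a single awk step, with the kill guarded; it is not the poster's script. The function names are hypothetical, and the `devm`/`perl` patterns simply mirror the original greps. Note that the `ps -ef` output earlier in the thread shows the processes as plain `devmon`, so the patterns may need adjusting on other systems.

```shell
#!/bin/sh
# devmon_pids: reads `ps -ax` style output on stdin and prints the PID
# (first column) of every line that mentions both "devm" and "perl".
# The bracketed patterns keep this awk process from matching its own
# command line when it appears in the ps listing.
devmon_pids() {
    awk '/[d]evm/ && /[p]erl/ { print $1 }'
}

# kill_devmon: sends SIGTERM only when at least one PID was found,
# so kill is never invoked with an empty argument list.
kill_devmon() {
    pids=$(ps -ax | devmon_pids)
    if [ -n "$pids" ]; then
        # Intentionally unquoted: one argument per PID.
        kill $pids
    fi
}
```

In the restart script, each `ps ... | xargs kill` line could then be replaced by a call to `kill_devmon`.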
-Clint
We have the same problem. I've even got devmon configured under SMF on Solaris; however, SMF doesn't pick up the fact that it has crashed, as the process is still there.
A quick and dirty workaround we have is to send an alert on the "dm" monitor going purple - this allows the on-call engineer to be alerted to the fact we are no longer effectively monitoring the network devices and so to restart the process!
There must be a better way though...
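Because the symptom reported earlier in the thread is that the Devmon log simply stops receiving messages while the processes stay alive, one possible check is the age of the log file rather than the presence of a process. This is only a sketch under that assumption: the function name, the log path, and the 5-minute threshold are hypothetical, and `find -mmin` is a widely available extension (GNU, BSD, BusyBox) rather than strict POSIX.

```shell
#!/bin/sh
# devmon_log_is_stale LOG_FILE MAX_AGE_MINUTES   (hypothetical helper)
# Succeeds (exit 0) when the log is missing or has not been written to
# for more than MAX_AGE_MINUTES; fails (exit 1) when it is fresh.
devmon_log_is_stale() {
    log_file=$1
    max_age_minutes=$2
    # A missing log is treated as stale.
    [ -f "$log_file" ] || return 0
    # find prints the path only if the file was last modified more than
    # max_age_minutes ago.
    stale=$(find "$log_file" -mmin +"$max_age_minutes" 2>/dev/null)
    [ -n "$stale" ]
}
```

From cron, something like `devmon_log_is_stale /var/log/devmon.log 5 && /path/to/restart-devmon.sh` would restart Devmon after several missed polling cycles; both paths are placeholders, and this only works if Devmon is logging every cycle, as it does in the verbose log shown earlier.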
On Wednesday, 11 November 2009 22:37:56 j.sansford at ntlworld.com wrote:
We have the same problem - I've even got devmon configured under SMF in Solaris however it doesn't pick up the fact its crashed as the process is still there.
It doesn't crash. As far as I can tell, eventually all the child processes lose communication with the master process, but they are all still running, just waiting for someone to tell them to do something.
A quick and dirty workaround we have is to send an alert on the "dm" monitor going purple - this allows the on-call engineer to be alerted to the fact we are no longer effectively monitoring the network devices and so to restart the process!
There must be a better way though...
Devmon has had "goes purple" problems since 0.2.2 beta. I fixed the more frequent one before the 0.3.0 release.
Anyway, I've done some work on this, however the only production instance of devmon I look at often at present last went purple 9 days ago ...
If you are reproducing more frequently, please have a look at the devmon-devel mailing list (or archives[1] once they have updated), I just sent a mail with an attached patch (against svn, it may apply to the 0.3.1-beta1, haven't tried) that may fix the problem, allow us to narrow it down further, or at least eliminate one aspect as the cause.
Regards, Buchan
On Wednesday, 11 November 2009 17:24:23 Gregory Thomas wrote:
I've got the same problem. Just had to restart after having it working for about 48 hours.
I have added devmon (0.3.1-beta1) to the mix only a few weeks ago and am running it on ubuntu (desktop 8.10) along with xymon 4.2.3 (running about 6 months). On a side note, the rrd graphing works quite well for connects, cpu, if_load, and memory.
to kill it I run "sudo killall devmon" and it goes from purple to green again without running anything else.
To get devmon running in the first place I've added the following to hobbitlaunch.cfg: (I'm not sure this is the "proper" way to handle and almost seems to too easy but it starts when I start xymon.)
Since devmon doesn't actually require a Xymon installation, the usual way to start it is from an init script. An example is supplied in the 'extras' directory. This is covered in item 6 of the docs/INSTALLATION file.
hobbitlaunch.cfg
...
[devmon]
    CMD $BBHOME/ext/devmon/devmon

[devmonreload]
    CMD $BBHOME/ext/devmon/devmon --readbbhosts
    INTERVAL 5m
...
I've seen others post that they have cron jobs daily or even more often to restart devmon but I wish that wasn't required.
Don't we all. I fixed the more frequent cause of the 'devmon goes purple' problem before the release of 0.3.0 final, but this one is a bit more difficult to reproduce, and thus troubleshoot. So far it looks like the sockets used for communication between the master and the children stop responding.
Again, this really belongs on the devmon mailing list, or in a bug report.
Regards, Buchan
Hello
Is there a Windows bb equivalent that supports the complete command set supported under Linux? I need to query for all non-green hosts, including those that are filtered from the "all non green" view. So I think
bb 127.0.0.1 "hobbitdboard color=yellow fields=hostname,testname,lastchange,color,msg"
bb 127.0.0.1 "hobbitdboard color=red fields=hostname,testname,lastchange,color,msg"
is the thing I need. But bbwincmd does not support this.
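The two queries differ only in the color, so on a system where the Unix `bb` client is available they can be generated in a loop. A minimal sketch follows; the function and variable names are made up for illustration, while the `hobbitdboard` request syntax and field list are copied from the commands above.

```shell
#!/bin/sh
# Field list taken from the commands quoted above.
DBOARD_FIELDS="hostname,testname,lastchange,color,msg"

# dboard_query COLOR: prints the hobbitdboard request for one color.
dboard_query() {
    printf 'hobbitdboard color=%s fields=%s\n' "$1" "$DBOARD_FIELDS"
}

# non_green_queries COLOR...: prints one request per color given.
non_green_queries() {
    for color in "$@"; do
        dboard_query "$color"
    done
}

# Example use (requires the bb client and a reachable server):
#   non_green_queries yellow red | while IFS= read -r query; do
#       bb 127.0.0.1 "$query"
#   done
```

Keeping the query builder separate from the `bb` call makes it easy to add further colors without repeating the field list.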
Thorsten Erdmann
If you are not the intended addressee, please inform us immediately that you have received this e-mail in error, and delete it. We thank you for your cooperation.
In <OF613B132B.C0CF3EE1-ONC125766C.003ACF8C-C125766C.003B176D at dcx.dcx> thorsten.erdmann at daimler.com writes:
is there a windows bb equivalent which supports the complete commandset which is supported under Linux.
The easiest way of getting this is to build the Unix Xymon tools with Cygwin. You can grab such a tool from http://www.hswn.dk/~henrik/bbutility.zip
Regards, Henrik
Hi Henrik
thanks, that is what I am looking for. Works nice.
Thorsten Erdmann
From: henrik at hswn.dk Sent: 13.11.2009 12:06 Reply-To: hobbit at hswn.dk To: hobbit at hswn.dk Subject: Re: [hobbit] bbwincmd with hobbitdboard support
Thanks Buchan.
To close this thread off here and give others a chance to follow if this thread turns up in a search later, a bug has been created on the sourceforge project page.
-----Original Message----- From: Buchan Milne [mailto:bgmilne at staff.telkomsa.net] Sent: Thursday, November 12, 2009 3:06 AM To: hobbit at hswn.dk Cc: Gregory Thomas Subject: Re: [hobbit] DEVMON stops working every now and then
On Wednesday, 11 November 2009 14:57:47 thorsten.erdmann at daimler.com wrote:
Hello
some time ago I already talked about devmon stops working when a monitored device ist not responding. Now I saw it has nothing to do with non responsive devices. Devmon stops working at irregular intervals. I set Devmon to verbose and looked at the devmon log. I saw that there are simply no more messages when it stops working (see below). No error messages - nothing. None in the devmon log nor in the syslog.
Please discuss this on the devmon mailing list, as this has absolutely nothing to do with Xymon/Devmon.
This is a known issue, which we (David Baldwin and I) are currently investigating actively. However, no one has bothered to file a bug, so this investigation is occurring off-list. If an affected user filed a bug, we would use the bug for the investigation, and any interested users could subscribe to the bug to follow it.
Regards, Buchan
participants (10): bewhite@fellowes.com, bgmilne@staff.telkomsa.net, csimmons@ApproSystems.com, gleonard@progrexion.com, goldfndr@gmail.com, GThomas@fairdinkum.com, henrik@hswn.dk, j.sansford@ntlworld.com, kkadow@gmail.com, thorsten.erdmann@daimler.com