Hi,
I have used the good old bb-prtdiag for many years. It works fairly well for a lot of old Sun machines but lacks support for the new ones. The script is pretty nasty, with nested loops and too many ifs, so it is not easy to update it to accommodate new machines.
Does anyone out there have a newer version of it? I can't find one on Xymonton. Or do you monitor Sun hardware some other way than prtdiag -> Xymon server?
Otherwise I will rewrite it to support our new hardware.
- Roland
I cobbled something together at $PREVIOUS_CONTRACT, and I might be able to get hold of it later this evening. I will have a look. It's really simple these days because of the Solaris picld (see man picld for more info).
In its simplest form, the test runs prtdiag and, thanks to how picld works with prtdiag, you can derive a status from $?: zero means no fault was reported, non-zero means something is wrong.
Simple but effective. Xymon will tell you there is a hardware problem, but it cannot tell you exactly what the problem is. That's best done using a human brain, version 1.00 or better.
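As a minimal sketch of that idea: the exit status can be turned into a Xymon colour with a few lines of shell. The function name prtdiag_color is made up for illustration, and treating every non-zero status as red is an assumption.

```shell
#!/bin/sh
# Hypothetical helper, not taken from any posted script: turn the exit
# status of prtdiag into a Xymon colour.
prtdiag_color() {
    # $1 is the exit status captured from $? right after prtdiag ran
    if [ "$1" -eq 0 ]; then
        echo "green"
    else
        echo "red"    # assumption: any non-zero status counts as a fault
    fi
}

# Typical use on a Solaris box (commented out so the sketch runs anywhere):
#   /usr/platform/`uname -i`/sbin/prtdiag -v > /tmp/prtdiag.out 2>&1
#   COLOR=`prtdiag_color $?`
```

The colour is the only thing derived here; the prtdiag output itself still has to be read by a person to find out what actually failed.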
Regards Vernon
Hello,
Here is a very simple script that works pretty well:
#!/bin/ksh

COLUMN="prtdiag"    # Name of the column, often same as tag in bb-hosts
ERROR=0

OK_COLOR="green"
KO_COLOR="red"

OK_MSG="<PRE>&${OK_COLOR} All is OK Boss."
KO_MSG="<PRE>&${KO_COLOR} Hardware Issue on the machine"

TMP=$BBTMP/prtdiag.tmp
if [ -z "${BBTMP}" ]    # Fall back to /tmp when BBTMP is not set
then
    TMP=/tmp/prtdiag.tmp
fi

##############
# Custom test
##############
PLATFORM=`uname -i`

/usr/platform/${PLATFORM}/sbin/prtdiag -v > $TMP 2>&1
CODE_RETOUR=$?    # Exit status of prtdiag

if [ $CODE_RETOUR -ne 0 ]
then
    ERROR=1
fi

##############
# Sending the message
##############
if [ $ERROR -eq 0 ]
then
    COLOR=$OK_COLOR
    MSG="$OK_MSG
`cat $TMP`
</PRE>"
else
    COLOR=$KO_COLOR
    MSG="$KO_MSG
`cat $TMP`
</PRE>"
fi

# BB      env var for the bb command
# BBDISP  env var for the XYMON srv
# MACHINE env var for the XYMON client
$BB $BBDISP "status $MACHINE.$COLUMN $COLOR `date` $MSG"

exit 0
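The choice of temporary file location in a script like this can also be written with the shell's default-value expansion. This is only a sketch: the function name tmp_path is hypothetical, and /tmp as the fallback directory is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: build the temp file path from a directory,
# falling back to /tmp when the directory argument is empty or unset.
# ${VAR:-default} is standard in ksh and POSIX sh.
tmp_path() {
    echo "${1:-/tmp}/prtdiag.tmp"
}

# Use BBTMP when the client environment provides it, /tmp otherwise.
TMP=`tmp_path "$BBTMP"`
echo "prtdiag output would be written to $TMP"
```

Testing the directory variable rather than the finished path matters, because the finished path always contains at least "/prtdiag.tmp" and is therefore never empty.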
Regards,
Gautier BEGIN
From: Roland Soderstrom <rolands at logicaltech.com.au>
To: <xymon at xymon.com>
Date: 10/11/2011 04:11 AM
Subject: [Xymon] Solaris prtdiag
Hi Roland
This is what I am using. I made a mistake though: I did not write it. I must have confused it with another script I wrote. (I confuse easily.) Credit where due, though: credit goes to Wim Olivier (see comments). It also looks like it does pretty much the same as Gautier's script.
The only issue you may have is a bug in the picl daemon: it sometimes crashes. (Recent patching may have fixed this.) When that happens, all bets are off, and results could vary. Restarting the picl service normally fixes everything. You might want to add /system/picl to the list of monitored services. There are two good SMF monitoring scripts on the Xymonton page: http://www.xymonton.org/monitors:smf.sh and http://www.xymonton.org/monitors:smf2.ksh I prefer the second one, obviously :-) but the first one is simpler to configure. The choice is yours.
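For a quick check of the picl service outside of those SMF monitors, something along these lines might do. It is a sketch under assumptions: the helper name picl_color is invented, the FMRI svc:/system/picl:default is assumed to be the one in use on the host, and treating every state other than "online" as red is a simplification.

```shell
#!/bin/sh
# Hypothetical helper: map an SMF service state to a Xymon colour.
picl_color() {
    case "$1" in
        online) echo "green" ;;
        *)      echo "red" ;;    # offline, maintenance, disabled, empty...
    esac
}

# svcs -H -o state prints just the state column without a header.
# Guarded with "type" so the sketch still runs on hosts without SMF.
if type svcs >/dev/null 2>&1; then
    STATE=`svcs -H -o state svc:/system/picl:default 2>/dev/null`
    COLOR=`picl_color "$STATE"`
    echo "picl is ${STATE:-unknown}: $COLOR"
fi
```

If the state is anything other than online, the prtdiag result on that host should probably not be trusted until the service has been restarted.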
Regards Vernon
#!/usr/bin/ksh
#
# PURPOSE
# This is a very simple script/test, but extremely useful.
# It detects ANY hardware failure in Sun SPARC systems by using the standard
# 'prtdiag' command. Prtdiag exits with a value of
# '1' if there is a hardware error, otherwise with an exit code of '0'.
# This way, no model/platform specific (e.g. V240/V890/15K, etc.)
# customizations are required, as the output of 'prtdiag'
# differs on most systems. It works fine on Fujitsu-Siemens systems too.
#
# Provided by: Wim Olivier, Senior Solaris/VERITAS Engineer, AL Indigo,
# Johannesburg, South Africa
#
# sunhw
#
# INSTALLATION
# 1. Copy the script to $BBHOME/ext/sunhw.sh
# 2. Add it to the $BBHOME/etc/bb-bbexttab file
# 3. Restart the BB client

BBPROG=sunhw; export BBPROG
TEMPFILE=/$BBTMP/sunwh.OUTPUT.$$
TEST="sunhw"
COLOR="green"

if [ "$BBHOME" = "" ]
then
    echo "BBHOME is not set... exiting"
    exit 1
fi

if [ ! "$BBTMP" ]    # GET DEFINITIONS IF NEEDED
then
    # echo "*** LOADING BBDEF ***"
    . $BBHOME/etc/bbdef.sh    # INCLUDE STANDARD DEFINITIONS
fi

PANIC="1"    # GO RED AND PAGE AT THIS LEVEL

PLATFORM=`uname -i`
/usr/platform/$PLATFORM/sbin/prtdiag -v > $TEMPFILE
RESULT=$?
#echo $RESULT

#
# DETERMINE RED/YELLOW/GREEN
#
if [ "$RESULT" -ne 0 ]
then
    COLOR="red"
fi

#
# AT THIS POINT WE HAVE OUR RESULTS. NOW WE HAVE TO SEND IT TO
# THE BBDISPLAY TO BE DISPLAYED...
#
MACHINE=`uname -n`

#
# THE FIRST LINE IS STATUS INFORMATION... STRUCTURE IMPORTANT!
# THE REST IS FREE-FORM - WHATEVER YOU'D LIKE TO SEND...
#
LINE="status $MACHINE.$TEST $COLOR `date`
`cat $TEMPFILE`"

$RM -f $TEMPFILE

#
# NOW USE THE BB COMMAND TO SEND THE DATA ACROSS
#
$BB $BBDISP "$LINE"    # SEND IT TO BBDISPLAY
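The structure of the status message (a fixed first line, then free-form text) can be tried out without a Xymon server or Sun hardware. The sketch below only builds the text; build_status and the demo values are hypothetical, and the blank line after the first line is a layout choice rather than something the script above does.

```shell
#!/bin/sh
# Hypothetical helper: build the text of a status message.
# Arguments: host name, test/column name, colour, file with the report body.
build_status() {
    # First line: "status <host>.<test> <colour> <timestamp>"
    printf 'status %s.%s %s %s\n\n' "$1" "$2" "$3" "`date`"
    # Everything after the first line is free-form
    cat "$4"
}

# Demo with made-up values. A real client would hand the result to the
# sender, for example: $BB $BBDISP "`build_status ...`"
BODY=/tmp/build_status_demo.$$
echo "System Configuration: example output" > "$BODY"
build_status myhost sunhw green "$BODY"
rm -f "$BODY"
```

Printing the message first makes it easy to see what would be sent before pointing the script at a live display server.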
Why not add it to the Xymonton script repository?
Regards Vernon
On 18 October 2011 04:48, Roland Soderstrom <rolands at logicaltech.com.au> wrote:
Finally I got the time to implement it. Thanks Vernon, it works like a charm.
- Roland
I changed it to be Xymon-only, if anyone is interested. That means changing BBHOME to XYMONHOME, and so on.
#!/usr/bin/ksh
#
# PURPOSE
# This is a very simple script/test, but extremely useful.
# It detects ANY hardware failure in Sun SPARC systems by using the
# standard 'prtdiag' command. Prtdiag exits with a value of
# '1' if there is a hardware error, otherwise with an exit code of '0'.
# This way, no model/platform specific (e.g. V240/V890/15K, etc.)
# customizations are required, as the output of 'prtdiag'
# differs on most systems. It works fine on Fujitsu-Siemens systems too.
#
# Provided by: Wim Olivier, Senior Solaris/VERITAS Engineer, AL Indigo,
# Johannesburg, South Africa
#
# sunhw
#
# INSTALLATION
# 1. Copy the script to $XYMONHOME/ext/xymon-prtdiag.ksh
# 2. Add it to the $XYMONHOME/etc/clientlaunch.cfg file

TEMPFILE=/$XYMONTMP/prtdiag.OUTPUT.$$
TEST=prtdiag
COLOR="green"

if [ "$XYMONHOME" = "" ]
then
    echo "XYMONHOME is not set... exiting"
    exit 1
fi

if [ ! "$XYMONTMP" ]    # GET DEFINITIONS IF NEEDED, should never happen...
then
    # echo "*** LOADING XYMON SETTINGS ***"
    . $XYMONHOME/etc/xymonclient.cfg    # INCLUDE STANDARD DEFINITIONS
fi

# What is this doing?
PANIC="1"    # GO RED AND PAGE AT THIS LEVEL

PLATFORM=`uname -i`
/usr/platform/$PLATFORM/sbin/prtdiag -v > $TEMPFILE
RESULT=$?
#echo $RESULT

#
# DETERMINE RED/YELLOW/GREEN
#
if [ "$RESULT" -ne 0 ]
then
    COLOR="red"
fi

#
# AT THIS POINT WE HAVE OUR RESULTS. NOW WE HAVE TO SEND IT TO
# THE XYMSRV TO BE DISPLAYED...
#
MACHINE=`uname -n`

#
# THE FIRST LINE IS STATUS INFORMATION... STRUCTURE IMPORTANT!
# THE REST IS FREE-FORM - WHATEVER YOU'D LIKE TO SEND...
#
LINE="status $MACHINE.$TEST $COLOR `date`
`cat $TEMPFILE`"

$RM -f $TEMPFILE

#
# NOW USE THE XYMON COMMAND TO SEND THE DATA ACROSS
#
$XYMON $XYMSRV "$LINE"    # SEND IT TO XYMONSRV
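For the clientlaunch.cfg step in the installation notes, an entry might look roughly like the fragment below. This is a sketch, not a tested configuration: the section name, log file name and 5 minute interval are assumptions, and it assumes $XYMONCLIENTHOME points at the same directory the script's notes call $XYMONHOME.

```
[prtdiag]
	ENVFILE $XYMONCLIENTHOME/etc/xymonclient.cfg
	CMD $XYMONCLIENTHOME/ext/xymon-prtdiag.ksh
	LOGFILE $XYMONCLIENTHOME/logs/xymon-prtdiag.log
	INTERVAL 5m
```

Hardware status rarely changes from minute to minute, so a longer interval may be just as good and puts less load on the picl daemon.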
participants (3)
- everett.vernon@gmail.com
- gbegin@csc.com
- rolands@logicaltech.com.au