I have a nightly backup process that starts at 19:00 and runs about 3 hours. I need to know if it failed to start on time, ran too long, or just plain failed.
I currently invoke a BB script at the start of the backup to set a value of 'clear' with a TTL of 3 hours 15 minutes. When the backup finishes I run another BB script that sets the proper color (green, yellow, or red) with an expiration time calculated to land at 19:05 the next night.
And I alert on red or purple.
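For anyone wanting to reproduce the expiration calculation, here is a rough sketch of how the "expire at 19:05 the next night" lifetime could be computed. It is not my actual script: the function name minutes_until_deadline is made up for illustration, and it assumes the lifetime is handed to the status message as a whole number of minutes and that the server's local clock is used (no DST handling).

```python
from datetime import datetime, timedelta
import math


def minutes_until_deadline(now, hour=19, minute=5):
    """Return whole minutes from `now` until the next occurrence of hour:minute.

    Hypothetical helper: 19:05 is the backup start time plus a 5-minute
    grace period. Uses naive local time, so DST changes are ignored.
    """
    deadline = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # If today's deadline has already passed (or is exactly now),
    # the status should live until the same time tomorrow.
    if deadline <= now:
        deadline += timedelta(days=1)
    # Round up so the status never expires before the deadline.
    return math.ceil((deadline - now).total_seconds() / 60)


if __name__ == "__main__":
    # Lifetime to attach to the final green/yellow/red status.
    print(minutes_until_deadline(datetime.now()))
```

Taking the next 19:05 after the current time, rather than always adding a day, also covers a backup that overruns past midnight: the status still expires at 19:05 that same evening.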
The problem I run into with enable/disable is that setting ALL to disabled and then re-enabling later loses the custom expiration time, and 30 minutes after enabling the test we start paging because the backup hasn't run/finished.
Am I missing something? Is there a better way?
Tom Kauffman NIBCO, Inc