phoenix

Remote Monitor Exceptions / Acknowledgement / Silence


Hi All,

First time posting so please be nice :) .

I've recently configured some HP hardware monitoring (thanks, Joe.McCall!) and am now getting alerts for some systems, which is great.

I now need to configure the following:

1. Drive failure should alert within 5 minutes. The remote monitor is set to run every 5 minutes and is continuous. This takes care of the requirement but means ticket alerts are generated continuously. Changing the monitor to 'once' would fix that, but would introduce another issue which I am trying to solve (below).

2. The ticket should re-open once daily if the issue is still present. This would 'protect' us if a tech closed the ticket without the issue being resolved.

Technically I can address the above by creating two monitors, but then this would create two tickets for each alert.

 

As an alternative, I could 'silence' the alerts for a period of time, perhaps 24 hours by default? I'm not sure how to do this, but coming from a Zabbix/Nagios background, I assume it is pretty simple? I would like to be able to do this anyway, or at least create an exception, because another issue I have with the hardware alerts is that some will always fail. We have some HP servers with non-genuine memory, and HP SNMP reports this as a failure. How can I create an exception without removing the server from the search group the SNMP checks are applied to? It looks like exceptions are only possible with Internal Monitors and not Remote Monitors?

Any assistance would be greatly appreciated. Thank you!


It sounds like you're alerting straight to a ticket instead of calling a script. A script would be your answer for most of this. You can have an "ignore me for 24 hours" EDF on the server, and your script can read that before making a ticket. You can also have two monitors call the same script and end up with one ticket. The script can also be written to add notes to the ticket every five minutes, or to do nothing if a ticket exists that has been updated in the last 8 hours.
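To make the decision logic concrete, here is a rough sketch in Python. It is not Automate script code, and nothing in it is a real Automate API: `Ticket`, `decide_action`, `silenced_until` (standing in for the "ignore me" EDF value) and the returned action names are all made up for illustration. It only shows the order of checks such a script might perform.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Ticket:
    """Hypothetical stand-in for the existing ticket tied to this alert."""
    is_open: bool
    last_updated: datetime


def decide_action(
    now: datetime,
    silenced_until: Optional[datetime],
    ticket: Optional[Ticket],
    quiet_window: timedelta = timedelta(hours=8),
) -> str:
    """Return what the alert script should do on this monitor run."""
    # 1. The "ignore me" EDF is set and has not expired yet: do nothing.
    if silenced_until is not None and now < silenced_until:
        return "skip"
    # 2. No ticket yet for this alert: create one.
    if ticket is None:
        return "create"
    # 3. The ticket was touched recently: stay quiet to avoid ticket spam.
    if now - ticket.last_updated < quiet_window:
        return "skip"
    # 4. The problem persists and the ticket has gone stale.
    #    A closed ticket is reopened, an open one gets a fresh note.
    return "add_note" if ticket.is_open else "reopen"
```

With this shape, the monitor can keep running every 5 minutes, and passing `quiet_window=timedelta(hours=24)` would give the "re-open once daily" behaviour from the original question. Because the decision depends only on the ticket state, two monitors calling the same script would still end up on one ticket.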


On a side note, Dell returns a different result for non-genuine hardware than for hardware that has actually failed. If HP does the same, you could tailor the monitor or the script to react only to actual failures.
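As a sketch of that filtering idea, again in Python and only as an illustration: the status strings, host names and the `should_alert` helper below are hypothetical, and the values HP SNMP actually returns for non-genuine memory would need to be checked on a real server first.

```python
# Statuses treated as real failures (hypothetical labels, not HP's actual values).
ACTIONABLE_STATUSES = {"failed", "degraded"}

# Per-server exceptions: components whose alerts are known false positives,
# e.g. servers fitted with non-genuine memory. Host names are made up.
IGNORED_COMPONENTS = {
    "hp-legacy-01": {"memory"},
}


def should_alert(host: str, component: str, status: str) -> bool:
    """Return True only for real failures that are not excepted for this host."""
    # Skip components that are explicitly excepted on this server.
    if component in IGNORED_COMPONENTS.get(host, set()):
        return False
    # Otherwise alert only on statuses that mean an actual failure.
    return status in ACTIONABLE_STATUSES
```

The exception table keeps the server in the monitored group, so its drive alerts still fire while the known memory false positive is dropped.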

12 hours ago, MGreen said:

It sounds like you're alerting straight to a ticket instead of calling a script. A script would be your answer for most of this. You can have an "ignore me for 24 hours" EDF on the server, and your script can read that before making a ticket. You can also have two monitors call the same script and end up with one ticket. The script can also be written to add notes to the ticket every five minutes, or to do nothing if a ticket exists that has been updated in the last 8 hours.

Awesome. Now I just need to work out how to create the script :) That pointed me in the right direction though, so thank you!
