Hello,
I am progressing with the alarm setup and am now facing a situation that I need to avoid. I am trying to use Alarm Storm Prevention. There are multiple units sending out the same alarm.
The filter is already created, so I get a lot of identical alarms from different devices. All good so far.
To use alarm storm prevention on email, I navigated to System Center -> System Settings -> notifications alarm storm prevention.
Here I checked the email option and set the following values:
This would mean that if I get the same alarm from multiple units within 15 seconds, there will be a storm alarm email.
With these settings, the first 5 alarms arrive in my inbox. Then, because they don't stop and the threshold has been exceeded, alarm storm protection kicks in and sends a mail stating that alarm storm prevention is active, etc. All good.
Next step: the alarms keep coming in high numbers, so the alarm storm protection is working. Then, after 10 seconds, no alarm comes in. Assuming that the system is still within the 15 seconds, the protection is active, so I won't get bulk alarms.
In my real scenario, I see the following: once the time expires and the alarm protection is no longer active, I get 5 alarms and then the alarm protection message. This happens over and over, as I have periods where I am getting a lot of alarms. This is not something I can predict.
For example, I got my last storm at 13:00:00. Then, for the next 15 seconds, all good. At 13:01:00 I get stormed again, at 13:04:00 again, etc.
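To make the pattern concrete, here is a rough Python sketch of how I understand the behaviour. It is only a toy model, not DataMiner's actual logic. The function and variable names are made up for illustration. The threshold of 5 and the 15-second window come from what I observe, while the rule that protection ends after a quiet gap longer than the window, the burst size, the spacing and the date are all assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta


def simulate_emails(alarm_times, threshold=5, window=timedelta(seconds=15)):
    """Toy model: return (time, kind) for every email that would be sent.

    Assumption: protection switches off once no alarm has arrived
    for longer than `window`; the real product may behave differently.
    """
    emails = []
    recent = []        # alarms already mailed inside the current window
    in_storm = False
    last_alarm = None
    for t in sorted(alarm_times):
        # A quiet gap longer than the window ends the protection (assumed rule).
        if in_storm and t - last_alarm > window:
            in_storm = False
            recent = []
        last_alarm = t
        if in_storm:
            continue   # alarm suppressed, no email
        recent = [r for r in recent if t - r < window]
        if len(recent) < threshold:
            recent.append(t)
            emails.append((t, "alarm"))
        else:
            in_storm = True
            emails.append((t, "storm"))
    return emails


def burst(start, count=20, spacing=timedelta(milliseconds=500)):
    """Hypothetical burst: `count` alarms, evenly spaced, starting at `start`."""
    return [start + i * spacing for i in range(count)]


day = datetime(2024, 1, 1)  # arbitrary date, only the times matter
alarms = (
    burst(day.replace(hour=13, minute=0))
    + burst(day.replace(hour=13, minute=1))
    + burst(day.replace(hour=13, minute=4))
)
counts = Counter(kind for _, kind in simulate_emails(alarms))
print(counts["alarm"], counts["storm"])
```

Under these assumptions every burst costs five alarm mails plus one storm notice, because the quiet period between bursts is longer than the window and the protection resets each time.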
How can I set up alarm storm protection properly so that I am covered most of the time?
There is another option that I think I can use, but I do not know how to set it up correctly.
How about this option?
Would this be more helpful? How should I set it correctly so that I get storm prevention that covers a period of time?
I would set start delaying to 10, stop delaying to 5 and time range to 5 minutes. Would that be efficient?
There is also a 3rd option involving alarm storm protection on shared filters (System Center -> Users -> Marian -> Alerts -> Alarms storm prevention).
But this seems related to the 1st case above. There is no actual way to tell the alarm storm prevention to look at a period of time.
What do you advise? How should I proceed to get an optimal "Alarm storm prevention" without getting tons of emails?
Thanks!
Marian
Not sure how exactly you configured the alerting by email, but I'm assuming you set an alert for every alarm matching a certain filter (as described here). As an alternative, you can try to set up alerting using a correlation rule (via Apps -> Correlation -> Add rule). You can set the exact same filter as you had before under 'alarm filter', and under 'actions' select the action 'send email'. The advantage of configuring it this way is that you only get one alarm for the entire group of alarms that matches the filter.
The second option you showed is about the alarm storm mode in Cube. This only has implications for the Cube interface and will not have any effect on any other part of the DataMiner system, such as alerting by email. The third option (System Center -> Users -> User Name -> Alerts -> Alarm storm prevention) is indeed similar to the first. The only difference is that it applies per user instead of system-wide.
Hello,
I'll check the methods you guys indicated. Hopefully I'll get it right. I'll be back if I run into trouble.
Thank you!
A possible data-driven approach could be to gather info on the average number of alarms expected during major incidents. If you have this type of KPI, you could fine-tune the Cube settings around it and take it from there.