With SRM, we get a lot of alarm updates because the view/service impact changes at the start and end of bookings. As a result, we receive many notices that the alarm history exceeds 100 alarms. We have a Cassandra and Elastic cluster DB, so the alarms are now stored in Elastic, and we use the default threshold for alarms per parameter. Now that we use Elastic, is it safe to set the 'recurring' attribute to false to get rid of all those notices, or are such large alarm trees still a problem?
Hi Alberto,
Information events and alarms are very similar. The main difference is that information events mostly don’t have a tree and are just single events. There is a lot of value in having the updates in the alarm tree: by retrieving the history of the alarm, we can track which views/services were impacted at what time. The problem with large alarm trees goes back to when alarms were still stored in Cassandra, where large alarm trees had a negative impact on system performance. Now that alarms are stored in Elastic, I’m wondering whether large alarm trees are still a problem. If they are no longer a problem, we could disable the notifications that warn us about this.
Hi Michiel,
Not a direct answer to your question, but have you already looked at the alarm squashing feature? It allows you to group consecutive alarm events without a severity change into one consolidated event.
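To illustrate the idea only (this is not DataMiner's actual implementation), grouping consecutive events with an unchanged severity could be sketched in Python roughly as follows. The `AlarmEvent` class, its fields, and the `squash` function are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class AlarmEvent:
    # All fields are hypothetical and only serve this illustration.
    time: int       # timestamp of the update, e.g. seconds since booking start
    severity: str   # e.g. "Minor", "Major"
    impact: str     # description of the view/service impact at that time


def squash(events):
    """Consolidate runs of consecutive events that share the same severity."""
    squashed = []
    # groupby only groups adjacent items, so a severity that returns later
    # starts a new consolidated entry instead of merging with the earlier run.
    for severity, run in groupby(events, key=lambda e: e.severity):
        run = list(run)
        squashed.append({
            "severity": severity,
            "first_time": run[0].time,
            "last_time": run[-1].time,
            "count": len(run),
        })
    return squashed


if __name__ == "__main__":
    updates = [
        AlarmEvent(0, "Major", "Service A impacted"),
        AlarmEvent(10, "Major", "Service A and B impacted"),
        AlarmEvent(20, "Minor", "Service B impacted"),
        AlarmEvent(30, "Major", "Service A impacted"),
    ]
    for entry in squash(updates):
        print(entry)
```

In this sketch the individual updates are still available in the input list. Only the presented list gets shorter, which is the trade-off to keep in mind when the per-update impact history matters.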
AFAIK, the alarm squashing feature only reduces the load on SLNet and Cube and still keeps all entries in the DB. So the question remains whether we can set the ‘recurring’ attribute to false, with or without the squashing feature.
If this is normal within the standard cycle of the booking, would it work to log this as an information event rather than an alarm?