Hello
I have an intermittent alarm from a device fan that I would like to mask.
The problem is that the "parameter description" text of the alarm contains a value that changes (for example: "Fan Alarm critical - RPM 15000, 16000 threshold 7500").
In these situations, DataMiner treats every change in the RPM value as a new alarm and unmasks the alarm.
Are there any strategies you would recommend for dealing with this kind of situation?
Thank you.
Best regards
Bruno Sousa
Hi Bruno,
Thanks for your question. I believe the problem here is the event that triggers the alarm. Based on the alarm description that you are retrieving, I infer the following:
- The thresholds that raise an alarm are defined in the device
- The severity of the alarm is defined in the device
When monitoring a device in DataMiner, the idea is to define the thresholds and severities in the alarm template assigned to the element. For this, it is important that the driver can retrieve the KPI values directly from the device.
I have some questions:
- Is the driver able to retrieve the KPI (Fan RPM)? If so, I would suggest enabling monitoring on this KPI (using an alarm template).
- If the device cannot provide this information directly, is it possible to receive (or parse) the event in a different way? For example, once the driver has retrieved the event, can it extract information from the description that uniquely identifies the event (in this case, Fan RPM)? If so, the best option is to store this information in a table, where the display key of the row is this unique identifier (e.g. 'Fan RPM') and another column stores the severity of the alarm. You could then easily enable monitoring on this 'severity' column (and use masking if required).
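To illustrate the parsing idea, here is a minimal Python sketch (not DataMiner driver code, which would normally live in the protocol itself). It assumes the description follows the layout "<source> Alarm <severity> - <details>", as in the example from the question. The names `parse_event`, `handle_event` and `severity_table` are hypothetical and only stand in for the driver logic and the table.

```python
import re

# Assumed layout: "<source> Alarm <severity> - <variable details>"
# (inferred from the single example in the question; other traps may differ).
EVENT_PATTERN = re.compile(
    r"^(?P<source>.+?)\s+Alarm\s+(?P<severity>\w+)\s*[-\u2013]\s*(?P<details>.*)$"
)


def parse_event(description):
    """Split a description into a stable key, a severity and the volatile details."""
    match = EVENT_PATTERN.match(description.strip())
    if match is None:
        raise ValueError(f"Unrecognized event description: {description!r}")
    return (
        match.group("source"),
        match.group("severity").lower(),
        match.group("details"),
    )


# Hypothetical stand-in for the table: display key -> severity.
severity_table = {}


def handle_event(description):
    """Update the row for this event; the changing RPM values never reach the key."""
    key, severity, _details = parse_event(description)
    severity_table[key] = severity
    return key


handle_event("Fan Alarm critical - RPM 15000, 16000 threshold 7500")
handle_event("Fan Alarm critical - RPM 15200, 16100 threshold 7500")
print(severity_table)  # {'Fan': 'critical'}
```

Both events land on the same row because the key is built only from the stable part of the text, so a monitored severity column would not change when only the RPM values change.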
Hi Bruno,
I was already under the impression that this event was coming from an SNMP trap. Could you let me know whether you are working on a generic driver, or on a driver that monitors a specific device type?
For the latter, I assume that the MIB file of the vendor defines a fixed structure when generating SNMP traps, i.e. the structure of the description is similar across the events generated.
For example, for the description that you include in your question:
“Fan Alarm critical – RPM 15000, 16000 threshold 7500”
I assume that other events from the same device will have a similar structure:
Event Severity – Extra Comment
If this is the case, we could update the driver to process this event and populate a table that will contain three columns:
– Index: (most probably we will need to add an additional value to avoid duplicate keys)
– Severity:
– Comments:
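As a rough illustration of such a table, the Python sketch below models the three columns and updates a row in place when the same event arrives again. It is not DataMiner code. The `qualifier` argument (here a made-up "slot-1" / "slot-2" value) is purely an assumption standing in for the additional value that keeps the keys unique, and the "major" severity is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class EventRow:
    index: str      # row key (event name plus the extra value)
    severity: str   # column to monitor with the alarm template
    comments: str   # volatile text, e.g. the current RPM values


def make_index(event, qualifier):
    """Build a row key from the event name plus a value that avoids duplicate keys."""
    return f"{event}/{qualifier}"


# Hypothetical stand-in for the driver table: index -> row.
event_table = {}


def upsert_event(event, qualifier, severity, comments):
    """Insert the row if it is new, otherwise overwrite its severity and comments."""
    index = make_index(event, qualifier)
    row = EventRow(index=index, severity=severity, comments=comments)
    event_table[index] = row
    return row


upsert_event("Fan Alarm", "slot-1", "critical", "RPM 15000, 16000 threshold 7500")
upsert_event("Fan Alarm", "slot-1", "critical", "RPM 15200, 16100 threshold 7500")
upsert_event("Fan Alarm", "slot-2", "major", "RPM 9000, 9100 threshold 7500")

for row in event_table.values():
    print(row.index, row.severity, row.comments)
```

With this layout the changing RPM text only ever touches the Comments column, while the Index and Severity columns stay stable for as long as the condition is unchanged.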
Hello Miguel,
It is not a generic driver. This happens with the Cisco RFGW-1D driver.
Thank you.
Best regards
Bruno Sousa
Hi Bruno, If these are short bursts in the fan speed, you can probably use hysteresis on your alarm template to prevent the alarm from being generated if it is a momentary spike. If the high RPM lasts for longer than the set hysteresis interval, the alarm will still be generated.
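Conceptually, this kind of time-based hysteresis behaves like the Python sketch below. It only illustrates the idea and is not how DataMiner implements it. The class name, the 7500 threshold and the 10-second interval are assumptions chosen for the example, and the sketch clears the alarm immediately once the value drops back below the threshold.

```python
class HysteresisAlarm:
    """Raise an alarm only if the value stays above the threshold long enough."""

    def __init__(self, threshold, hold_seconds):
        self.threshold = threshold
        self.hold_seconds = hold_seconds
        self._above_since = None  # time at which the value first exceeded the threshold
        self.active = False

    def update(self, value, now):
        """Feed one sample (value, timestamp in seconds) and return the alarm state."""
        if value > self.threshold:
            if self._above_since is None:
                self._above_since = now
            if now - self._above_since >= self.hold_seconds:
                self.active = True
        else:
            # Back below the threshold: forget the spike and clear the alarm.
            self._above_since = None
            self.active = False
        return self.active


alarm = HysteresisAlarm(threshold=7500, hold_seconds=10.0)
print(alarm.update(15000, now=0.0))   # False: spike has only just started
print(alarm.update(7000, now=3.0))    # False: spike ended before the interval elapsed
print(alarm.update(15000, now=20.0))  # False: new excursion starts
print(alarm.update(16000, now=31.0))  # True: above the threshold for 11 seconds
```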
Hello Michiel,
Thank you for the suggestion.
Unfortunately, hysteresis is not an option in this case.
BR
Bruno Sousa
Hello Miguel,
The information can only be retrieved through SNMP traps.
This trap is generic enough that it can carry more types of alarm than “Fan RPM”.
From the information in the trap, it is possible to tell that it is a “Fan” alarm.
Regarding your suggestion of another column: does it require driver customization, or is it possible to do it another way?
Thank you.
BR
Bruno Sousa