Is it possible to have an alarm template where an alarm is generated if a parameter does not match another parameter?
Use case:
We have a server with 2 NICs. The TX data rates of the two NICs should be (nearly) identical at all times. When we experienced an SFP fault, the data rate of one of the NICs dropped off before failing completely. Once it has completely failed, a separate "disconnected" alarm is raised. But if we could compare the 2 parameters, this would give us an earlier warning that there was an issue.
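To make the idea concrete, here is a minimal Python sketch of the comparison described above. It is not a DataMiner feature or API: the function name `mismatch_alarms`, the 10% tolerance, and the "3 samples in a row" rule are all assumptions chosen for illustration, and the input is assumed to be two equally sampled series of TX rates (for example, from exported trend data).

```python
from typing import Iterable, List


def mismatch_alarms(
    red_tx: Iterable[float],
    blue_tx: Iterable[float],
    rel_tolerance: float = 0.10,   # assumed threshold: 10% relative difference
    min_consecutive: int = 3,      # assumed debounce: 3 samples in a row
) -> List[int]:
    """Return the sample indices at which a mismatch alarm would be raised.

    An alarm is raised once the two rates have differed by more than
    rel_tolerance (relative to the larger of the two) for min_consecutive
    consecutive samples. It is raised once per mismatch episode.
    """
    alarms = []
    streak = 0
    for i, (red, blue) in enumerate(zip(red_tx, blue_tx)):
        largest = max(abs(red), abs(blue))
        # Two idle interfaces (both at 0) count as matching.
        differs = largest > 0 and abs(red - blue) / largest > rel_tolerance
        streak = streak + 1 if differs else 0
        if streak == min_consecutive:
            alarms.append(i)
    return alarms


# Hypothetical data: the red NIC degrades while the blue NIC stays steady.
red = [1000, 1000, 990, 800, 600, 400, 0, 0]
blue = [1000] * 8
print(mismatch_alarms(red, blue))  # [5]
```

In this made-up example the alarm fires at sample 5, while the red rate is still dropping and before it reaches 0. Requiring several consecutive mismatching samples is one way to avoid alarms on short, harmless differences between the two rates.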
Hello Dave,
Thanks for proposing this use case! It sounds interesting and your idea fits perfectly in what we’re aiming at within the data insights team. We’ve recently developed an algorithm that learns relations between parameters and we are investigating how to leverage this to do the kind of multivariate anomaly detection that you describe.
I’m sorry to say that this functionality is not yet available in the software. Given you’ve taken the time to reach out, I’m assuming you’ve recently encountered such an issue. We’d love to have a look at your data to see if we can immediately help you in another way, maybe through some smart baseline or other already implemented alarm features. If you can send us a screenshot or, even better, the (RT and averaged) data to ai@skyline.be, I can guarantee that we’ll come back with more feedback within 24h. (FYI, fetching the trend data is done by right-clicking on the trend graph and hitting “export to csv…”.)
Good luck and hoping to hear from you,
Dennis
Hi Dave,
to potentially help out the wider community, let me add a few words here as well!
First of all, thanks again for the interesting use case and the additional context you provided. As we communicated earlier, there is no generic multivariate anomaly detection tool available in our software, but your message motivates us to pursue this research further. If you have other use cases that interest you, don’t hesitate to bring them up so we can add them to our road map.
With respect to the concrete problem you describe: you mention that one SFP drops off to 0 kBps. As of 10.2.5 (main release 10.3), we do have a “Flatline” detection feature available (see https://community.dataminer.services/lets-talk-ai-automatic-detection-of-frozen-states/). As explained in that blog post, this feature can be used to trigger alarms whenever a parameter suddenly stops fluctuating. I hope this feature would have given you the early warning you were looking for in this case.
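For readers who want a feel for what "suddenly stops fluctuating" means, below is a rough Python sketch of a generic flatline check. It is not the algorithm used by the DataMiner feature, which is described in the blog post linked above. The function name `flatline_alarms`, the window of 5 samples, and the `epsilon` threshold are assumptions made purely for illustration.

```python
from statistics import pstdev
from typing import List, Sequence


def flatline_alarms(
    values: Sequence[float],
    window: int = 5,         # assumed window length, in samples
    epsilon: float = 1e-9,   # assumed threshold for "no fluctuation"
) -> List[int]:
    """Return the indices at which a fluctuating series becomes flat.

    A window is "flat" when the standard deviation of its samples is at
    most epsilon. An alarm is raised at the last sample of the first flat
    window that follows a non-flat one.
    """
    alarms = []
    was_flat = True  # a series that starts out flat does not raise an alarm
    for end in range(window, len(values) + 1):
        is_flat = pstdev(values[end - window:end]) <= epsilon
        if is_flat and not was_flat:
            alarms.append(end - 1)
        was_flat = is_flat
    return alarms


# Hypothetical TX rate that fluctuates and then freezes at 0.
tx = [10, 12, 11, 13, 12, 0, 0, 0, 0, 0, 0]
print(flatline_alarms(tx))  # [9]
```

Note that a check like this only reacts once the value has been constant for a full window, so it complements rather than replaces a comparison between the two NICs.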
If you have any questions or remarks, don’t hesitate to reach out!
All the best,
Dennis
Hi Dennis,
I’ll send further details via email as requested, but for the wider community I’ll add some more context.
The server in question is playing 2022-7 video, outputting to a red and a blue video network. As these NICs are outputting 2022-7 streams, their TX bandwidth is nearly identical. When we experienced an SFP failure, the red TX rate dropped off, but the blue TX rate remained steady.
We already had the network adapter status in the alarm template, so once the SFP had actually failed, it appeared in the alarm console. But when you compare the TX speeds on the trend graph, you can see one SFP having dropped to 0 kBps while the working one carries on.
Dave