For elements that have redundant connections defined in the protocol, is there a way to detect when the element switches connections so that we can raise an alarm? We have an element that occasionally fails over, and we would like better visibility into when that happens. We could also consider monitoring the API endpoint with something like the Generic Network Services protocol, but detecting the connection change itself is preferred.
Thanks!
Hi Jamie, unfortunately it's currently not possible to easily detect this. Feel free to suggest a new software feature so this can be monitored.
Things that could be done:
-Check the logging of the element: whenever there's a switch, it will be logged. Downsides: log files wrap, so once a file reaches its maximum size the oldest lines are overwritten. If log lines are written too fast, you'll see "cleaned stack", so you could miss a redundancy switch log line. This makes the logging hard to rely on and keep track of, so it's really not a good solution (and the log file would need to be read out constantly).
-If it's a normal serial driver, a solution could be to trigger on timeout (to be tested whether this should be a trigger on "timeout" or on "timeout after retries"). Whenever that happens there will be a switch, so a driver could keep track of it. It's not ideal, though: if one timeout is missed for some reason, you can no longer be sure which connection is currently active.
-Create 2 groups, each with a fixed connection id, and send them regularly to see whether a connection is in timeout. Note that this doesn't tell you which connection is currently used for the communication; it's only a way to check whether one of the connections is down.
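To make the second and third ideas a bit more concrete, here is a rough sketch of the bookkeeping involved. It's plain Python for illustration only: in a real driver this logic would live in a QAction, and every name here (FailoverTracker, on_timeout, on_probe_result) is hypothetical, not part of any DataMiner API. It also assumes the element starts on the first connection and that every timeout causes exactly one switch, which is exactly the weak point mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FailoverTracker:
    """Best-effort guess at the active connection (hypothetical helper)."""

    connection_ids: List[int]   # e.g. [0, 1] for main and redundant
    active_index: int = 0       # assumption: the element starts on the first id
    switch_count: int = 0
    probe_ok: Dict[int, bool] = field(default_factory=dict)

    @property
    def presumed_active(self) -> int:
        """Connection id we *think* is in use; wrong if a timeout was missed."""
        return self.connection_ids[self.active_index]

    def on_timeout(self) -> int:
        """Call from the timeout trigger; assumes one timeout means one switch."""
        self.active_index = (self.active_index + 1) % len(self.connection_ids)
        self.switch_count += 1
        return self.presumed_active

    def on_probe_result(self, connection_id: int, ok: bool) -> None:
        """Record the result of a group sent on a fixed connection id."""
        self.probe_ok[connection_id] = ok

    def connections_down(self) -> List[int]:
        """Connection ids whose most recent probe ended in timeout."""
        return [cid for cid in self.connection_ids
                if self.probe_ok.get(cid) is False]


tracker = FailoverTracker(connection_ids=[0, 1])
tracker.on_timeout()                  # presumed switch from 0 to 1
tracker.on_probe_result(0, ok=False)  # probe group on connection 0 timed out
tracker.on_probe_result(1, ok=True)   # probe group on connection 1 answered
```

The switch counter is something you could alarm on, and the probe results tell you whether a connection is down, but the presumed active connection stays a guess.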
Bottom line: at this moment these are only attempts to keep track of when a switch happens and what the connection state is. There's no watertight solution yet that shows which connection id is currently used by the driver; for that we would need a new software feature.
Hi Jamie, if you’re adding a new feature request, then ideally the redundant connection currently in use would be exposed somewhere in the general parameters. It could be a single parameter containing the connection id that is in use. However, if we ever support double redundancy in an element (e.g. an SNMP and a serial connection, both redundant, which isn’t possible yet), the [Communication Info] table would seem better suited to contain that, as it would already be future-proof.
A new feature request is available here: https://community.dataminer.services/new-feature-suggestions/ability-to-monitor-active-connection-in-redundant-connection-setup/
Laurens… thanks for the answer and potential solutions. The root issue appears to be occasional sluggishness of the API we’re interacting with, which causes a connection shift. We’ve implemented the Generic Services Monitor to see if we can catch when the web service is lagging, as an early warning indicator. I’ll look to add a feature request for tracking the connection, as this would provide better visibility into the issue.
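For anyone wanting to try the same early-warning idea outside of DataMiner, the check boils down to something like the sketch below. This is only an illustration in plain Python, not how the Generic Services Monitor works internally; the function names, the 2000 ms threshold and the 5 s timeout are all made-up assumptions to be tuned for your own API.

```python
import time
import urllib.request
from statistics import median
from typing import List, Optional


def measure_ms(url: str, timeout_s: float = 5.0) -> Optional[float]:
    """Response time in milliseconds, or None if the request failed or timed out."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            response.read()
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return None
    return (time.perf_counter() - start) * 1000.0


def is_lagging(samples_ms: List[Optional[float]], warn_ms: float = 2000.0) -> bool:
    """True when the median of the recent samples exceeds warn_ms.

    Failed samples (None) count as infinitely slow, so a run of
    timeouts also trips the warning.
    """
    if not samples_ms:
        return False
    values = [float("inf") if s is None else s for s in samples_ms]
    return median(values) > warn_ms
```

Using the median over a short window rather than a single sample avoids raising a warning on one slow response, while a sustained slowdown (the situation that seems to precede the connection shift) still shows up quickly.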