Hi,
Skyline squad Unit-X has had a cluster of 2 DataMiner Agents for a few years.
Recently, our cluster has been behaving strangely: one of the agents (agent 2) shows as disconnected.
However, it connects for a brief moment when it syncs with the other agent (agent 1).
After the sync, it is disconnected again. Sometimes, agent 2 gets the state Refused when connected to agent 1. When connected to agent 2, everything looks normal according to the agent states.
- I have checked the XML files (DMS, MaintenanceSettings, logging, ...): nothing weird or invalid found.
- I restarted the agents a few times. That didn't help either.
Could someone explain what is going on, or contact me to assist me further?
IMO, the gathered information is too sensitive to post here, so DM me for a link.
Thanks in advance
This can happen when an SLNet call fails. We have seen some issues related to this in the early 10.2 releases; however, they should all be fixed in later versions.
In case something similar happens again, you can check the SLNet.txt log files on both agents to see which call/KPI is causing the disconnect (e.g. a callback timeout), and take a LogCollector package with an SLNet dump so the issue can be looked into in more detail.
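If the log files are large, a small script can help narrow down where to look. The following is only a rough sketch in Python: the search terms are my own guesses (not an official list of SLNet messages), and the function names are hypothetical. It simply returns every line that contains one of the terms, together with its line number.

```python
from typing import Iterable, List, Tuple

# Hypothetical search terms; adjust them to whatever your SLNet.txt actually logs.
KEYWORDS = ("timeout", "disconnect", "refused")


def find_suspect_lines(
    lines: Iterable[str], keywords: Iterable[str] = KEYWORDS
) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing one of the keywords."""
    lowered = [k.lower() for k in keywords]
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        # Case-insensitive substring match, no assumption about the log format.
        if any(k in text.lower() for k in lowered):
            hits.append((number, text))
    return hits


def scan_file(path: str, keywords: Iterable[str] = KEYWORDS) -> List[Tuple[int, str]]:
    """Scan one log file, replacing bytes that cannot be decoded."""
    with open(path, "r", errors="replace") as handle:
        return find_suspect_lines(handle, keywords)
```

Run `scan_file` on a copy of SLNet.txt from each agent and compare the timestamps of the hits around the moment agent 2 drops out.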
Did you check the IIS bindings? Enabling the HTTP binding can do the trick for you for the time being (I believe this is due to some structural changes on the IT management side, which now relies solely on HTTPS).
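A quick way to see from one agent whether the other answers on the web ports at all is a plain TCP check. This is just a sketch in Python, assuming the default ports 80 (HTTP) and 443 (HTTPS); the function names are hypothetical. Note that it only tells you whether something is listening, not whether the IIS binding itself is configured correctly.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_web_ports(host: str, ports=(80, 443)) -> dict:
    """Map each port (assumed defaults: 80 for HTTP, 443 for HTTPS) to its reachability."""
    return {port: is_port_open(host, port) for port in ports}
```

Calling `check_web_ports` with the hostname or IP of agent 1 from agent 2 (and the other way around) shows whether only HTTPS is reachable between the two.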
Let me know if that helps.