We have a cluster with 3 DMAs, and recently one of the DMAs has been suffering network hits.
When this occurs, the DMA reconnects, but after several hits it stays down and does not reconnect.
Currently, the only way to restore connectivity is to restart the DataMiner software.
Is there a cleaner way to restart the cluster synchronisation so that the full cluster can operate again?
Hi Neil,
Having your DMAs disconnect that often is not a healthy situation. What I think happens is that SLNet becomes overloaded.
Each time a DMA disconnects, it tries to reconnect to the entire cluster after a certain time, but it then needs to load the entire configuration and the caches from all the DMAs.
If this happens too often and in too quick succession, SLNet gets flooded and goes into a "protection" mode in which it stops trying to reconnect to the DMAs.
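To make that idea concrete, here is a toy Python sketch of this kind of guard. It is not SLNet's actual implementation: the class name, the threshold of 3 disconnects and the 60-second window are all assumptions made up for illustration. It only shows how a burst of disconnects could end in a state where no further reconnects are attempted, while the same number of disconnects spread over time would not.

```python
from collections import deque


class ReconnectGuard:
    """Toy model (hypothetical, not SLNet code): stop reconnecting after
    too many disconnects within a sliding time window."""

    def __init__(self, max_disconnects=3, window_seconds=60.0):
        # Both values are illustrative assumptions, not DataMiner settings.
        self.max_disconnects = max_disconnects
        self.window_seconds = window_seconds
        self._events = deque()   # timestamps of recent disconnects
        self.protected = False   # True once the guard has tripped

    def on_disconnect(self, now):
        """Record a disconnect at time `now` (in seconds).

        Returns True if a reconnect should be attempted, False otherwise.
        """
        if self.protected:
            return False
        self._events.append(now)
        # Forget disconnects that have fallen out of the sliding window.
        while self._events and now - self._events[0] > self.window_seconds:
            self._events.popleft()
        if len(self._events) > self.max_disconnects:
            # Too many disconnects in a short time: stop reconnecting.
            self.protected = True
            return False
        return True


# Five disconnects within 40 seconds: the guard trips on the fourth.
burst = ReconnectGuard()
burst_results = [burst.on_disconnect(t) for t in (0, 10, 20, 30, 40)]

# Five disconnects spread 100 seconds apart: reconnects keep being allowed.
spaced = ReconnectGuard()
spaced_results = [spaced.on_disconnect(t) for t in (0, 100, 200, 300, 400)]
```

In this model, once the guard has tripped, only an explicit reset (comparable to the software restart you mention) would allow reconnecting again, which matches the behaviour you are seeing.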
I believe my explanation in this post is also relevant to this question:
https://community.dataminer.services/question/disconnections-in-a-dma-cluster