I have a customer with Failover setups where the active and offline agents are in different geographic locations.
Each agent is set up with its own local Cassandra node.
When a network interruption occurred, we saw the notice alarms shown below in the Alarm Console (raised by the online agents), which is expected.
However, after the customer resolved the network connection issue, the notice alarms should have disappeared, but in this case they remained.
I checked the logs and did not see any failing heartbeats or synchronization issues.
When we checked the Failover status on the DMAs in the cluster, every offline agent showed the status "No responders are available for the request", which I believe points to a connectivity or availability issue with the Cassandra node.
When I checked the nodetool status for the Failover setup, every node was up and running, and there were no errors on the DMAs either.
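For reference, this is roughly the check I mean, as a minimal sketch. It assumes the text comes from Cassandra's `nodetool status` command, where each node row starts with a two-letter state such as `UN` (Up/Normal) or `DN` (Down/Normal). The function name is hypothetical.

```python
import re

# Matches node rows of `nodetool status`, e.g. "UN  10.0.0.1  256.5 KiB ...".
# First letter: U(p) or D(own); second letter: N(ormal), L(eaving), J(oining), M(oving).
ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def nodes_not_up_normal(status_output: str) -> list[tuple[str, str]]:
    """Return (state, address) for every node whose state is not 'UN'."""
    problems = []
    for line in status_output.splitlines():
        match = ROW.match(line.strip())
        if match and match.group(1) != "UN":
            problems.append((match.group(1), match.group(2)))
    return problems
```

In my case a check like this would have returned an empty list, since all nodes were reported as up.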
To clear the status, on the offline agent I had to stop the Cassandra service, stop the offline DMA, start the Cassandra service again, and then start the DMA again.
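To make the order explicit, the sequence I followed can be sketched as below. This is only an illustration: the Windows service names `cassandra` and `SLDataMiner` are assumptions and should be verified on the agent, and the helper names are hypothetical. It uses `net stop` and `net start` so that each step finishes before the next one begins.

```python
import subprocess

# Assumed Windows service names; verify them on the agent before running anything.
CASSANDRA_SERVICE = "cassandra"
DATAMINER_SERVICE = "SLDataMiner"


def restart_sequence(cassandra=CASSANDRA_SERVICE, dataminer=DATAMINER_SERVICE):
    """Build the commands in the order described: stop Cassandra, stop the DMA,
    start Cassandra, then start the DMA."""
    return [
        ["net", "stop", cassandra],
        ["net", "stop", dataminer],
        ["net", "start", cassandra],
        ["net", "start", dataminer],
    ]


def run_sequence(commands, dry_run=True):
    """Print the commands (dry run) or execute them one by one, stopping on failure."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
```

Calling `run_sequence(restart_sequence())` only prints the four commands; passing `dry_run=False` would actually run them on the offline agent.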
Strangely, after that, the notice alarms that had not cleared before were gone as well.
Can someone please help me understand why the notice alarms were only cleared after the Cassandra service and DMA restart?
Thank you for the help.