We have active and backup DM agents at two different sites for redundancy, and we monitor all agents through the Microsoft Platform driver.
Currently, we are experiencing a link flap issue: approximately one packet in every 20 pings is dropped. On each drop, the DM agent element raises a timeout alarm.
We increased the 'Timeout of a single command' (see attached file), but with no luck.
Could you please tell us how to reduce the sensitivity?
Hi Jeyaram,
I would suggest reducing the 'Timeout of a single command' to e.g. 10,000, while increasing the 'Number of retries' to 1 or 2.
The idea is that with a high single command timeout, you wait a long time for a packet that is already lost and will never arrive. A low single command timeout combined with a couple of retries makes sure the device is pinged several times, so if only one packet is lost, the element won't go into timeout state.
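To put rough numbers on this, here is a small Python sketch. It is only an illustration under assumptions: packets are treated as lost independently at the 1-in-20 rate from the question (a flapping link may drop packets in bursts), the timeout is assumed to be in milliseconds, each retry is assumed to wait the full command timeout, and the function names are made up for this example and are not part of DataMiner.

```python
def command_failure_probability(loss_rate: float, retries: int) -> float:
    """Chance that the first attempt and every retry are all lost.

    Assumes each attempt is lost independently with the same probability.
    """
    attempts = 1 + retries
    return loss_rate ** attempts


def worst_case_wait_ms(command_timeout_ms: int, retries: int) -> int:
    """Longest time spent on one command before it is reported as failed.

    Assumes every attempt waits for the full single command timeout.
    """
    return command_timeout_ms * (1 + retries)


if __name__ == "__main__":
    loss_rate = 1 / 20  # one dropped packet in roughly every 20 pings
    for retries in (0, 1, 2):
        p = command_failure_probability(loss_rate, retries)
        wait = worst_case_wait_ms(10_000, retries)
        print(f"retries={retries}: P(command fails)={p:.6f}, "
              f"worst-case wait={wait} ms")
```

Under these assumptions, each extra retry multiplies the chance of a failed command by the loss rate, while the worst-case wait only grows linearly with the number of attempts.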
Also note that the element does not go into timeout when a single command times out. Only when the ‘single commands’ keep failing for the time specified in the value ‘the element goes into timeout state when it is not responding for …’ will the element raise a timeout alarm.
In other words, it’s recommended to keep this ‘element timeout time’ at a high value like the 120 seconds you have, and only decrease the command timeout to e.g. 10,000.
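The relationship between the two settings can be sketched as a simplified model in Python. This is not DataMiner's actual implementation: the function name, the fixed polling interval, and the way silence is accumulated are all assumptions made for illustration.

```python
from typing import Iterable


def element_in_timeout(command_results: Iterable[bool],
                       poll_interval_s: float,
                       element_timeout_s: float) -> bool:
    """Return True if the element would ever enter the timeout state.

    command_results: one boolean per polled command, True = got a response.
    The element is considered in timeout once consecutive failures cover
    at least element_timeout_s seconds (hypothetical simplification).
    """
    silent_for = 0.0
    for ok in command_results:
        if ok:
            silent_for = 0.0  # any response resets the silence counter
        else:
            silent_for += poll_interval_s
            if silent_for >= element_timeout_s:
                return True
    return False


if __name__ == "__main__":
    # One failed command in every 20, polled every 10 s, 120 s element timeout.
    flapping = ([True] * 19 + [False]) * 5
    print(element_in_timeout(flapping, 10, 120))      # isolated drops

    # Twelve consecutive failures cover the full 120 s window.
    print(element_in_timeout([False] * 12, 10, 120))  # sustained outage
```

In this model, isolated failed commands never add up to the element timeout, so only a sustained loss of communication puts the element into the timeout state.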