Hi Dojo,
Are there any recommended troubleshooting steps to follow when the following error type is shown in the console?
Also, if the condition clears, will the RTE clear automatically, or is any action needed?
Finally, what is the impact on the cluster when a DMA has this error?
Thanks
Hi,
The RTE refers to the NotificationThread of the SLDMS process. That thread is responsible for processing the notification messages to be executed (for an overview of the different message types, see here).
Because this thread typically interacts with other processes when executing a notification message, it is very likely that another process is stuck (the root cause) and that the SLDMS process is waiting for that process to continue.
If the condition behind the RTE clears (i.e. the other process is no longer stuck), the RTE alarm will clear automatically.
The impact of this error is severe. Because there are various notification types, it means that modified files (e.g. Views) will no longer be synchronized, traps will no longer be distributed, and so on.
It is best to take a logcollector package that includes a memory dump of the SLDMS process, as well as of other processes such as SLDataMiner, SLProtocol, SLScripting, SLSNMPManager, SLPort, ... The SLDMS memory dump shows which other process it is waiting on, and if a memory dump of that process is also included, it shows what that process is stuck on. That is why so many other processes need to be included in the logcollector package: it is not known beforehand which process is involved.
Unfortunately, it is not possible to analyze a memory dump yourself; tech support will have to be contacted for further analysis.
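To illustrate the point about including several processes, here is a minimal Python sketch that picks, from a list of running process names, the ones mentioned above so that none is forgotten when the package is taken. The names `DUMP_CANDIDATES` and `processes_to_dump` are hypothetical and not part of any DataMiner tooling. The candidate list contains only the processes named in this answer (the original list is open-ended), and the sketch does not take any memory dumps itself.

```python
# Processes named in the answer above. The original list ends with "...",
# so this is deliberately NOT exhaustive (assumption: extend as needed).
DUMP_CANDIDATES = [
    "SLDMS",          # the process whose NotificationThread reports the RTE
    "SLDataMiner",
    "SLProtocol",
    "SLScripting",
    "SLSNMPManager",
    "SLPort",
]


def _normalize(name):
    """Lower-case a process name and drop a trailing '.exe', if any."""
    name = name.strip().lower()
    return name[:-4] if name.endswith(".exe") else name


def processes_to_dump(running_names):
    """Return the candidate processes that are currently running.

    `running_names` is any iterable of process names (hypothetical input,
    e.g. copied from Task Manager). The result keeps the order of
    DUMP_CANDIDATES, so SLDMS comes first when it is running.
    """
    running = {_normalize(n) for n in running_names}
    return [p for p in DUMP_CANDIDATES if p.lower() in running]


if __name__ == "__main__":
    print(processes_to_dump(["SLDMS.exe", "SLProtocol.exe", "explorer.exe"]))
```

With the sample input shown, only SLDMS and SLProtocol are selected, because the other candidates are not in the list of running processes.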
Regards,
Many thanks for your feedback and the list of the different message types, Laurens. We’ll proceed as advised and mark the question as solved.