Good news! DataMiner 10.4.12 introduces a significant improvement in the stability and resilience of our platform, specifically addressing how we handle SLProtocol process disappearances. This update marks an important step forward in maintaining your system's uptime, reducing the impact on your operational environment when certain processes encounter issues.
Let's take a closer look at what's changing and how it benefits your system.
No more full DMA restarts
Previously, if an SLProtocol process suddenly disappeared, it would trigger a complete restart of the DataMiner Agent, affecting all elements within that DMA and often leading to significant downtime. In DataMiner 10.4.12, this behavior is updated for a more efficient response:
- Instead of restarting the entire DMA, DataMiner now automatically starts a new SLProtocol process whenever a process disappearance is detected.
- Elements previously hosted by the disappeared SLProtocol process will migrate to the new process, so no full restart is required.
- Impacted elements will automatically restart to ensure data is synchronized across SLElement, SLScripting, and other processes.
The result? Less downtime and fewer disruptions when a process fails unexpectedly.
Note that there will be a one-minute delay between the disappearance of an SLProtocol process and the creation of a new one, along with the restart of affected elements. This delay is consistent with how other processes, like SLAutomation, behave during a disappearance and is necessary to ensure that processes are not restarted while the DMA is in the process of shutting down.
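To make the sequence above concrete, here is a minimal Python sketch of the decision it describes. It is a simplified model for illustration only, not DataMiner's actual implementation: the `HostProcess` class, the `recover` function, and all parameter names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

RESTART_DELAY_SECONDS = 60  # the one-minute delay described above


@dataclass
class HostProcess:
    """Hypothetical stand-in for an SLProtocol process and its elements."""
    pid: int
    elements: List[str] = field(default_factory=list)


def recover(disappeared: HostProcess,
            seconds_since_disappearance: float,
            agent_shutting_down: bool,
            new_pid: int) -> Optional[HostProcess]:
    """Return the replacement process, or None if no recovery should happen."""
    if seconds_since_disappearance < RESTART_DELAY_SECONDS:
        return None  # still inside the delay window: do nothing yet
    if agent_shutting_down:
        return None  # the process vanished because the DMA is stopping
    # Only the elements of the disappeared process move to the new process;
    # in this model, those are the elements that would then be restarted.
    return HostProcess(pid=new_pid, elements=list(disappeared.elements))


old = HostProcess(pid=4321, elements=["Element A", "Element B"])
print(recover(old, 10, False, 9876))   # inside the delay window
print(recover(old, 60, False, 9876))   # replacement process with both elements
```

The point of waiting before acting is visible in the second check: once the delay has passed, the model can tell a genuine disappearance apart from a process that stopped because the agent itself is shutting down.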
Enhanced system visibility
To keep you informed about SLProtocol disappearances, DataMiner 10.4.12 introduces alarms that detail the process disappearance, including the Process ID (PID) and the number of affected elements. Here's an example of what the alarm will look like:
Process disappearance of SLProtocol.exe with PID <processId>; <x> elements hosted by the disappeared process have been restarted.
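If you forward alarms to your own tooling, the PID and element count can be pulled out of this text. The Python sketch below assumes the alarm value matches the example above word for word; the pattern, the `Disappearance` type, and the `parse_disappearance_alarm` function are illustrative names, not part of any DataMiner API.

```python
import re
from typing import NamedTuple, Optional

# Pattern derived from the example alarm text above. The exact wording in a
# live system may differ, so treat this as an assumption.
ALARM_PATTERN = re.compile(
    r"Process disappearance of SLProtocol\.exe with PID (?P<pid>\d+); "
    r"(?P<count>\d+) elements? hosted by the disappeared process"
)


class Disappearance(NamedTuple):
    pid: int
    restarted_elements: int


def parse_disappearance_alarm(text: str) -> Optional[Disappearance]:
    """Extract the PID and element count, or return None if the text differs."""
    match = ALARM_PATTERN.search(text)
    if match is None:
        return None
    return Disappearance(int(match.group("pid")), int(match.group("count")))


sample = ("Process disappearance of SLProtocol.exe with PID 12345; "
          "42 elements hosted by the disappeared process have been restarted.")
print(parse_disappearance_alarm(sample))
```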
For users with Proactive Support, our team will be notified of any SLProtocol disappearances and will have access to the crashdump for further analysis. If you aren't using Proactive Support, we strongly recommend contacting TechSupport when an SLProtocol disappearance occurs. They will help investigate the crash and prevent similar incidents in the future.
Increased element distribution
In addition to improved process handling, we've also raised the default number of SLProtocol instances from 5 to 10. This change reduces the number of elements each process hosts, so if a restart occurs, it affects a smaller subset of elements.
By distributing elements across more instances, a single process disappearance disrupts fewer elements and your system runs more smoothly.
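A quick calculation shows the effect. The Python sketch below assumes elements are spread evenly across the SLProtocol processes, which is a simplification, and uses a made-up total of 500 elements.

```python
import math


def max_elements_per_process(total_elements: int, process_count: int) -> int:
    """Upper bound on elements per process, assuming an even spread."""
    if process_count <= 0:
        raise ValueError("process_count must be positive")
    return math.ceil(total_elements / process_count)


total = 500  # hypothetical number of elements on one DMA
for count in (5, 10):
    affected = max_elements_per_process(total, count)
    share = affected / total
    print(f"{count} processes: up to {affected} elements restart ({share:.0%})")
```

Under these assumptions, one disappearance restarts up to 100 elements (20%) with 5 processes, and up to 50 elements (10%) with 10.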
Improved logging for greater insight
DataMiner 10.4.12 also enhances your log files with valuable information that helps track system behavior over time. Here are the new fields you'll find in the "Element in Protocol" logging:
- "NormalStart" or "SLProtocolCrashRestart": This indicates whether an element was started normally or due to an SLProtocol process disappearance.
- Number of normal starts: The number of times a user has started the element since the DMA's start.
- Number of crash restarts: The number of times the element was restarted due to an SLProtocol process disappearance.
Additionally, you can find the process ID of the new SLProtocol instance in the elementName.txt log file and the process ID of the previous instance in the elementName_BAK.txt log file.
Key benefits
In summary, here's what you'll gain from these updates:
- Minimized downtime: By avoiding a full DMA restart, only the affected elements are restarted, leading to quicker recovery.
- Detailed visibility: Enhanced logging and alarm notifications give you better insights into process failures and system recoveries.
- Improved troubleshooting: Detailed logs help you track restarts and provide more information for our TechSupport team.
We're committed to continuously improving DataMiner's resilience, and this update is a key part of that effort. If you have any questions or need further assistance, don't hesitate to reach out to our support team!