Hi,
When we restarted a DataMiner Agent, it got stuck in the startup menu. The SLNet.txt log shows the following:
2020-08-17 12:07:43.771|4|WaitUntilPortIsAvailable|Waiting for port 9004 to become available...
We tried unregistering and registering the DataMiner services and DLLs; however, when DataMiner and SLNet were restarted, the following message was displayed:
I cannot stop/start the SLNet service manually because the options are greyed out:
Last Friday (14/08/2020) it was possible to stop and restart the Agent.
I checked the DataMiner.xml file and compared it to previously dated files (found in the Recycle Bin) and there are no differences.
What additional steps can be done to get this Agent running again?
Thank you for the help.
I've had the same issue in the past.
The 'waiting for port 9004 to become available' log line is most likely a false positive: it is probably just the last line that could be written to SLNet.txt before SLNet crashed.
The first step would be to check slnetcrash.txt. It will most likely indicate what the issue is or where to start looking.
In my case this was because my DataMiner.xml was corrupt (missing a closing tag in some place).
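For anyone who wants to rule out a missing closing tag quickly, here is a minimal sketch in Python that only tests whether a file is well-formed XML. It does not know anything about what DataMiner expects inside the file, and the function name find_xml_error is just something I made up for the example. Pass it the content of DataMiner.xml, db.xml or any other XML file you suspect.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path


def find_xml_error(data):
    """Return None if data is well-formed XML, else a short description.

    data may be str or bytes. This checks well-formedness only
    (matching tags, valid syntax), not whether the content is valid
    for the application that reads the file.
    """
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the problem
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


if __name__ == "__main__":
    # Usage: python check_xml.py <file1.xml> <file2.xml> ...
    for name in sys.argv[1:]:
        problem = find_xml_error(Path(name).read_bytes())
        print(f"{name}: {problem or 'well-formed'}")
```

Note that the reported position is where the parser noticed the problem, which can be a few lines after the tag that is actually missing.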
It should be in the standard logging folder (C:/Skyline DataMiner/Logging).
Could not find anything. Maybe it was not due to an SLNet crash…
This occurred when I stopped and restarted DataMiner. The Agent was working fine before that, so it seems the problem arose during the restart.
Having discussed this issue with other people, the general consensus is indeed that it is triggered by a corrupt XML file.
This issue showed up while I was investigating a Central DB. I edited db.xml to deactivate the Central DB, unregistered and re-registered the services and DLLs, and restarted DataMiner and SLNet. This time the Agent came up. The Central DB has since been activated again and the Agent is working as expected.
You can check whether a process is using the port that SLNet is waiting on. My favourite way to do this is with the following PowerShell command (run it from an elevated prompt, because the -b option requires administrator rights):
> netstat -abon | findstr :9004
TCP 0.0.0.0:9004 0.0.0.0:0 LISTENING 15708
The last number here (15708) is the process ID of the process that has claimed the port. It is possible that either another process has taken the port, or that SLNet.exe itself holds it but is in the "Suspended" state.
When this happens, SLNet.exe is usually locked by one of the "WmiPrvSE.exe" processes. You can find out which one with Process Explorer.
The issue can be fixed by killing the correct WmiPrvSE.exe process.
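If you want to script the port check, here is a rough sketch in Python that pulls the owning process ID out of netstat output. One thing to be aware of is that findstr :9004 also matches ports such as 90040 and matches on the foreign address column, so the sketch compares the local port exactly. The function name listening_pids is my own, and it assumes the English "LISTENING" state text and the five-column TCP layout shown above.

```python
def listening_pids(netstat_output, port):
    """Return the set of PIDs listening on the given local TCP port.

    netstat_output is the text printed by "netstat -ano" (or -abon).
    Lines that do not look like
        TCP <local address> <foreign address> LISTENING <pid>
    are skipped, which also drops the executable names added by -b.
    """
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) != 5 or fields[0] != "TCP" or fields[3] != "LISTENING":
            continue
        # Take the text after the last colon so that IPv6 addresses
        # such as [::]:9004 are handled as well.
        local_port = fields[1].rsplit(":", 1)[-1]
        if local_port == str(port):
            pids.add(int(fields[4]))
    return pids
```

On the Agent itself you could feed it live output with subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout and then look up the returned process ID in Task Manager or Process Explorer.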
A more likely cause of the issue, however, is indeed a corrupt XML file somewhere, but this is worth checking as well.
Hi Laurens,
Thank you for the tip. If I encounter this issue again, I will be sure to do this check right away.
Hi Jelle,
Where can I find the slnetcrash.txt?
I checked the ‘CrashDump’ and ‘MiniDump’ folders, and there is nothing dated today.