In an HTTP protocol, we're implementing a feature to reboot, restore, or upgrade a device.
What would be the best way to make sure all data gets repolled as soon as the device comes back online, so the user doesn't have to wait for the timers to fire?
Suggestions:
Refresh Groups Implementation
Set a flag in the driver when the reboot/restore/upgrade command is pushed to the device. Check the flag when the device comes back online, and trigger a full repoll of the device if it is set.
The drawback is that every group added in the future also has to be added to this mechanism.
Also, while the device is offline, the timers keep putting groups on the stack, and all of those will go into timeout until the device is back online.
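To make the flag idea concrete, here is a minimal Python sketch of the logic only. It is not DataMiner code: the class name RepollTracker, its methods, and the group IDs are all hypothetical. It also assumes the groups can be kept in one registry, which would soften the maintenance concern, and that may not map directly onto a real protocol.

```python
# Hypothetical model of the "flag" approach; not a DataMiner API.
class RepollTracker:
    COMMANDS = {"reboot", "restore", "upgrade"}

    def __init__(self, group_ids):
        # Single registry of groups; new groups only need to be added here.
        self.group_ids = list(group_ids)
        self.repoll_pending = False

    def on_command_sent(self, command):
        # Raise the flag when a disruptive command is pushed to the device.
        if command in self.COMMANDS:
            self.repoll_pending = True

    def on_device_online(self):
        # Return the groups to poll right now; clear the flag afterwards.
        if not self.repoll_pending:
            return []
        self.repoll_pending = False
        return list(self.group_ids)


tracker = RepollTracker([1000, 2000])
tracker.on_command_sent("reboot")
print(tracker.on_device_online())  # all registered groups
print(tracker.on_device_online())  # nothing, the flag was cleared
```

The sketch does nothing about the second drawback: the regular timers would still keep queueing groups while the device is down.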
Recover timer
Add an extra timer thread (stopped by default) that gets activated when the reboot/restore/upgrade command has been pushed to the device. Activating it also stops the normal (fast/medium/slow) timers. As soon as the 'recover' timer can successfully poll a specific parameter again (system information, etc.), we know the device is back online, so we stop the recover timer and start the normal ones.
This makes sure that all groups added in the future get polled as well, which means less maintenance and fewer groups going into timeout. The disadvantage is that it needs a start/stop timer implementation.
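The start/stop behavior described for the recover timer boils down to a small state machine. Below is a rough Python sketch of that switching logic, under the assumption that exactly one set of timers runs at a time. The class TimerController and its method names are made up for illustration and are not part of any DataMiner API.

```python
# Hypothetical model of the "recover timer" switching logic.
class TimerController:
    NORMAL_TIMERS = ("fast", "medium", "slow")

    def __init__(self):
        # Normal operation: regular timers run, recover timer is stopped.
        self.running = set(self.NORMAL_TIMERS)

    def on_command_sent(self):
        # Reboot/restore/upgrade pushed: only the recover timer keeps running.
        self.running = {"recover"}

    def on_recover_poll(self, success):
        # A successful poll of the probe parameter means the device is back.
        if "recover" in self.running and success:
            self.running = set(self.NORMAL_TIMERS)


ctrl = TimerController()
ctrl.on_command_sent()
ctrl.on_recover_poll(success=False)  # device still rebooting
print(sorted(ctrl.running))
ctrl.on_recover_poll(success=True)   # device answered again
print(sorted(ctrl.running))
```

Whether the restarted timers fire immediately or only after their interval depends on how they are restarted, so that would still need to be checked in a real driver.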
Or is there another way?
Hi,
That "recover timer" suggestion sounds like the default ping group implementation.
The easiest solution would seem to be to define a ping group in the driver that is not used in a timer, and to activate slow poll on the element.
Whenever the device goes into timeout (e.g. due to the reboot), the ping group will be executed and other polling will be halted by default. What the regular timers do after a slow poll would have to be tested, but if the device communicates again and the timers are not executed right away, you can trigger on this response and execute the "restart timer" action. With reschedule="true", the timer (content) will then start again immediately.
<Action id="101">
    <Name>Restart with reschedule</Name>
    <On id="1">timer</On>
    <Type reschedule="true">restart timer</Type>
</Action>
I did a small test to see the behavior of the normal timer groups during and after a slow poll (tested with DataMiner 10.1.6.0):
A slow poll prevents the timer groups from being executed. However, the groups do seem to be added to the queue (without duplicates), and the queue gets executed when the device goes out of timeout.
In other words, suppose that there is a timeout with slow poll and there is a 1 hour timer with group “1000”. This timer gets executed at 01:05, 02:05,…
- If the timer is due during the slow poll, then group “1000” will be executed as soon as the device goes out of timeout. E.g. when the slow poll lasted from 02:00 to 02:15, group “1000” will be executed at 02:15, because the timer should have fired at 02:05. Execution times will be 01:05, 02:15, 03:05, 04:05,…
- If the timer is not due during the slow poll, then group “1000” will not be executed when the device communicates again. E.g. when the slow poll lasted from 01:30 to 01:45, group “1000” will not be executed at 01:45, but at its normal times of 01:05, 02:05, 03:05, 04:05,…
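The two cases above can be reproduced with a small Python simulation. This only models the behavior observed in the test, with times expressed as minutes since midnight. The function names are hypothetical, and it assumes a single slow poll window and that a missed group is queued at most once.

```python
# Model of the observed behavior: a group whose timer is due during a slow
# poll runs once when the slow poll ends; otherwise the schedule is unchanged.
def execution_times(ticks, slow_start, slow_end):
    # Ticks that fall inside the slow poll window are not executed on time.
    missed = [t for t in ticks if slow_start <= t < slow_end]
    on_time = [t for t in ticks if not (slow_start <= t < slow_end)]
    # The missed group is queued once and runs when the device is back.
    catch_up = [slow_end] if missed else []
    return sorted(set(on_time + catch_up))


def hhmm(minutes):
    # Format minutes since midnight as HH:MM.
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


hourly = [65, 125, 185, 245]  # 01:05, 02:05, 03:05, 04:05

# Slow poll from 02:00 to 02:15: the 02:05 run moves to 02:15.
print([hhmm(t) for t in execution_times(hourly, 120, 135)])
# → ['01:05', '02:15', '03:05', '04:05']

# Slow poll from 01:30 to 01:45: no tick was due, nothing changes.
print([hhmm(t) for t in execution_times(hourly, 90, 105)])
# → ['01:05', '02:05', '03:05', '04:05']
```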
So if you want all groups to be executed as soon as the device communicates again, in order to refresh all data immediately, you'll need either an action that restarts and reschedules all timers (a future developer will have to adapt it when adding a new timer) or an action that executes all groups (a future developer will have to adapt it when adding a new group).