Hi all,
I'm exploring the new swarming feature. Is there any way to get a metric of how long the elements will be unavailable during the process?
Swarming | DataMiner Docs
Is it possible to make any estimations?
Here's a diagram which should make Bert's explanation clearer. Posting it here as I cannot add pictures to a comment.
When talking about timing in the context of swarming, it's important to distinguish the time for one specific element to swarm and the timing for a group of elements to swarm.
Since bulk swarming will be the main use case, the focus is there. Element startup time is not relevant for a swarm action, so it's excluded: once the element is starting up, the relevant agents can already coordinate the next swarming element.
Hi Edson,
Some metrics for this are available in the docs: https://docs.dataminer.services/user-guide/Reference/Metrics/swarming_elements_benchmarks.html
Is this what you were looking for?

It may also be useful to compare swarming time to element restart time, because a swarming action is basically an element restart with some extra overhead for transferring control. If an element takes a long time to restart, it will also take a long time to swarm.
Swarming one element would take about as much time as restarting one element.
The time for swarming multiple elements at once depends on how those elements get spread across multiple agents (more destination agents => faster availability). Even when swarming to one destination agent there is a certain degree of parallelism when starting up elements after the swarm.
Let me try to translate those metrics into human readable language:
The swarm action itself, for one element, takes less than 200 ms (154 ms). But then the element still needs to perform its regular startup, and that depends on the connector. Most connectors start within a few seconds. After that, it might take some more time before all the data is retrieved again, so you have to take that into account as well!
Now, if you swarm 100 elements in one go, it depends on how many DMAs you swarm them to. Each DMA can handle 10 elements concurrently. So if you swarm 100 elements to 1 DMA, it will take 10 times the time it takes for 1 element. Assuming one element takes 2 seconds, that would be 20 seconds in total. If you have 2 DMAs handling those 100 elements, you would only need 10 seconds.
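For anyone who wants to plug in their own numbers, here is a rough sketch of that arithmetic in Python. It is only a back-of-the-envelope model, not something DataMiner provides: the function name and parameters are made up for illustration, it assumes the elements are spread evenly over the destination DMAs, and it assumes each DMA works through its elements in batches of 10 that each take as long as one element.

```python
import math


def estimate_bulk_swarm_seconds(total_elements, dma_count,
                                seconds_per_element=2.0,
                                concurrency_per_dma=10):
    """Rough estimate of how long a bulk swarm takes (hypothetical helper).

    Assumptions (simplifications, not measured behavior):
    - elements are spread evenly over the destination DMAs
    - each DMA handles `concurrency_per_dma` elements at the same time
    - every batch takes as long as one element (`seconds_per_element`)
    """
    if total_elements < 0 or dma_count < 1 or concurrency_per_dma < 1:
        raise ValueError("counts must be positive")

    # The DMA that receives the most elements determines the total time.
    elements_on_busiest_dma = math.ceil(total_elements / dma_count)

    # Number of sequential batches that DMA has to work through.
    batches = math.ceil(elements_on_busiest_dma / concurrency_per_dma)

    return batches * seconds_per_element


# Example: 100 elements, 2 seconds per element, varying number of DMAs.
for dmas in (1, 2, 5):
    estimate = estimate_bulk_swarm_seconds(100, dmas)
    print(f"{dmas} DMA(s): about {estimate:.0f} s")
```

With the numbers from above, this gives 20 seconds for 1 DMA and 10 seconds for 2 DMAs. Replace `seconds_per_element` with the restart time you actually see for your connector to get a figure closer to your own system.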
In other words, in a swarming world, you want more DMAs and fewer elements per DMA in order to make your system more resilient.
PS: Not sure if my human readable language is that human readable after all 😉