We are planning to run some experiments in our lab to see what behavior we can expect, and what we can and can't do, when databases are not available.
Inspired by the swarming option, we were wondering whether DataMiner already uses the database to store the element.xml files, or whether these will be migrated when you enable swarming.
Hi Gerwin,
The element.xml files will indeed be migrated to the database when enabling swarming.
You can find more information in the instructions to enable swarming in our docs: Enabling the Swarming feature | DataMiner Docs. At step 5 the following is mentioned: "During DataMiner startup, the existing element XML files will be moved from the disk to the database."
So, before enabling swarming, all element config is stored in the element.xml files on disk, scattered around on the DMAs in your cluster. After enabling swarming, all element config is stored in the central storage (STaaS or Cassandra cluster) and can be accessed by any DMA at any point in time. This allows you to easily start any element on any DMA, even if a DMA suddenly disappears.
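Conceptually, the startup migration does something like the sketch below. This is plain Python with a dict standing in for the central storage; the function name and folder layout are hypothetical illustrations, not the actual DataMiner implementation:

```python
import os
import tempfile

def migrate_element_configs(config_dir, central_store):
    """Move each element.xml from local disk into the central store,
    keyed by element folder name (stand-in for Cassandra/STaaS)."""
    for entry in os.scandir(config_dir):
        xml_path = os.path.join(entry.path, "element.xml")
        if entry.is_dir() and os.path.isfile(xml_path):
            with open(xml_path, encoding="utf-8") as f:
                central_store[entry.name] = f.read()
            os.remove(xml_path)  # config now lives only in the central store

# Tiny demo with a temporary directory standing in for a DMA's element folder.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "Element_1"))
    with open(os.path.join(root, "Element_1", "element.xml"), "w") as f:
        f.write("<Element><Name>Element_1</Name></Element>")
    store = {}
    migrate_element_configs(root, store)
    print(sorted(store))  # → ['Element_1']
```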
About your experiment, note that we already have "element data" in the database, even before swarming. An element has "config" (as described above): the element name, the protocol/connector, the polling IP, etc. But an element also has "element data": the parameters in the element that are saved. This element data has been stored in the database for ages already. This means that before and after swarming, the behavior should be more or less the same when the database is not available.
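To make the config/data distinction concrete, here is a deliberately simplified and hypothetical model (the real DataMiner data model is different; the field names are just examples):

```python
from dataclasses import dataclass, field

@dataclass
class ElementConfig:
    # Before swarming: stored in element.xml on the hosting DMA's disk.
    # After swarming: stored in the central storage (STaaS/Cassandra).
    name: str
    protocol: str
    polling_ip: str

@dataclass
class ElementData:
    # Saved parameter values: stored in the database regardless of swarming.
    saved_parameters: dict = field(default_factory=dict)

# Example element, with hypothetical values:
cfg = ElementConfig(name="Router 1", protocol="Generic SNMP", polling_ip="10.0.0.5")
data = ElementData(saved_parameters={"Polling Rate": 30})
```

Only the storage location of `ElementConfig` changes when swarming is enabled; `ElementData` was already in the database.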
This is what I'm expecting when the database is not available:
- DataMiner keeps on running and will generate an alarm indicating that the storage is not available.
- Running elements keep on running. You can still check the real-time values, and you can still make changes to the parameters (sets).
- You won't be able to fetch trend data or alarm history, because the storage is not available.
- All alarms and trend data generated while the storage is down are buffered by DataMiner and pushed to the storage once it becomes available again.
- You cannot start or restart an element during the outage, because this requires information from the storage. Before swarming, the element data needed to be fetched from the database; with swarming enabled, the element config also needs to be retrieved from the storage.
- A full DataMiner restart is obviously also not supported when the storage is down.
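The buffer-and-flush behavior described above can be sketched in a few lines. This is a minimal illustration of the pattern, not DataMiner code; the class names and the `available`/`write` interface are hypothetical:

```python
from collections import deque

class BufferedStorageWriter:
    """While storage is down, records are queued locally; once storage
    is back, the backlog is pushed in order before any new record."""

    def __init__(self, storage):
        self.storage = storage          # object with .available and .write(record)
        self.buffer = deque()

    def push(self, record):
        if self.storage.available:
            self.flush()                # drain any backlog first
            self.storage.write(record)
        else:
            self.buffer.append(record)  # keep alarms/trend points locally

    def flush(self):
        while self.buffer and self.storage.available:
            self.storage.write(self.buffer.popleft())

class FakeStorage:
    def __init__(self):
        self.available = True
        self.records = []
    def write(self, record):
        self.records.append(record)

storage = FakeStorage()
writer = BufferedStorageWriter(storage)
writer.push("alarm 1")
storage.available = False
writer.push("alarm 2")       # buffered: storage is down
storage.available = True
writer.push("alarm 3")       # backlog flushed first, then the new record
print(storage.records)       # → ['alarm 1', 'alarm 2', 'alarm 3']
```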
Let us know if you have any further questions, and don't hesitate to reach out while experimenting with swarming!
PS: you can also find an FAQ about swarming here: Frequently asked questions | DataMiner Docs
Bert.