Hello Dojo,
Are there any requirements or specific guidelines to take into account when re-using an existing database cluster (Cassandra cluster and Elasticsearch/OpenSearch cluster) for a new DataMiner system?
The situation is that one existing DMS is using a database cluster consisting of a 3-node Cassandra cluster and a 3-node Elasticsearch cluster. Now a separate DMS (consisting of one Failover pair) will be re-using the same self-managed database cluster as its database.
Are there any prerequisites for the configuration of the new DMS to make sure the existing database cluster can be re-used in a safe and robust way? Does the new DMS need a different cluster name? I am thinking in terms of keyspace names in Cassandra, or similar objects in Elasticsearch or OpenSearch.
Thanks!
Hi Koen,
Re-using an existing database is perfectly viable. The main thing to keep in mind is that the other DMS must indeed use a different keyspace prefix (the "Keyspace prefix" box in Cube, described on the docs page here, translates to the DB tag in DB.xml, for both DB types).
As long as those two values differ between the DataMiner clusters, the data of each cluster won't interfere with that of the other.
The other important thing to keep in mind is database performance and resources: with more agents reading from and writing to the same database, there is a higher risk of one DataMiner cluster impacting the performance of the other. With each database type having 3 nodes, they should be able to handle the smaller clusters here, but it is something to test and keep evaluating as the load on the clusters, or the clusters themselves, grow.
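As a rough way to double-check this before pointing the new DMS at the shared database, the prefixes of both systems could be compared with a small script. This is only a sketch: the XML fragments below are simplified, hypothetical stand-ins for DB.xml (the real file contains more settings and its exact layout may differ), and the helper names are made up for illustration. The only thing taken from the above is that the prefix lives in the DB tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified DB.xml fragments for two DataMiner systems.
# Real DB.xml files contain more settings; only the <DB> tag matters here.
DMS_A = """<DataBases>
  <DataBase active="true" type="CassandraCluster"><DB>dmsa</DB></DataBase>
  <DataBase active="true" search="true" type="ElasticSearch"><DB>dmsa</DB></DataBase>
</DataBases>"""

DMS_B = """<DataBases>
  <DataBase active="true" type="CassandraCluster"><DB>dmsb</DB></DataBase>
  <DataBase active="true" search="true" type="ElasticSearch"><DB>dmsb</DB></DataBase>
</DataBases>"""


def local_name(tag):
    """Strip an XML namespace such as '{http://...}DB' down to 'DB'."""
    return tag.rsplit("}", 1)[-1]


def db_prefixes(xml_text):
    """Collect the text of every <DB> element found in the document."""
    root = ET.fromstring(xml_text)
    prefixes = set()
    for element in root.iter():
        if local_name(element.tag) == "DB" and element.text and element.text.strip():
            # Compare case-insensitively as a conservative assumption:
            # prefixes differing only by case are treated as a clash.
            prefixes.add(element.text.strip().lower())
    return prefixes


def shared_prefixes(xml_a, xml_b):
    """Return the prefixes that both systems have in common (should be empty)."""
    return db_prefixes(xml_a) & db_prefixes(xml_b)


if __name__ == "__main__":
    clash = shared_prefixes(DMS_A, DMS_B)
    if clash:
        print("Prefix clash, pick a different prefix:", sorted(clash))
    else:
        print("No shared prefixes found.")
```

In practice you would read the two DB.xml files from disk instead of using the inline strings; an empty result means the two systems would not be writing under the same prefix.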
Kind regards,
Michiel
Hi Koen,
I’m not aware of any specific guidelines beyond the documentation here, which states that “using a self-managed data storage architecture is not recommended.”
Despite that recommendation, it is technically possible for two or more DataMiner systems to share the same Cassandra (and OpenSearch) cluster. We have a few customers running this setup. In such cases each DMS must use a unique DB prefix (configured in DB.xml); as long as the prefixes differ, each system will read from and write to its respective tables.
The main challenge with this architecture is sizing the shared DB cluster (nodes, disk, CPU, memory, etc.) so it can cope with multiple DMS instances. Proper dimensioning depends on many variables — expected load, amount of trending data, element update frequency, desired performance, redundancy requirements, and so on — which makes it difficult to capture in a short, universal guideline.
If you want to proceed with a shared-cluster architecture, I’d start by determining the requirements for each DMS independently, then use our dimensioning tools to estimate cluster needs per system and combine those estimates into a final cluster sizing. Team Infrastructure can also assist with sizing and deployment recommendations.
Hope this helps!