I am reaching out to the DataMiner community to get an idea of what kind of database and table sizes users are seeing out there, along with any performance-related feedback for those different sizes. I'm trying to find out what might be considered a large database/table size and how other systems are handling it. If we can get a large sample of differently sized systems and how they are behaving, it may help the entire community prepare for, and deal with, any database-size-related issues.

As an example, we have a particular protocol/element in all our systems that has a lot of average trending active, which results in these elements' average trend tables reaching file sizes of 25 to 35 GB, and on some systems even >100 GB. On some of those systems the full backup now takes over 5 hours to complete, and I can only guess how long the Optimize process is taking and what kind of issues that is causing. All our systems use enterprise-grade SSDs, but we run DataMiner and MySQL on a single Xeon Gold CPU DMA with 64 GB of RAM.

What is considered a large table? What are other users seeing?
Thanks in advance and I look forward to any and all responses.
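In case it helps with comparing numbers, this is roughly how we pull the per-table sizes out of MySQL (a minimal sketch; the connection details and schema name are placeholders, so adjust them to your own setup):

```python
# Minimal sketch: list the largest tables in the MySQL schema with their on-disk
# and reclaimable (data_free) sizes. Connection details and the schema name are
# placeholders; adjust them to your own setup.
import mysql.connector  # pip install mysql-connector-python

conn = mysql.connector.connect(host="localhost", user="root", password="***")
cur = conn.cursor()
cur.execute(
    """
    SELECT table_name,
           ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2) AS size_gb,
           ROUND(data_free / 1024 / 1024 / 1024, 2)                    AS reclaimable_gb
    FROM information_schema.tables
    WHERE table_schema = %s
    ORDER BY (data_length + index_length) DESC
    LIMIT 20
    """,
    ("dataminer",),  # placeholder schema name
)
for name, size_gb, reclaimable_gb in cur.fetchall():
    print(f"{name}: {size_gb} GB on disk, {reclaimable_gb} GB reclaimable by OPTIMIZE")
conn.close()
```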
Pitching in on the feedback from Alberto
For Cassandra, how "big" you can get indeed depends heavily on the HW infrastructure.
MySQL
Unfortunately, for MySQL the main limit is MySQL itself. We identified that the bottleneck is actually the amount of data you need to DELETE, which in turn also makes the optimize run take a long time. Although with fast enough disk speeds the optimize should not have that much impact.
The delete queries are loaded into the cache, and if you delete more data, faster, than MySQL can handle, it will throttle and lose performance; we eventually had one case with 100 GB of trend data where it simply couldn't delete anymore.
-> suggested size for MySQL is 2-4 GB/element based on your HW
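To illustrate the delete bottleneck described above, here is a small sketch of the batched-delete idea in general. Note that this is not how DataMiner itself purges trend data, and the table/column names are hypothetical:

```python
# Illustration of the batched-delete principle only (NOT DataMiner's own clean-up
# code): purging old rows in small chunks keeps each DELETE statement small enough
# for MySQL to handle, instead of one huge DELETE it has to keep in cache.
# Table and column names are hypothetical placeholders.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="root", password="***", database="dataminer")
cur = conn.cursor()
purged = 0
while True:
    cur.execute(
        "DELETE FROM trend_avg_example "
        "WHERE ts < NOW() - INTERVAL 90 DAY "
        "LIMIT 10000"
    )
    conn.commit()
    if cur.rowcount == 0:
        break
    purged += cur.rowcount
print(f"purged {purged} rows in batches of 10000")
conn.close()
```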
Cassandra
This is an entirely different story, since it is linked to your HW infrastructure.
To give some KPIs: we have systems running with up to 300 GB of trend data (all elements together) that are running fine. However, these are on SSDs and without failover.
The problem with failover is that it needs to keep the data in sync, which consumes a lot of CPU & memory. We also have a weekly compaction to clean up all tombstones; this has less impact but still causes a significant load.
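If you want to see what that compaction is up against on your own node, a quick way to inspect per-table size and tombstone counts is via nodetool (a minimal sketch, assuming nodetool is on the PATH; the keyspace name is a placeholder, so check your own configuration):

```python
# Quick check of per-table size and tombstone pressure on a Cassandra node, using
# the standard nodetool tool that ships with Cassandra. Assumes nodetool is on the
# PATH; the keyspace name below is a placeholder.
import subprocess

out = subprocess.run(
    ["nodetool", "tablestats", "SLDMADB"],  # replace with your own keyspace
    capture_output=True, text=True, check=True,
).stdout

for line in out.splitlines():
    line = line.strip()
    if line.startswith(("Table:", "Space used (live)", "Maximum tombstones per slice")):
        print(line)
```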
The thing you need to pay attention to: the moment prunsrv.exe (the Cassandra process) starts to use up all physical memory, it will in turn push our core processes such as SLNet & SLElement out to the page file on disk (see the sketch further down for a simple way to keep an eye on this).
If, at that point, you don't have SSDs that can cope with the page file toggling, you can encounter noticeable performance issues, up to RTEs/crashes in the software.
-> suggested size for Cassandra is max. 30 GB/table when running on a 64 GB server; this should leave some room for the maintenance actions
More RAM and very fast disks will allow you to increase this number, but testing is advised (watch out for performance issues during compaction & repair).
Running Cassandra on a separate node, you should be able to go up to 300 GB on that single node.
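As referenced above, here is a minimal sketch of how you could keep an eye on the prunsrv.exe memory usage (the warning threshold is just an example, not an official guideline):

```python
# Minimal sketch for the prunsrv.exe warning above: report how much physical memory
# the Cassandra service wrapper is holding and how much the machine still has
# available, so you can spot the point where core DataMiner processes would start
# being pushed to the page file. Requires the psutil package; the 4 GB warning
# threshold is purely illustrative.
import psutil

cassandra_rss = sum(
    p.info["memory_info"].rss
    for p in psutil.process_iter(["name", "memory_info"])
    if p.info["name"] and p.info["name"].lower() == "prunsrv.exe"
)
vm = psutil.virtual_memory()
print(f"prunsrv.exe resident memory: {cassandra_rss / 1024**3:.1f} GB")
print(f"available physical memory:   {vm.available / 1024**3:.1f} GB of {vm.total / 1024**3:.1f} GB")
if vm.available < 4 * 1024**3:
    print("WARNING: little free RAM left; SLNet/SLElement may start hitting the page file")
```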
Cassandra Cluster + Elastic cluster
For bigger systems, where we are talking about significant amounts of alarm & trend data, the Cassandra cluster is the way to go. You can choose the number of nodes, and we now run Cassandra 4.0, which on the Cassandra side contains huge improvements for time-series data (such as our trending), BUT it needs to run on Linux. We have also noticed that Linux handles tombstone cleanup much better than Windows.
-> I don't have any figures from the field yet, but the scalability is really promising here. The big advantage is that you can use the nodes for ALL DMAs, which means we can finally step away from our 1 DB per 1 DMA setup.
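To give an idea of what such a shared cluster looks like from a client's point of view, here is a small sketch using the Python cassandra-driver to list keyspaces and their replication settings (the node addresses are placeholders):

```python
# Sketch: connect to a Cassandra cluster (the same nodes can serve ALL DMAs) and
# list the non-system keyspaces with their replication settings. Contact points
# are placeholders; requires the cassandra-driver package.
from cassandra.cluster import Cluster

cluster = Cluster(["10.0.0.11", "10.0.0.12", "10.0.0.13"])  # your cluster nodes
session = cluster.connect()

rows = session.execute("SELECT keyspace_name, replication FROM system_schema.keyspaces")
for row in rows:
    if not row.keyspace_name.startswith("system"):
        print(row.keyspace_name, row.replication)

cluster.shutdown()
```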
Tuning in to see what comes from the community, but if you have demanding trending requirements, moving to Cassandra is a no-brainer: check whether your SSDs are up to spec and migrate. DataMiner still supports systems on MySQL, but handling larger amounts of data is normally much faster in NoSQL, which is where Cassandra comes in. And yes, you'll need to consider your disk performance too (if your architecture uses a local DB), or the performance of the separate DB (e.g. with a Cassandra cluster). Your TAM will surely be able to evaluate and recommend the best course of action for your system; you may want to consider the introduction of ElasticSearch too.