Hello Dojo,
We are currently retrieving a large amount of data that we are going to save to a table. These entries all have unique names (no longer than 20 characters). We estimate at least 500k rows. Is there a benefit to assigning unique numeric primary keys (stored as text), or is using the name as a string key just as efficient?
Hi Gabriel. I think it's best to simply use the names that you already have as keys, especially since they are relatively short. The keys are stored in the table as strings anyway, even when they are numeric.
Assigning numbers to the entries probably means that you'll need to store a mapping from name to number somewhere. Setting, getting and updating that mapping will also have an impact on performance.
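To make the mapping overhead concrete, here is a minimal sketch in Python. It is not DataMiner code: `NumericKeyMap`, `key_for`, `remove` and `name_as_key` are hypothetical names used only for illustration. It shows the extra state and lookups that numeric keys would need, compared with using the name directly.

```python
# Hypothetical illustration only; none of these names are DataMiner APIs.

class NumericKeyMap:
    """Extra bookkeeping needed when rows get numeric keys instead of names."""

    def __init__(self):
        self._name_to_key = {}  # must be kept in sync with the table
        self._next_id = 1

    def key_for(self, name):
        """Return the numeric key (as a string) for a name, assigning one if new."""
        key = self._name_to_key.get(name)
        if key is None:
            key = str(self._next_id)
            self._name_to_key[name] = key
            self._next_id += 1
        return key

    def remove(self, name):
        """Drop the mapping when the row is deleted."""
        self._name_to_key.pop(name, None)


def name_as_key(name):
    """With unique names, the name itself is the key: no state, no lookup."""
    return name
```

With 500k rows, the mapping is one more structure to build, query on every update, and keep consistent when rows are removed, whereas the name-based key needs none of that.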
As a general remark, please be aware of the performance degradation you risk if your table holds 500k rows. So I'd like to challenge whether there's really a need to save the table. Can it be made volatile instead? Or can you use a logger table instead of a regular table?
Big tables can cause Cassandra compaction issues because of (too) large partition sizes. Additionally, if you save the data, element startup (and DataMiner startup) will be slow.
Thank you for this information, Jan. We are making the table volatile to help Cassandra, as we have had previous issues with tombstones (Cassandra running on a Windows Server). So I guess the keys won’t be saved to Cassandra, but they will still be used to pull the data to the client.