Our system is designed with many saved tables that are used as a messaging mechanism. These tables have ~15 columns (all saved parameters) that define the message structure, and some of the parameters are strings that can range in size from a few hundred bytes to tens of thousands of bytes. A message gets added to the table for the element to process, and once processed it is removed from the table. Under normal conditions rows (messages) get added and then deleted at a relatively high frequency (less than a second apart), and hundreds, or sometimes thousands, of rows can get added one right after the other before they can be processed. Because these tables are saved parameters, every add/delete results in data being written to or deleted from the database. Due to issues with the DataMiner/MySQL interface mechanism, the database eventually ends up containing complete or partial rows that should have been deleted but were not. These rows cause issues the next time the protocol is started, because the old and possibly incomplete data is loaded into the table and processed again, which causes system problems and sometimes makes the element crash on an exception.
Question:
How will DataMiner/Cassandra handle this high-frequency add/delete behavior? Does anyone foresee any potential issues with this in Cassandra? Cassandra testing results have been very promising, and it appears a Cassandra-based system can handle this without issue, but I wanted to reach out to the experts to see if anyone sees any potential problems.
Thanks in advance for any feedback provided.
Jeff,
This has the potential to create problems even for Cassandra. At such a high frequency, we have seen a large number of tombstones (markers that Cassandra writes for deleted rows and keeps until compaction removes them) accumulate in Cassandra, which could lead to general database failures after an element restart.
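To make the tombstone concern a bit more concrete, here is a rough toy model in Python. It is not how Cassandra or DataMiner is implemented; `ToyTable` and `simulate` are hypothetical names used only for illustration. The one real Cassandra detail it borrows is the `gc_grace_seconds` table option (default 864000 seconds, i.e. 10 days), which is the minimum age a tombstone must reach before compaction may drop it.

```python
from dataclasses import dataclass, field

# Cassandra's default gc_grace_seconds (10 days).
GC_GRACE_SECONDS = 864_000


@dataclass
class ToyTable:
    """Toy model (hypothetical): deletes leave tombstones behind."""
    live: dict = field(default_factory=dict)        # key -> payload
    tombstones: dict = field(default_factory=dict)  # key -> deletion time (s)

    def add(self, key, payload):
        self.live[key] = payload

    def delete(self, key, now):
        # A delete does not free anything right away; it records a marker.
        if self.live.pop(key, None) is not None:
            self.tombstones[key] = now

    def compact(self, now):
        # Only tombstones at least as old as the grace period are dropped.
        self.tombstones = {
            k: t for k, t in self.tombstones.items()
            if now - t < GC_GRACE_SECONDS
        }


def simulate(messages_per_second, seconds):
    """Add and immediately delete messages; return (live rows, tombstones)."""
    table = ToyTable()
    key = 0
    for now in range(seconds):
        for _ in range(messages_per_second):
            table.add(key, "payload")
            table.delete(key, now)  # message processed right away
            key += 1
        table.compact(now)
    return len(table.live), len(table.tombstones)


if __name__ == "__main__":
    live, dead = simulate(messages_per_second=100, seconds=60)
    print(f"live rows: {live}, tombstones: {dead}")
```

Even though the table is logically empty after every message, the number of deletion markers in this model keeps growing with throughput, because none of them are old enough to be purged yet.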
I'm not sure how necessary it is to save the table columns, but if all you are doing is using the tables as buffers, I'd recommend not saving any columns and instead marking them as volatile. This will greatly improve the performance of Cassandra and DataMiner, because they no longer have to keep track of so many parameter states after an element restart.
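As a language-neutral illustration of the "buffer only" idea, here is a minimal Python sketch of a message buffer that lives purely in memory. It says nothing about how DataMiner implements volatile tables; `VolatileBuffer` and its methods are hypothetical names. The point is only that adds and removes never touch a database, and that anything still unprocessed is deliberately lost on restart, so no stale or partial rows can be reloaded.

```python
from collections import deque


class VolatileBuffer:
    """In-memory FIFO buffer (hypothetical); contents are lost on restart."""

    def __init__(self):
        self._queue = deque()

    def add(self, message):
        # No database write: the message exists only in memory.
        self._queue.append(message)

    def process_next(self, handler):
        """Pass the oldest message to handler; return False if empty."""
        if not self._queue:
            return False
        handler(self._queue.popleft())
        return True

    def __len__(self):
        return len(self._queue)


if __name__ == "__main__":
    buffer = VolatileBuffer()
    for i in range(3):
        buffer.add(f"message {i}")
    while buffer.process_next(print):
        pass
```

The trade-off is that messages still queued at the moment of a restart are gone, so this only fits cases where the sender can resend or the messages are safe to drop.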
Beyond that, depending on the purpose of these tables, you might want to turn to Elastic for any fast lookups you need. The indexing engine is a much more suitable place for such operations.