Skip to content
DataMiner DoJo

More results...

Generic selectors
Exact matches only
Search in title
Search in content
Post Type Selectors
Search in posts
Search in pages
Search in posts
Search in pages
Log in
Menu
  • Updates & Insights
  • Questions
  • Learning
    • E-learning Courses
    • Empower Replay: Limited Edition
    • Tutorials
    • Open Classroom Training
    • Certification
      • DataMiner Fundamentals
      • DataMiner Configurator
      • DataMiner Automation
      • Scripts & Connectors Developer: HTTP Basics
      • Scripts & Connectors Developer: SNMP Basics
      • Visual Overview – Level 1
      • Verify a certificate
    • YouTube Videos
    • Solutions & Use Cases
      • Solutions
      • Use Case Library
    • Agility
      • Book your Agile Fundamentals training
      • Book you Kanban workshop
      • Learn more about Agile
        • Agile Webspace
        • Everything Agile
          • The Agile Manifesto
          • Best Practices
          • Retro Recipes
        • Methodologies
          • The Scrum Framework
          • Kanban
          • Extreme Programming
        • Roles
          • The Product Owner
          • The Agile Coach
          • The Quality & UX Coach (QX)
    • >> Go to DataMiner Docs
  • DevOps
    • About the DevOps Program
    • Sign up for the DevOps Pogram
    • DataMiner DevOps Support
    • Feature Suggestions
  • Swag Shop
  • Downloads
  • PARTNERS
    • All Partners
    • Technology Partners
    • Strategic Partner Program
    • Solutions
    • Deal Registration
  • Contact
    • Sales, Training & Certification
    • DataMiner Support
    • Global Feedback Survey
  • >> Go to dataminer.services

Cassandra average row size + writes/s or writes/month

Solved706 views14th July 2023
2
Gert Bracke 19th May 2022 0 Comments

For Mysql the average row size is somewhere between 100B and 200B as most rows are in the data_ .. and data_avg_.. tables

How does this compare to cassandra? Can somebody share a average rowsize on cassandra?

Or similar question does anybody have an idea how the number of  row writes/s on mysql compare to the number of row writes/s on cassandra (not including the replication) before and after migration

Marieke Goethals [SLC] [DevOps Catalyst] Selected answer as best 14th July 2023

1 Answer

  • Active
  • Voted
  • Newest
  • Oldest
0
Michiel Saelen [SLC] [DevOps Enabler]5.69K Posted 20th May 2022 2 Comments

Hi Gert,

There is no way to say with certainty that a table of 1G in MySQL will be 2G in Cassandra. Cassandra does not work like MySQL, it will put all incoming data on disk and will trigger compaction (to order/place data on disk more efficiently). The below screenshot gives you an idea of what this should look like for disk usage. Knowing how much space Cassandra will take on a disk will depend on a number of things. Another thing to know is that in MySQL, there is a restraint on time and the number of rows for your data. It could be that if you configured to keep alarms/trend data for one year, you actually only have one month of data because you are reaching the row limit. In Cassandra, we don’t have that restraint on the number of rows. Therefore, it could be that for one customer a MySQL system goes from 4G to 8G and for another, it goes from 50G to 500G.

Assuming you are moving towards the Cassandra/Elastic Cluster architecture where you have one Cassandra and Elastic cluster for your complete DMS. For the average trending (stored in Cassandra), we can give you an idea as this is the easiest to predict (fixed intervals to store it in DB). All have roughly 100B/entry (if you have large keys for your tables it might be more). The trend_data_short (5min) will be roughly 1MB/parameter (with a default TTL of 1 month). The trend_data_medium (1h) will be roughly 1MB/parameter (with a default TTL of 1 year). The trend_data_long (1d) will be roughly 40kB/parameter (with a default TTL of 1 year). For Real-time trending, it really depends on how fast your parameters are updating, but you can make an estimation if you know what parameters you are trending and how frequent they will update. The above should be an overestimation assuming that your trend data is being compacted correctly. Keep in mind that you also have other data in Cassandra such as saved parameters.


Figure: The disk usage of the trend_data_short (5min records) for one node in a three-node Cassandra cluster. The gaps in the trend are because the monitoring was not enabled during that time. This trend graph is given by the Apache Cassandra Cluster Monitor connector.

Marieke Goethals [SLC] [DevOps Catalyst] Selected answer as best 14th July 2023
Gert Bracke [DevOps Advocate] commented 20th May 2022

Thank you for your answer but my question is not about database size but about row size.

We are looking at aws for example they are charging $/m rows inserted and $/m rows ttl deleted. Currently the rows inserted are basicaly key, timestamp, value.
Is that the same for cassandra or are you storing for example timestamp { key1=value1, key2=value2…}
As that would be more efficient and less costly

Michiel Saelen [SLC] [DevOps Enabler] commented 20th May 2022

Hi Gert,

Maybe this was not completely clear in my answer, but the 100B/entry means 100B for every row in the Cassandra DB (for trend data). If you enable average trending for a parameter it will automatically store 5min, 1h and day records towards the DB. And of course, this means that every 5min, every hour and every day a record for that parameter will be written towards DB. The default TTLs are mentioned in the above answer, so you can expect after the TTL that data will expire at the same rate as it comes in.

Please login to be able to comment or post an answer.

My DevOps rank

DevOps Members get more insights on their profile page.

My user earnings

0 Dojo credits

Spend your credits in our swag shop.

0 Reputation points

Boost your reputation, climb the leaderboard.

Promo banner DataMiner DevOps Professiona Program
DataMiner Integration Studio (DIS)
Empower Katas
Privacy Policy • Terms & Conditions • Contact

© 2025 Skyline Communications. All rights reserved.

DOJO Q&A widget

Can't find what you need?

? Explore the Q&A DataMiner Docs