How necessary is Cassandra compaction as it requires a lot of resources?

Solved, 1.68K views, 10th July 2023, Cassandra compaction
8
Edson Alfaro [SLC] [DevOps Advocate] 1.38K 14th September 2020 0 Comments

How necessary is Cassandra compaction, given that it requires a lot of resources and time?

During the compaction process, which usually takes around 10 to 12 hours for a 150 GB database, the system slows down.

What could happen in the near future if Cassandra compaction is disabled?

Marieke Goethals [SLC] [DevOps Catalyst] Selected answer as best 10th July 2023

2 Answers

20
Michiel Vanthuyne [SLC] [DevOps Enabler] 4.16K Posted 15th September 2020 1 Comment

Short answer: without compaction, Cassandra will perform worse, and you’ll eventually run out of hard drive space.

Long answer: When a write comes in, it is written to the commit log and to the active Memtable for the table. Memtables are later flushed to disk as files called SSTables. SSTables are immutable: the data they contain is never updated or deleted in place. Instead, changes are written to a new Memtable, which is later flushed to a new SSTable. The compaction process merges SSTables together. If a field was updated or deleted, compaction writes only the newest version to the new SSTable and discards the older versions.

Also, entries whose TTL has expired are only removed from disk during compaction.

So without compaction, your disk will fill up with immutable SSTable files containing data that may already have been overwritten or should be deleted.
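To make the merge step concrete, here is a minimal sketch in Python of what compaction does conceptually. It is not Cassandra's actual implementation: the `Cell` type, the `compact` function, and the use of plain dictionaries as stand-ins for SSTables are all assumptions made for illustration, and details such as the grace period Cassandra applies before purging deletes are ignored.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for one stored value in an SSTable."""
    value: Optional[str]              # None marks a delete
    written_at: int                   # write timestamp
    expires_at: Optional[int] = None  # absolute expiry time; None means no TTL


def compact(sstables: list[dict[str, Cell]], now: int) -> dict[str, Cell]:
    """Merge several immutable tables into one new table."""
    newest: dict[str, Cell] = {}
    for table in sstables:
        for key, cell in table.items():
            current = newest.get(key)
            # Keep only the most recently written version of each key.
            if current is None or cell.written_at > current.written_at:
                newest[key] = cell
    # Drop deleted entries and entries whose TTL has expired.
    return {
        key: cell
        for key, cell in newest.items()
        if cell.value is not None
        and (cell.expires_at is None or cell.expires_at > now)
    }


# Two "SSTables": the second one overwrites "a" and deletes "b".
older = {"a": Cell("1", 10), "b": Cell("2", 11), "c": Cell("3", 12, expires_at=50)}
newer = {"a": Cell("9", 20), "b": Cell(None, 21)}

merged = compact([older, newer], now=100)
print(sorted(merged))     # ['a']
print(merged["a"].value)  # 9
```

Before the merge, the two input tables hold five entries between them; afterwards only one live entry remains. Without compaction, all five would stay on disk.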

Note that efficient compaction requires sufficient free hard drive space. It is recommended to keep free space equal to at least half the size of the biggest Cassandra table, so that compaction can run efficiently.

Alexander Gorbunov [SLC] [DevOps Advocate] commented 17th September 2020

From experience:
The amount of free disk space required for compaction is _the same_ as the size of the biggest table, not half the size. If less is available, “nodetool.bat compact” will terminate with an error. So it is very important to run compaction before the database size reaches one half of the disk size (assuming you have a dedicated disk for Cassandra).
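As a rough illustration of that rule of thumb, the check below compares free disk space with the size of the biggest table. It is only a sketch: the function names, the `required_ratio` parameter, and the example sizes are hypothetical, and the table size has to be supplied by you. A ratio of 1.0 reflects the experience described in this comment, while 0.5 would reflect the recommendation in the answer.

```python
import shutil


def free_bytes_on(path: str) -> int:
    """Return the free space, in bytes, on the disk that holds `path`."""
    return shutil.disk_usage(path).free


def has_room_for_compaction(
    largest_table_bytes: int,
    free_bytes: int,
    required_ratio: float = 1.0,  # assumption: 1.0 = same size as the biggest table
) -> bool:
    """Check whether enough free space is left to compact the biggest table."""
    return free_bytes >= largest_table_bytes * required_ratio


GB = 1024 ** 3

# Hypothetical example: a 120 GB table with 100 GB of free disk space.
print(has_room_for_compaction(120 * GB, 100 * GB))       # False
print(has_room_for_compaction(120 * GB, 100 * GB, 0.5))  # True
```

In practice, `free_bytes_on` would be called with the Cassandra data directory to get the second argument.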

2
Rodrigo Salvador [SLC] 50 Posted 15th September 2020 0 Comments

Also note that compaction can be rescheduled as a Windows scheduled task, so perhaps your customer can schedule the compaction to run after peak hours, rather than at the default time of 1 AM.
