We are seeing a warning when running the system checks on the Cassandra cluster:
Category: Repair
Status: Warning
Description: Evaluate if a repair is needed
Suggested Action: The tables 'elementdata_xxxxx_xxx_xxxxx', 'objectreftreeelementtopdown', 'elementdata', 'objectreftransaction', 'analytics_alarmfocus', 'elementdata_xxxxx_xxx_xxxx', 'elementdata_yyyyy_yyy_yyyy', 'elementdata_xxxxx_xxx_xxxx', 'elementdata_yyyyy_yyy_yyyyy', 'analytics_changepoints_v1', 'maskstate', 'elementdata_yyyyy_yyy_yyyy', 'correlationslidingwindow_v2', 'elementdata_xxxxx_xxx_xxxxx', 'elementdata_xxxxx_xxx_xxxx', 'analytics_parameterinfo_v1', 'dveelementinfo', 'spectrum_max_id', 'view_build_status', 'analytics_changepoints_v2', 'datapoints', 'ai_cpalarms', 'elementdata_xxxxx_xxx_xxxx', 'elementlatch', 'elementdata_yyyyy_yyy_yyyy', 'objectreftreeelement', '', 'elementdata_xxxxx_xxx_xxxx', 'analytics_wavestream', 'cmigrationstatus', 'correlationmatchinfo_v2', 'elementdata_xxxxx_xxx_xxxxx' were not repaired within the tombstone removal period. Please increase the gc_grace_seconds or the frequency of the repairs. Repair checks for specific tables can be disabled in the Tables table.
The suggested action is to increase the gc_grace_seconds or the frequency of the repairs.
We are running a repair schedule with a 7-day interval for the keyspaces in CassandraReaper. May I know the recommended repair frequency for these tables?
The tables elementdata_xxxxx_xxx_xxxxx and elementdata_yyyyy_yyy_yyyyy have a gc_grace_seconds of 864000 (10 days), which is higher than the repair interval of 7 days. Since these tables are still listed in the suggested action, may I know whether they require any action?
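For context, this is the comparison I am making, as a small sketch. The table names and gc_grace_seconds values below are illustrative; on a real cluster the actual values would come from system_schema.tables.

```python
# Sketch: flag tables whose gc_grace_seconds is not longer than the
# repair interval. Values here are illustrative, not from our cluster.
REPAIR_INTERVAL_DAYS = 7  # our Reaper schedule

# Hypothetical gc_grace_seconds per table (864000 s = 10 days is the
# Cassandra default).
tables = {
    "elementdata_xxxxx_xxx_xxxxx": 864000,  # 10 days
    "analytics_wavestream": 432000,         # 5 days
}

def needs_action(gc_grace_seconds, repair_interval_days=REPAIR_INTERVAL_DAYS):
    """A table is at risk when a full repair cycle cannot complete
    before its tombstones become eligible for removal."""
    return gc_grace_seconds <= repair_interval_days * 86400

for name, gc in tables.items():
    print(name, "at risk" if needs_action(gc) else "ok")
```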
Hi Dennis,
Nothing major goes wrong if you don't attend to this immediately. The worst case is that you see data reappear that should already have been deleted, so-called 'zombie' data. This can occur when your RF is higher than one (multiple nodes hold a copy) and you run a delete that is not received by all replicas, either directly or through hint files. This is why gc_grace_seconds exists (the time before tombstones may be removed): as long as the tombstone survives, a repair can tell the other nodes that a delete was received after the insert. A high gc_grace_seconds keeps tombstones around longer, which gives you more grace time to run repairs and avoid this situation, but it can hurt performance and disk usage because you may accumulate lots of tombstones.
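To make the zombie-data scenario concrete, here is a toy model of two replicas (RF=2). It is not real Cassandra code and the timestamps are made up; it only shows what happens when a delete reaches one replica and its tombstone is purged before a repair runs:

```python
# Toy model of delete resurrection ("zombie data") with RF=2.
# Each replica stores key -> (value, timestamp); TOMBSTONE marks a delete.
TOMBSTONE = object()

replica_a = {"k": ("v1", 1)}
replica_b = {"k": ("v1", 1)}

# Delete at timestamp 2 reaches only replica A (B was down, no hints).
replica_a["k"] = (TOMBSTONE, 2)

# gc_grace_seconds elapses with no repair: A purges the tombstone,
# losing all memory that the delete ever happened.
del replica_a["k"]

def reconcile(*replicas):
    """Reconcile replicas the way a repair/read would: latest timestamp wins."""
    cells = [r["k"] for r in replicas if "k" in r]
    return max(cells, key=lambda c: c[1], default=None)

value, ts = reconcile(replica_a, replica_b)
print(value)  # the deleted value "v1" comes back: zombie data
```

Had the repair run within gc_grace_seconds, replica A would still have held the timestamp-2 tombstone, which wins over B's timestamp-1 insert and propagates the delete instead.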
If your gc_grace_seconds is set to 10 days and your repair interval is 7 days, there should be no problem. Have a look at your repair history to understand when the repairs were triggered on your nodes and whether they were successful.
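As a sketch of that check: for each node, compare the time since its last successful repair against gc_grace_seconds. The node names and timestamps below are hypothetical; in practice you would read them from Reaper or your repair history.

```python
# Sketch: is any node's last successful repair older than gc_grace_seconds?
# All timestamps here are made-up examples.
from datetime import datetime, timedelta

GC_GRACE = timedelta(seconds=864000)  # 10 days

last_repair = {
    "node1": datetime(2024, 4, 25),  # 15 days ago -> overdue
    "node2": datetime(2024, 5, 6),   # 4 days ago -> fine
}
now = datetime(2024, 5, 10)

for node, t in last_repair.items():
    age = now - t
    print(node, "overdue" if age > GC_GRACE else f"ok, {age.days}d since repair")
```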