The Cassandra Cluster migrator has finished running on a production DMA (release 10.2 CU6). All tables display 'Complete' except the alarm table, which is shown as 'Incomplete':
Inside the database folder, the AlarmFailedTokenRanges file has this content:
I am looking for advice on these questions:
Is this something we need to worry about?
What else should I check?
If we press Restart for the alarm table, will we again have to wait more than 24 hours for it to finish?
Note that we haven't clicked the 'Finalize Migration' button yet. Will do it tomorrow under a short maintenance window.
Thanks!
Hey Paulo,
The failedTokenRanges file contains the ranges where there was an issue migrating the alarms. Effectively, the alarms in those ranges have NOT been migrated. This could be due to a temporary loss of connection or because the data in those ranges could not be read (due to tombstones).
Clicking Restart will cause the system to try the migration again ONLY on those failed ranges, splitting them into smaller ranges to avoid running into tombstone-related issues and thus increasing the likelihood of success.
Should the migration keep failing, you should be able to use the ranges in these files to investigate the alarms involved and then decide what needs to be fixed or whether they can be skipped.
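To illustrate the idea of retrying on smaller ranges, here is a minimal Python sketch that splits one inclusive token range into contiguous sub-ranges. This is not the migrator's actual algorithm, which isn't described in this thread; the function name `split_token_range` and the `parts` parameter are made up for the example.

```python
def split_token_range(start: int, end: int, parts: int) -> list[tuple[int, int]]:
    """Split the inclusive range [start, end] into at most `parts`
    contiguous, non-overlapping inclusive sub-ranges.

    Hypothetical helper for illustration only; the real migrator may
    split ranges differently.
    """
    if start > end:
        raise ValueError("start must not be greater than end")
    if parts < 1:
        raise ValueError("parts must be at least 1")

    total = end - start + 1          # number of tokens in the range
    parts = min(parts, total)        # cannot have more parts than tokens
    base, extra = divmod(total, parts)

    ranges = []
    lo = start
    for i in range(parts):
        # The first `extra` sub-ranges get one additional token.
        size = base + (1 if i < extra else 0)
        hi = lo + size - 1
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges
```

A smaller range means fewer rows (and fewer tombstones) per read, so a single problematic row only blocks its own small sub-range instead of the whole original range.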
Is it normal that the screenshot says ‘0 rows failed’ for the Alarm table?
If you wish to see what data failed, you can use the following query:
SELECT * FROM alarm WHERE token(r) >= X AND token(r) <= Y;
where X is the first value of a line in the alarmFailedTokenRanges file and Y is the second value.
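If the file has many lines, generating the queries by hand gets tedious. Below is a rough Python sketch that turns each line into one query. The file layout isn't shown in this thread, so this assumes two integer values per line separated by whitespace, a comma, or a semicolon; adjust the parsing if your file looks different. The function names are made up, and `r` is simply the column name used in the query above.

```python
import re


def parse_ranges(text: str) -> list[tuple[int, int]]:
    """Return (first, second) token pairs, one per non-empty line.

    Assumption: values are separated by whitespace, ',' or ';'.
    Lines with fewer than two values are skipped.
    """
    ranges = []
    for line in text.splitlines():
        fields = [f for f in re.split(r"[\s,;]+", line.strip()) if f]
        if len(fields) < 2:
            continue
        ranges.append((int(fields[0]), int(fields[1])))
    return ranges


def build_queries(ranges, table: str = "alarm", key: str = "r") -> list[str]:
    """Build one CQL SELECT per failed token range."""
    return [
        f"SELECT * FROM {table} WHERE token({key}) >= {lo} AND token({key}) <= {hi};"
        for lo, hi in ranges
    ]


if __name__ == "__main__":
    # Made-up sample values, not taken from a real AlarmFailedTokenRanges file.
    sample = "-100,200\n300 400\n"
    for query in build_queries(parse_ranges(sample)):
        print(query)
```

The printed statements can then be pasted into cqlsh one by one to inspect the rows in each failed range.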
But it's indeed better to restart a few times first, so that if there's a corrupt row, the failing range is narrowed down.