Hi Community,
I just set up a 3-server Cassandra cluster.
Cassandra and Cassandra-Reaper were running fine on the first node.
Then I tried to start Cassandra and Cassandra Reaper on the second server.
The Cassandra-Reaper start fails, but I don't see any errors in the log file (even with debug level).
After I stopped Cassandra-Reaper on the first node, I can't start it there either.
The only thing I noticed in the log: Cassandra-Reaper connects not only to the local server, but also to the other server which is not specified in the Cassandra-Reaper config.
Is this expected?
nodetool status:
Datacenter: xyz
==================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.2.163  264.7 KiB   16      51.2%             bb7e5e1b-92bf-4711-b20b-4a3f96327fe7  rack1
UN  192.168.2.162  305.52 KiB  16      48.8%             ca386982-cca8-4c09-ad35-c63c475700e9  rack1
Parts of the Log:
DEBUG [main] c.d.d.c.Cluster - Starting new cluster with contact points [/192.168.2.163:9042]
...
DEBUG [main] c.d.d.c.H.STATES - [Control connection] established to /192.168.2.163:9042
INFO [main] c.d.d.c.p.DCAwareRoundRobinPolicy - Using data-center name 'xyz' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
INFO [main] c.d.d.c.Cluster - New Cassandra host /192.168.2.163:9042 added
INFO [main] c.d.d.c.Cluster - New Cassandra host /192.168.2.162:9042 added
DEBUG [main] c.d.d.c.H.STATES - [/192.168.2.163:9042] preparing to open 1 new connections, total = 2
DEBUG [main] c.d.d.c.H.STATES - [/192.168.2.162:9042] preparing to open 1 new connections, total = 1
DEBUG [MCC R1-nio-worker-1] c.d.d.c.Connection - Connection[/192.168.2.163:9042-2, inFlight=0, closed=false] Connection established, initializing transport
DEBUG [MCC R1-nio-worker-2] c.d.d.c.Connection - Connection[/192.168.2.162:9042-1, inFlight=0, closed=false] Connection established, initializing transport...
Log ends with:
INFO [main] i.c.m.j.JmxManagementConnectionFactory - Initializing JMX seed list for all clusters...
INFO [main] i.c.m.j.JmxManagementConnectionFactory - Initialized JMX seed list for all clusters.
DEBUG [main] o.e.j.u.c.ContainerLifeCycle - ServletHandler@37df14d1{STOPPED} added {crossOriginRequests==org.eclipse.jetty.servlets.CrossOriginFilter@2a4f8009{inst=false,async=true,src=EMBEDDED:null},AUTO}
DEBUG [main] o.e.j.u.c.ContainerLifeCycle - ServletHandler@37df14d1{STOPPED} added {[/*]/[]/[ERROR, FORWARD, REQUEST, ASYNC, INCLUDE]=>crossOriginRequests,POJO}
INFO [main] i.c.ReaperApplication - creating and registering health checks
INFO [main] i.c.ReaperApplication - creating resources and registering endpoints
Service status:
Active: failed (Result: exit-code)
Duration: 2.805s
Docs: http://cassandra-reaper.io/
Process: 19805 ExecStart=/usr/local/bin/cassandra-reaper (code=exited, status=1/FAILURE)
Main PID: 19805 (code=exited, status=1/FAILURE)
CPU: 8.285s
cassandra-reaper.yaml (partially):
segmentCountPerNode: 64
repairParallelism: DATACENTER_AWARE
repairIntensity: 0.9
scheduleDaysBetween: 7
repairRunThreadCount: 15
hangingRepairTimeoutMins: 30
storageType: cassandra
enableCrossOrigin: true
incrementalRepair: false
blacklistTwcsTables: true
enableDynamicSeedList: true
repairManagerSchedulingIntervalSeconds: 10
activateQueryLogger: false
jmxConnectionTimeoutInSeconds: 5
useAddressTranslator: false
maxParallelRepairs: 2
# If jmx access is restricted to localhost, then configure to SIDECAR.
datacenterAvailability: SIDECAR
cassandra:
  clusterName: "xyz"
  contactPoints: ["192.168.2.163"]
  keyspace: reaper_db
  loadBalancingPolicy:
    type: tokenAware
    shuffleReplicas: true
    subPolicy:
      type: dcAwareRoundRobin
      localDC:
      usedHostsPerRemoteDC: 0
      allowRemoteDCsForLocalConsistencyLevel: false
  authProvider:
    type: plainText
    username: some
    password: thing
autoScheduling:
  enabled: true
  initialDelayPeriod: PT15S
  periodBetweenPolls: PT10M
  timeBeforeFirstSchedule: PT5M
  scheduleSpreadPeriod: PT6H
Any idea how to solve or further troubleshoot?
Thanks,
Felix
Thanks Wale!
I added the second available IP to contactPoints and inserted the cluster name in localDC.
When I run sudo journalctl -u cassandra-reaper.service, I see:
Mar 20 13:42:16 xyz-cassandra-02 systemd[1]: Started cassandra-reaper.service – Reaper for Apache Cassandra.
Mar 20 13:42:16 xyz-cassandra-02 cassandra-reaper[25157]: Using reaper in target
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: No repair unit exists for 87f6a6c0-d158-11ee-b1>
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2052)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache.get(LocalCache.java:3943)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3967)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4952)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4958)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.storage.repairunit.CassandraRepairUnitDao.getRepairUnit(CassandraRepairUnitDao.java:121)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.RepairScheduleService.registerScheduleMetrics(RepairScheduleService.java:150)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.RepairScheduleService.lambda$registerRepairScheduleMetrics$0(RepairScheduleService.java:144)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.RepairScheduleService.registerRepairScheduleMetrics(RepairScheduleService.java:144)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.RepairScheduleService.<init>(RepairScheduleService.java:52)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.RepairScheduleService.create(RepairScheduleService.java:57)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.ClusterRepairScheduler.<init>(ClusterRepairScheduler.java:55)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.resources.ClusterResource.<init>(ClusterResource.java:89)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.resources.ClusterResource.create(ClusterResource.java:108)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.ReaperApplication.run(ReaperApplication.java:212)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.ReaperApplication.run(ReaperApplication.java:87)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:59)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:98)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.dropwizard.cli.Cli.run(Cli.java:78)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.dropwizard.Application.run(Application.java:94)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.ReaperApplication.main(ReaperApplication.java:99)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: Caused by: java.lang.IllegalArgumentException: No repair unit exists for 87f6a6c0-d158-11ee-b1ce-71e1216ef784
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.storage.repairunit.CassandraRepairUnitDao.getRepairUnitImpl(CassandraRepairUnitDao.java:116)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.storage.repairunit.CassandraRepairUnitDao.access$000(CassandraRepairUnitDao.java:40)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.storage.repairunit.CassandraRepairUnitDao$1.load(CassandraRepairUnitDao.java:51)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.storage.repairunit.CassandraRepairUnitDao$1.load(CassandraRepairUnitDao.java:49)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3524)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2273)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2156)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2046)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: … 21 more
Mar 20 13:42:18 xyz-cassandra-02 systemd[1]: cassandra-reaper.service: Main process exited, code=exited, status=1/FAILURE
Mar 20 13:42:18 xyz-cassandra-02 systemd[1]: cassandra-reaper.service: Failed with result ‘exit-code’.
Mar 20 13:42:18 xyz-cassandra-02 systemd[1]: cassandra-reaper.service: Consumed 7.430s CPU time.
What’s next?
An unusual error indeed, so let's check more of the underlying setup.
1) Stop Reaper on all nodes.
2) Drop the reaper_db keyspace on all nodes and create it again with a replication factor of 2, or 3 if you have 3 nodes.
3) Ensure the same Reaper yml file is on all nodes.
4) Start Reaper.
Which Linux distro, Cassandra version and Reaper version are you using?
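For step 2, here is a minimal sketch of the CQL involved, written as a small Python helper that only builds the statements so you can paste them into cqlsh. The helper name is hypothetical, and using NetworkTopologyStrategy keyed by the datacenter name 'xyz' is my assumption; the thread only fixes the keyspace name and the replication factor. Schema changes propagate through the cluster, so running the statements once from any node should be enough.

```python
def recreate_keyspace_cql(keyspace, datacenter, replication_factor):
    """Build the CQL for step 2: drop the keyspace, then recreate it.

    Assumption: NetworkTopologyStrategy with one datacenter. Adjust the
    replication map if your cluster is laid out differently.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    replication = (
        "{'class': 'NetworkTopologyStrategy', "
        f"'{datacenter}': {replication_factor}}}"
    )
    return [
        f"DROP KEYSPACE IF EXISTS {keyspace};",
        f"CREATE KEYSPACE {keyspace} WITH replication = {replication};",
    ]


# Print the statements for the cluster described in this thread.
for statement in recreate_keyspace_cql("reaper_db", "xyz", 2):
    print(statement)
```

Keep Reaper stopped on every node while you run these (step 1), and only start it again afterwards (step 4). Note that dropping the keyspace discards everything Reaper has stored in it.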
Hi Wale,
Sorry for the delay.
We are using the latest Debian, Cassandra 4.1.4, and Reaper 3.5.0.
Indeed, dropping reaper_db and recreating it solved the issue.
Thanks for your support!
Nice weekend,
Felix
I get the impression that this question has already been answered in the comments. If this is indeed the case, could you select this answer to indicate that the question can be closed?
A couple of things:
1) Is there any useful info in the system logs (/var/log/messages) when you try to start Reaper on the first node?
2) The Reaper yml file should list all the node IPs in your cluster under 'contactPoints:'.
3) The Reaper yml file should indicate the datacenter in 'localDC:'.
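To make points 2 and 3 concrete, here is a rough sketch of a sanity check over the 'cassandra:' section of the config. It assumes the section has already been loaded into a plain Python dict with the same keys as the posted cassandra-reaper.yaml; the function name is hypothetical and nothing here is part of Reaper itself.

```python
def check_reaper_cassandra_config(cassandra_cfg, node_ips):
    """Return a list of problems found in the `cassandra:` config section."""
    problems = []

    # Point 2: every node IP should appear in contactPoints.
    contact_points = cassandra_cfg.get("contactPoints") or []
    missing = [ip for ip in node_ips if ip not in contact_points]
    if missing:
        problems.append("contactPoints is missing: " + ", ".join(missing))

    # Point 3: localDC under the dcAwareRoundRobin sub-policy must be set.
    lb_policy = cassandra_cfg.get("loadBalancingPolicy") or {}
    sub_policy = lb_policy.get("subPolicy") or {}
    if not sub_policy.get("localDC"):
        problems.append("localDC is empty")

    return problems


# The section as originally posted: one contact point and an empty localDC.
posted = {
    "contactPoints": ["192.168.2.163"],
    "loadBalancingPolicy": {
        "type": "tokenAware",
        "subPolicy": {"type": "dcAwareRoundRobin", "localDC": None},
    },
}
nodes = ["192.168.2.163", "192.168.2.162"]  # taken from nodetool status
for problem in check_reaper_cassandra_config(posted, nodes):
    print(problem)
```

With both IPs in contactPoints and localDC set to the datacenter name shown by nodetool status ('xyz' here), the function returns an empty list.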