DataMiner DoJo


Cassandra-Reaper connects to multiple nodes in sidecar mode and fails

Solved | 920 views | 29th May 2024 | Tags: Cassandra, Cassandra Reaper
1
Felix Wesemeier [DevOps Catalyst] (1.81K) | 20th March 2024 | 4 Comments

Hi Community,

I just set up a three-server Cassandra cluster.

Cassandra and Cassandra-Reaper were running fine on the first node.

Then I tried to start Cassandra and Cassandra-Reaper on the second server.

Cassandra-Reaper fails to start there, but I don't see any errors in the log file (even at debug level).

After I stopped Cassandra-Reaper on the first node, I can't start it there either.

The only thing I noticed in the log: Cassandra-Reaper connects not only to the local server, but also to the other server, which is not specified in the Cassandra-Reaper config.

Is this expected?

nodetool status:

Datacenter: xyz
==================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.2.163  264.7 KiB   16      51.2%             bb7e5e1b-92bf-4711-b20b-4a3f96327fe7  rack1
UN  192.168.2.162  305.52 KiB  16      48.8%             ca386982-cca8-4c09-ad35-c63c475700e9  rack1

Parts of the Log:

DEBUG [main] c.d.d.c.Cluster - Starting new cluster with contact points [/192.168.2.163:9042]

...

DEBUG [main] c.d.d.c.H.STATES - [Control connection] established to /192.168.2.163:9042

INFO [main] c.d.d.c.p.DCAwareRoundRobinPolicy - Using data-center name 'xyz' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
INFO [main] c.d.d.c.Cluster - New Cassandra host /192.168.2.163:9042 added
INFO [main] c.d.d.c.Cluster - New Cassandra host /192.168.2.162:9042 added

DEBUG [main] c.d.d.c.H.STATES - [/192.168.2.163:9042] preparing to open 1 new connections, total = 2
DEBUG [main] c.d.d.c.H.STATES - [/192.168.2.162:9042] preparing to open 1 new connections, total = 1
DEBUG [MCC R1-nio-worker-1] c.d.d.c.Connection - Connection[/192.168.2.163:9042-2, inFlight=0, closed=false] Connection established, initializing transport
DEBUG [MCC R1-nio-worker-2] c.d.d.c.Connection - Connection[/192.168.2.162:9042-1, inFlight=0, closed=false] Connection established, initializing transport

...

Log ends with:

INFO [main] i.c.m.j.JmxManagementConnectionFactory - Initializing JMX seed list for all clusters...
INFO [main] i.c.m.j.JmxManagementConnectionFactory - Initialized JMX seed list for all clusters.
DEBUG [main] o.e.j.u.c.ContainerLifeCycle - ServletHandler@37df14d1{STOPPED} added {crossOriginRequests==org.eclipse.jetty.servlets.CrossOriginFilter@2a4f8009{inst=false,async=true,src=EMBEDDED:null},AUTO}
DEBUG [main] o.e.j.u.c.ContainerLifeCycle - ServletHandler@37df14d1{STOPPED} added {[/*]/[]/[ERROR, FORWARD, REQUEST, ASYNC, INCLUDE]=>crossOriginRequests,POJO}
INFO [main] i.c.ReaperApplication - creating and registering health checks
INFO [main] i.c.ReaperApplication - creating resources and registering endpoints

Service status:

Active: failed (Result: exit-code)
Duration: 2.805s
Docs: http://cassandra-reaper.io/
Process: 19805 ExecStart=/usr/local/bin/cassandra-reaper (code=exited, status=1/FAILURE)
Main PID: 19805 (code=exited, status=1/FAILURE)
CPU: 8.285s

cassandra-reaper.yaml (partially):

segmentCountPerNode: 64
repairParallelism: DATACENTER_AWARE
repairIntensity: 0.9
scheduleDaysBetween: 7
repairRunThreadCount: 15
hangingRepairTimeoutMins: 30
storageType: cassandra
enableCrossOrigin: true
incrementalRepair: false
blacklistTwcsTables: true
enableDynamicSeedList: true
repairManagerSchedulingIntervalSeconds: 10
activateQueryLogger: false
jmxConnectionTimeoutInSeconds: 5
useAddressTranslator: false
maxParallelRepairs: 2

# If jmx access is restricted to localhost, then configure to SIDECAR.

datacenterAvailability: SIDECAR

cassandra:
  clusterName: "xyz"
  contactPoints: ["192.168.2.163"]
  keyspace: reaper_db
  loadBalancingPolicy:
    type: tokenAware
    shuffleReplicas: true
    subPolicy:
      type: dcAwareRoundRobin
      localDC:
      usedHostsPerRemoteDC: 0
      allowRemoteDCsForLocalConsistencyLevel: false
  authProvider:
    type: plainText
    username: some
    password: thing
autoScheduling:
  enabled: true
  initialDelayPeriod: PT15S
  periodBetweenPolls: PT10M
  timeBeforeFirstSchedule: PT5M
  scheduleSpreadPeriod: PT6H

Any idea how to solve or further troubleshoot?

Thanks,

Felix

Felix Wesemeier [DevOps Catalyst] Selected answer as best 29th May 2024
Wale Oguntoyinbo [SLC] [DevOps Advocate] commented 20th March 2024

A couple of things:
1) Is there any useful info in the system logs (/var/log/messages) when you try to start Reaper on the first node?
2) The Reaper yml file should list all the node IPs of your cluster under ‘contactPoints:’.
3) The Reaper yml file should indicate the datacenter in ‘localDC:’.
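
For point 3, ‘localDC:’ takes the datacenter name as Cassandra reports it, which is not necessarily the same as the cluster name. A minimal sketch for reading that name from the nodetool status output; dc_names is a hypothetical helper name, and the sample input is the header posted in the question:

```shell
# Print the datacenter name(s) found in `nodetool status` output read on stdin.
# dc_names is a hypothetical helper, not part of Cassandra or Reaper.
dc_names() {
  awk '/^Datacenter:/ { print $2 }'
}

# On a Cassandra node this would be used as: nodetool status | dc_names
# Here it is fed the header lines from the question instead.
printf 'Datacenter: xyz\n==================\n' | dc_names
```

The printed value is what goes into ‘localDC:’ in the yml file.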

Felix Wesemeier [DevOps Catalyst] commented 20th March 2024

Thanks Wale!
I added the second available IP to contactPoints and inserted the cluster name in localDC.
Running sudo journalctl -u cassandra-reaper.service, I can see:
Mar 20 13:42:16 xyz-cassandra-02 systemd[1]: Started cassandra-reaper.service - Reaper for Apache Cassandra.
Mar 20 13:42:16 xyz-cassandra-02 cassandra-reaper[25157]: Using reaper in target
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: No repair unit exists for 87f6a6c0-d158-11ee-b1>
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2052)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache.get(LocalCache.java:3943)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3967)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4952)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4958)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.storage.repairunit.CassandraRepairUnitDao.getRepairUnit(CassandraRepairUnitDao.java:121)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.RepairScheduleService.registerScheduleMetrics(RepairScheduleService.java:150)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.RepairScheduleService.lambda$registerRepairScheduleMetrics$0(RepairScheduleService.java:144)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.RepairScheduleService.registerRepairScheduleMetrics(RepairScheduleService.java:144)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.RepairScheduleService.<init>(RepairScheduleService.java:52)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.RepairScheduleService.create(RepairScheduleService.java:57)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.service.ClusterRepairScheduler.<init>(ClusterRepairScheduler.java:55)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.resources.ClusterResource.<init>(ClusterResource.java:89)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.resources.ClusterResource.create(ClusterResource.java:108)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.ReaperApplication.run(ReaperApplication.java:212)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.ReaperApplication.run(ReaperApplication.java:87)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:59)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:98)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.dropwizard.cli.Cli.run(Cli.java:78)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.dropwizard.Application.run(Application.java:94)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.ReaperApplication.main(ReaperApplication.java:99)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: Caused by: java.lang.IllegalArgumentException: No repair unit exists for 87f6a6c0-d158-11ee-b1ce-71e1216ef784
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.storage.repairunit.CassandraRepairUnitDao.getRepairUnitImpl(CassandraRepairUnitDao.java:116)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.storage.repairunit.CassandraRepairUnitDao.access$000(CassandraRepairUnitDao.java:40)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.storage.repairunit.CassandraRepairUnitDao$1.load(CassandraRepairUnitDao.java:51)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at io.cassandrareaper.storage.repairunit.CassandraRepairUnitDao$1.load(CassandraRepairUnitDao.java:49)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3524)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2273)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2156)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2046)
Mar 20 13:42:18 xyz-cassandra-02 cassandra-reaper[25157]: ... 21 more
Mar 20 13:42:18 xyz-cassandra-02 systemd[1]: cassandra-reaper.service: Main process exited, code=exited, status=1/FAILURE
Mar 20 13:42:18 xyz-cassandra-02 systemd[1]: cassandra-reaper.service: Failed with result 'exit-code'.
Mar 20 13:42:18 xyz-cassandra-02 systemd[1]: cassandra-reaper.service: Consumed 7.430s CPU time.
What’s next?

Wale Oguntoyinbo [SLC] [DevOps Advocate] commented 20th March 2024

An unusual error indeed, so let's check more of the underlying setup.
1) Stop Reaper on all nodes.
2) Drop the reaper_db keyspace on all nodes and create it again with a replication factor of 2 (or 3 if you have 3 nodes).
3) Ensure the same Reaper yml file is used on all nodes.
4) Start Reaper.
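
The steps above can be sketched as a small script. This is an illustration under assumptions, not a tested procedure: the thread only specifies the replication factor, so NetworkTopologyStrategy, the systemd unit name taken from the journalctl output, and the run helper are choices made for this example. By default it only prints the commands.

```shell
# Sketch of the reset procedure; prints the commands unless DRY_RUN=0 is set.
set -eu

KEYSPACE="reaper_db"           # keyspace name from cassandra-reaper.yaml
DATACENTER="xyz"               # datacenter name as shown by nodetool status
RF=2                           # replication factor; 3 for a three-node cluster
CONTACT_POINT="192.168.2.163"  # any reachable Cassandra node

# Assumption: NetworkTopologyStrategy; the thread does not name a strategy.
DROP_CQL="DROP KEYSPACE IF EXISTS ${KEYSPACE};"
CREATE_CQL="CREATE KEYSPACE ${KEYSPACE} WITH replication = {'class': 'NetworkTopologyStrategy', '${DATACENTER}': ${RF}};"

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# 1) Stop Reaper (repeat on every node).
run sudo systemctl stop cassandra-reaper.service
# 2) Drop and recreate the keyspace. Schema changes apply cluster-wide,
#    so one run from any node is enough.
run cqlsh "$CONTACT_POINT" -e "$DROP_CQL"
run cqlsh "$CONTACT_POINT" -e "$CREATE_CQL"
# 3) Check by hand that the yml file is identical on all nodes.
# 4) Start Reaper again (repeat on every node).
run sudo systemctl start cassandra-reaper.service
```

With authentication enabled, as in the posted config, cqlsh additionally needs the -u and -p options. The dry-run output drops the shell quoting around the CQL statements, so it is meant for review rather than for copy and paste.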

Which Linux distro, Cassandra version and Reaper version are you using?

Felix Wesemeier [DevOps Catalyst] commented 22nd March 2024

Hi Wale,
sorry for the delay.
We are using the latest Debian, Cassandra 4.1.4 and Reaper 3.5.0.
Indeed, dropping reaper_db and recreating it solved the issue.
Thanks for your support!
Nice weekend,
Felix

1 Answer

Marieke Goethals [SLC] [DevOps Catalyst] (5.46K) | Posted 29th May 2024 | 0 Comments

I get the impression that this question has already been answered in the comments. If this is indeed the case, could you select this answer to indicate that the question can be closed?

© 2025 Skyline Communications. All rights reserved.
