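I have a Cassandra 3.9 cluster. I started a repair from one node in the cluster, but the repair makes no progress. The log on the initiating node is full of errors like this: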
ERROR [GossipTasks:1] 2018-02-16 23:27:36,949 RepairSession.java:347 - [repair #cadf6f11-1342-11e8-8d73-6767c6890f70] session completed with the following error
java.io.IOException: Endpoint /**.**.**.52 died
at org.apache.cassandra.repair.RepairSession.convict(RepairSession.java:346) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:306) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:782) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.gms.Gossiper.access$800(Gossiper.java:66) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:181) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) [apache-cassandra-3.9.jar:3.9]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
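On the other hand, when I look at the logs of the node that is reported as dead, I see one of three symptoms.
The node logs an exception like this: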
ERROR [ValidationExecutor:3] 2018-02-16 23:29:06,548 Validator.java:261 - Failed creating a merkle tree for [repair #cac2bf50-1342-11e8-8d73-6767c6890f70 on somekeyspace/sometable, [(-3531087107126953137,-3495591103116433105], (1424707151780052485,1425479237398192865], (-3533012126945497873,-3531087107126953137], (1425479237398192865,1429220273719165251], (-4991682772598302168,-4984938905452900436], (-7686750611814623539,-7685228552629222537], (7554301216433235881,7559623046999138658], (334796420453180909,342318143371667659], (-3538876023288368831,-3533012126945497873], (1409514567521922418,1424707151780052485], (5391546013321073004,5393284101537339558], (590921410556013711,593440512568877190]]], /..**.43 (see log for details)
ERROR [ValidationExecutor:3] 2018-02-16 23:29:06,549 CassandraDaemon.java:226 - Exception in thread Thread[ValidationExecutor:3,1,main]
java.lang.RuntimeException: Parent repair session with id = c8bf7540-1342-11e8-8d73-6767c6890f70 has failed.
at org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:377) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.db.compaction.CompactionManager.getSSTablesToValidate(CompactionManager.java:1313) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1222) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.db.compaction.CompactionManager.access$700(CompactionManager.java:81) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.db.compaction.CompactionManager$11.call(CompactionManager.java:844) ~[apache-cassandra-3.9.jar:3.9]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Is this a known issue?
Could you point me to how you resolved this? I am facing the same "Endpoint died" problem.