com.hazelcast.cp.exception.NotLeaderException: no leader elected on 3 nodes

Question

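I am using hazelcast-4.2.1 (not the Enterprise edition) on 3 nodes. I tried to enable the cp-subsystem with the Raft consensus protocol by changing the configuration on each node and restarting Hazelcast: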

    <cp-subsystem>
        <cp-member-count>3</cp-member-count>
        <group-size>3</group-size>
        <session-time-to-live-seconds>300</session-time-to-live-seconds>
        <session-heartbeat-interval-seconds>5</session-heartbeat-interval-seconds>
        <missing-cp-member-auto-removal-seconds>14400</missing-cp-member-auto-removal-seconds>
        <fail-on-indeterminate-operation-state>false</fail-on-indeterminate-operation-state>
        <raft-algorithm>
            <leader-election-timeout-in-millis>2000</leader-election-timeout-in-millis>
            <leader-heartbeat-period-in-millis>5000</leader-heartbeat-period-in-millis>
            <max-missed-leader-heartbeat-count>5</max-missed-leader-heartbeat-count>
            <append-request-max-entry-count>100</append-request-max-entry-count>
            <commit-index-advance-count-to-snapshot>10000</commit-index-advance-count-to-snapshot>
            <uncommitted-entry-count-to-reject-new-appends>100</uncommitted-entry-count-to-reject-new-appends>
            <append-request-backoff-timeout-in-millis>100</append-request-backoff-timeout-in-millis>
        </raft-algorithm>
    </cp-subsystem>
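
The `group-size` of 3 above means each CP group runs Raft over three members, and Raft needs a majority of a group to elect a leader or commit an operation. A minimal sketch of that arithmetic follows; `RaftQuorum` and its method names are hypothetical helpers for illustration, not part of the Hazelcast API.

```java
// Sketch: majority arithmetic for a Raft group of a given size.
// RaftQuorum is a hypothetical helper, not a Hazelcast class.
final class RaftQuorum {

    /** Smallest number of members that forms a majority of groupSize. */
    static int majority(int groupSize) {
        if (groupSize < 1) {
            throw new IllegalArgumentException("groupSize must be positive");
        }
        return groupSize / 2 + 1;
    }

    /** Number of members that can be lost while a majority still remains. */
    static int toleratedFailures(int groupSize) {
        return groupSize - majority(groupSize);
    }

    public static void main(String[] args) {
        int groupSize = 3; // matches <group-size>3</group-size> in the config above
        System.out.println("majority=" + majority(groupSize)
                + ", toleratedFailures=" + toleratedFailures(groupSize));
    }
}
```

For the configuration in the question this prints `majority=2, toleratedFailures=1`, so a three-member group can lose one member, but not all three at once.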

The application code is simple:

hazelcastInstance.getCPSubsystem().getLock().lock();

But on each of the 3 nodes I get warnings saying "Known leader is: N/A":

2021-11-23 11:45:45 INFO [MetadataRaftGroupManager] - [172.18.20.166]:5701 [dev] [4.2.1] CP Subsystem is waiting for 3 members to join the cluster. Current member count: 1
2021-11-23 11:45:48 INFO [ClusterService] - [172.18.20.166]:5701 [dev] [4.2.1]

Members {size:3, ver:3} [
        Member [172.18.20.166]:5701 - 66ef130c-a666-4ec0-8e99-cec4cd504bac this
        Member [172.18.20.167]:5701 - 5039c2ea-22dd-4f7c-a134-9367fab4e767
        Member [172.18.20.168]:5701 - 47dd80d4-b983-4ae7-b6bc-c2352416eada
]

2021-11-23 11:45:48 INFO [PartitionStateManager] - [172.18.20.166]:5701 [dev] [4.2.1] Initializing cluster partition table arrangement...
2021-11-23 11:45:49 INFO [RaftService] - [172.18.20.166]:5701 [dev] [4.2.1] RaftNode[CPGroupId{name='METADATA', seed=0, groupId=0}] is created with [RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'}, RaftEndpoint{uuid='5039c2ea-22dd-4f7c-a134-9367fab4e767'}, RaftEndpoint{uuid='66ef130c-a666-4ec0-8e99-cec4cd504bac'}]
2021-11-23 11:45:49 INFO [RaftNode(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1] Status is set to: ACTIVE
2021-11-23 11:45:49 INFO [AuthenticationMessageTask] - [172.18.20.166]:5701 [dev] [4.2.1] Received auth from Connection[id=12, /172.18.20.166:5701->/172.18.20.166:53635, qualifier=null, endpoint=[172.18.20.166]:53635, alive=true, connectionType=MCJVM, planeIndex=-1], successfully authenticated, clientUuid: 7c86a343-8b52-4652-8235-7c0dfdb2f5ad, client version: 4.2
2021-11-23 11:45:51 INFO [PreVoteRequestHandlerTask(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1] Granted pre-vote for PreVoteRequest{candidate=RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'}, nextTerm=1, lastLogTerm=0, lastLogIndex=0}
2021-11-23 11:45:51 INFO [VoteRequestHandlerTask(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1] Moving to new term: 1 from current term: 0 after VoteRequest{candidate=RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'}, term=1, lastLogTerm=0, lastLogIndex=0, disruptive=false}
2021-11-23 11:45:51 INFO [RaftNode(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1]

CP Group Members {groupId: METADATA(0), size:3, term:1, logIndex:0} [
        CPMember{uuid=47dd80d4-b983-4ae7-b6bc-c2352416eada, address=[172.18.20.168]:5701}
        CPMember{uuid=5039c2ea-22dd-4f7c-a134-9367fab4e767, address=[172.18.20.167]:5701}
        CPMember{uuid=66ef130c-a666-4ec0-8e99-cec4cd504bac, address=[172.18.20.166]:5701} - FOLLOWER this
]

2021-11-23 11:45:51 INFO [VoteRequestHandlerTask(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1] Granted vote for VoteRequest{candidate=RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'}, term=1, lastLogTerm=0, lastLogIndex=0, disruptive=false}
2021-11-23 11:45:51 INFO [AppendRequestHandlerTask(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1] Setting leader: RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'}
2021-11-23 11:45:51 INFO [RaftNode(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1]

CP Group Members {groupId: METADATA(0), size:3, term:1, logIndex:0} [
        CPMember{uuid=47dd80d4-b983-4ae7-b6bc-c2352416eada, address=[172.18.20.168]:5701} - LEADER
        CPMember{uuid=5039c2ea-22dd-4f7c-a134-9367fab4e767, address=[172.18.20.167]:5701}
        CPMember{uuid=66ef130c-a666-4ec0-8e99-cec4cd504bac, address=[172.18.20.166]:5701} - FOLLOWER this
]
2021-11-23 11:45:53 INFO [MetadataRaftGroupManager] - [172.18.20.166]:5701 [dev] [4.2.1] CP Subsystem is initialized with: [CPMember{uuid=47dd80d4-b983-4ae7-b6bc-c2352416eada, address=[172.18.20.168]:5701}, CPMember{uuid=5039c2ea-22dd-4f7c-a134-9367fab4e767, address=[172.18.20.167]:5701}, CPMember{uuid=66ef130c-a666-4ec0-8e99-cec4cd504bac, address=[172.18.20.166]:5701}]
2021-11-23 11:45:54 INFO [HealthMonitor] - [172.18.20.166]:5701 [dev] [4.2.1] processors=2, physical.memory.total=5.5G, physical.memory.free=844.4M, swap.space.total=2.0G, swap.space.free=1.2G, heap.memory.used=101.1M, heap.memory.free=857.4M, heap.memory.total=958.5M, heap.memory.max=958.5M, heap.memory.used/total=10.55%, heap.memory.used/max=10.55%, minor.gc.count=2, minor.gc.time=34ms, major.gc.count=2, major.gc.time=96ms, load.process=100.00%, load.system=100.00%, load.systemAverage=1.01, thread.count=47, thread.peakCount=53, cluster.timeDiff=0, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.client.query.size=0, executor.q.client.blocking.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operations.size=0, executor.q.priorityOperation.size=0, operations.completed.count=116, executor.q.mapLoad.size=0, executor.q.mapLoadAllKeys.size=0, executor.q.cluster.size=0, executor.q.response.size=0, operations.running.count=0, operations.pending.invocations.percentage=0.00%, operations.pending.invocations.count=0, proxy.count=1, clientEndpoint.count=10, connection.active.count=12, client.connection.count=10, connection.count=12
2021-11-23 11:46:12 WARN [Invocation] - [172.18.20.166]:5701 [dev] [4.2.1] Retrying invocation: Invocation{op=com.hazelcast.cp.internal.operation.DefaultRaftReplicateOp{serviceName='hz:core:raft', identityHash=309878934, partitionId=185, replicaIndex=0, callId=3364, invocationTime=1637649972379 (2021-11-23 11:46:12.379), waitTimeout=-1, callTimeout=60000, tenantControl=com.hazelcast.spi.impl.tenantcontrol.NoopTenantControl@0, groupId=CPGroupId{name='default', seed=0, groupId=6960}, op=com.hazelcast.cp.internal.session.operation.HeartbeatSessionOp{serviceName='hz:core:raftSession', sessionId=10}}, tryCount=250, tryPauseMillis=500, invokeCount=100, callTimeoutMillis=60000, firstInvocationTimeMs=1637649936372, firstInvocationTime='2021-11-23 11:45:36.372', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 05:00:00.000', target=[172.18.20.168]:5701, pendingResponse={VOID}, backupsAcksExpected=-1, backupsAcksReceived=0, connection=Connection[id=11, /172.18.20.166:5701->/172.18.20.168:51871, qualifier=null, endpoint=[172.18.20.168]:5701, alive=true, connectionType=MEMBER, planeIndex=0]}, Reason: com.hazelcast.cp.exception.NotLeaderException: RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'} is not LEADER of CPGroupId{name='default', seed=0, groupId=6960}. Known leader is: N/A
2021-11-23 11:46:12 WARN [Invocation] - [172.18.20.166]:5701 [dev] [4.2.1] Retrying invocation: Invocation{op=com.hazelcast.cp.internal.operation.DefaultRaftReplicateOp{serviceName='hz:core:raft', identityHash=1764520567, partitionId=185, replicaIndex=0, callId=3366, invocationTime=1637649972380 (2021-11-23 11:46:12.380), waitTimeout=-1, callTimeout=60000, tenantControl=com.hazelcast.spi.impl.tenantcontrol.NoopTenantControl@0, groupId=CPGroupId{name='default', seed=0, groupId=6960}, op=com.hazelcast.cp.internal.session.operation.HeartbeatSessionOp{serviceName='hz:core:raftSession', sessionId=10}}, tryCount=250, tryPauseMillis=500, invokeCount=100, callTimeoutMillis=60000, firstInvocationTimeMs=1637649936136, firstInvocationTime='2021-11-23 11:45:36.136', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 05:00:00.000', target=[172.18.20.168]:5701, pendingResponse={VOID}, backupsAcksExpected=-1, backupsAcksReceived=0, connection=Connection[id=11, /172.18.20.166:5701->/172.18.20.168:51871, qualifier=null, endpoint=[172.18.20.168]:5701, alive=true, connectionType=MEMBER, planeIndex=0]}, Reason: com.hazelcast.cp.exception.NotLeaderException: RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'} is not LEADER of CPGroupId{name='default', seed=0, groupId=6960}. Known leader is: N/A
2021-11-23 11:46:12 WARN [Invocation] - [172.18.20.166]:5701 [dev] [4.2.1] Retrying invocation: Invocation{op=com.hazelcast.cp.internal.operation.DefaultRaftReplicateOp{serviceName='hz:core:raft', identityHash=1136581224, partitionId=185, replicaIndex=0, callId=3416, invocationTime=1637649972878 (2021-11-23 11:46:12.878), waitTimeout=-1, callTimeout=60000, tenantControl=com.hazelcast.spi.impl.tenantcontrol.NoopTenantControl@0, groupId=CPGroupId{name='default', seed=0, groupId=6960}, op=com.hazelcast.cp.internal.session.operation.HeartbeatSessionOp{serviceName='hz:core:raftSession', sessionId=10}}, tryCount=250, tryPauseMillis=500, invokeCount=100, callTimeoutMillis=60000, firstInvocationTimeMs=1637649936875, firstInvocationTime='2021-11-23 11:45:36.875', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 05:00:00.000', target=[172.18.20.167]:5701, pendingResponse={VOID}, backupsAcksExpected=-1, backupsAcksReceived=0, connection=Connection[id=10, /172.18.20.166:5701->/172.18.20.167:36039, qualifier=null, endpoint=[172.18.20.167]:5701, alive=true, connectionType=MEMBER, planeIndex=0]}, Reason: com.hazelcast.cp.exception.NotLeaderException: RaftEndpoint{uuid='5039c2ea-22dd-4f7c-a134-9367fab4e767'} is not LEADER of CPGroupId{name='default', seed=0, groupId=6960}. Known leader is: N/A

These retried invocations failing with com.hazelcast.cp.exception.NotLeaderException keep flooding the log until its end.
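
When reading warnings like the ones above, it can help to pull out which CP group each one refers to. In the log, the METADATA group (groupId 0) does elect a leader, while every NotLeaderException names the `default` group with groupId=6960. The sketch below extracts those fields with a regular expression written against the message text shown above; `NotLeaderLogScan` is a hypothetical helper, and the pattern is an assumption based only on these log lines.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: extract group name, group id and known leader from a NotLeaderException message.
// NotLeaderLogScan is a hypothetical helper; the pattern is derived from the log lines above.
final class NotLeaderLogScan {

    private static final Pattern NOT_LEADER = Pattern.compile(
            "is not LEADER of CPGroupId\\{name='([^']*)', seed=(\\d+), groupId=(\\d+)\\}\\."
                    + " Known leader is: (\\S+)");

    /** Returns a one-line summary for a matching log line, or null if the line does not match. */
    static String summarize(String logLine) {
        Matcher m = NOT_LEADER.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return "group=" + m.group(1) + " groupId=" + m.group(3) + " leader=" + m.group(4);
    }

    public static void main(String[] args) {
        String line = "Reason: com.hazelcast.cp.exception.NotLeaderException: "
                + "RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'} is not LEADER of "
                + "CPGroupId{name='default', seed=0, groupId=6960}. Known leader is: N/A";
        System.out.println(summarize(line));
    }
}
```

Run against the sample line, this prints `group=default groupId=6960 leader=N/A`.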

Question: how do I set up the cp-subsystem so that Raft is enabled with 3 nodes?

Answer 1

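If you restart all members of the cluster, the groupId changes, because the CP state is not kept consistent across restarts without persistence, which is available only in the Enterprise edition. You also need to restart the clients. See https://github.com/hazelcast/hazelcast/issues/17436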
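
One way to check this explanation against the log in the question is to compare timestamps. The retried HeartbeatSessionOp reports firstInvocationTime='2021-11-23 11:45:36.372', while the member logs "CP Subsystem is initialized" only at 2021-11-23 11:45:53. That is consistent with a client still heartbeating a session it held before the restart. A small sketch of the comparison, where `StaleInvocationCheck` is a hypothetical helper and the two timestamp formats are assumed from the log lines:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Sketch: was an invocation first attempted before the CP Subsystem reported initialization?
// StaleInvocationCheck is a hypothetical helper; formats are assumed from the log in the question.
final class StaleInvocationCheck {

    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    private static final DateTimeFormatter INVOCATION_TIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    /** True if firstInvocationTime is strictly earlier than the CP initialization log time. */
    static boolean startedBeforeInit(String firstInvocationTime, String cpInitLogTime) {
        LocalDateTime first = LocalDateTime.parse(firstInvocationTime, INVOCATION_TIME);
        LocalDateTime init = LocalDateTime.parse(cpInitLogTime, LOG_TIME);
        return first.isBefore(init);
    }

    public static void main(String[] args) {
        // Values copied from the log in the question.
        System.out.println(startedBeforeInit("2021-11-23 11:45:36.372", "2021-11-23 11:45:53"));
    }
}
```

With the values from the log this prints `true`, which fits the advice to restart the clients after restarting all members.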
