
Ignite cluster different startup times

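I am getting very different startup/connection times from run to run. My cluster has three server nodes. For testing, I want to run some tasks and cache operations from my client node (which actually lives on one of the three servers). However, when I start the client, it can take up to five minutes until it is properly connected. On another start, with the same client and the same configuration, it takes only a few seconds.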

In the case where the client node startup takes a long time, the difference in the logs is:

[13:35:31,649][INFO][ignite-update-notifier-timer][GridUpdateNotifier] Your version is up to date.
[13:37:21,794][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], node=eec8ea18-ded1-42cd-aec7-2af754644008]. Dumping pending objects that might be the cause: 
[13:37:21,794][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Ready affinity version: AffinityTopologyVersion [topVer=-1, minorTopVer=0]
[13:37:21,802][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Last exchange future: GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=0, lastExchangeTime=1526060111449, loc=true, ver=2.4.0#20180305-sha1:aa342270, isClient=true], topVer=16, nodeId8=eec8ea18, msg=null, type=NODE_JOINED, tstamp=1526060121610], crd=TcpDiscoveryNode [id=c74ff028-1676-4f1a-8c95-563763ea5875, addrs=[127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/192.168.122.1:47500, /127.0.0.1:47500, centos_node_2/192.168.0.162:47500], discPort=47500, order=7, intOrder=5, lastExchangeTime=1526060116544, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=0, lastExchangeTime=1526060111449, loc=true, ver=2.4.0#20180305-sha1:aa342270, isClient=true], topVer=16, nodeId8=eec8ea18, msg=null, type=NODE_JOINED, tstamp=1526060121610], nodeId=eec8ea18, evt=NODE_JOINED], added=true, initFut=GridFutureAdapter [ignoreInterrupts=false, state=DONE, res=true, hash=2024590198], init=true, lastVer=null, partReleaseFut=null, exchActions=ExchangeActions [startCaches=null, stopCaches=null, startGrps=[], stopGrps=[], resetParts=null, stateChangeRequest=null], affChangeMsg=null, initTs=1526060121650, centralizedAff=false, changeGlobalStateE=null, done=false, state=CLIENT, evtLatch=0, remaining=[830bbef7-0344-4955-bdf6-ff90f6d96602, 
b0105fdc-5298-4f80-94ae-2f1bbd8b42e8, c74ff028-1676-4f1a-8c95-563763ea5875], super=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null, hash=189344266]]
[13:37:21,803][WARNING][exchange-worker-#157%Test Cluster%][GridCachePartitionExchangeManager] First 10 pending exchange futures [total=0]
[13:37:21,806][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Last 10 exchange futures (total: 1):
[13:37:21,807][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] >>> GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=0, lastExchangeTime=1526060111449, loc=true, ver=2.4.0#20180305-sha1:aa342270, isClient=true], done=false]
[13:37:21,807][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending transactions:
[13:37:21,807][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending explicit locks:
[13:37:21,807][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending cache futures:
[13:37:21,807][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending atomic cache futures:
[13:37:21,808][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending data streamer futures:
[13:37:21,808][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Pending transaction deadlock detection futures:
[13:37:21,840][INFO][sys-#158%Test Cluster%][diagnostic] Exchange future waiting for coordinator response [crd=c74ff028-1676-4f1a-8c95-563763ea5875, topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0]]
Remote node information:
General node info [id=c74ff028-1676-4f1a-8c95-563763ea5875, client=false, discoTopVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], time=13:37:21.812]
Partitions exchange info [readyVer=AffinityTopologyVersion [topVer=14, minorTopVer=0]]
Last initialized exchange future: GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=830bbef7-0344-4955-bdf6-ff90f6d96602, addrs=[127.0.0.1, 192.168.0.161], sockAddrs=[/127.0.0.1:47500, /192.168.0.161:47500], discPort=47500, order=15, intOrder=9, lastExchangeTime=1526060055205, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], topVer=15, nodeId8=c74ff028, msg=Node joined: TcpDiscoveryNode [id=830bbef7-0344-4955-bdf6-ff90f6d96602, addrs=[127.0.0.1, 192.168.0.161], sockAddrs=[/127.0.0.1:47500, /192.168.0.161:47500], discPort=47500, order=15, intOrder=9, lastExchangeTime=1526060055205, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], type=NODE_JOINED, tstamp=1526060060363], crd=TcpDiscoveryNode [id=c74ff028-1676-4f1a-8c95-563763ea5875, addrs=[127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/192.168.122.1:47500, /127.0.0.1:47500, centos_node_2/192.168.0.162:47500], discPort=47500, order=7, intOrder=5, lastExchangeTime=1526059855998, loc=true, ver=2.4.0#20180305-sha1:aa342270, isClient=false], exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=830bbef7-0344-4955-bdf6-ff90f6d96602, addrs=[127.0.0.1, 192.168.0.161], sockAddrs=[/127.0.0.1:47500, /192.168.0.161:47500], discPort=47500, order=15, intOrder=9, lastExchangeTime=1526060055205, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], topVer=15, nodeId8=c74ff028, msg=Node joined: TcpDiscoveryNode [id=830bbef7-0344-4955-bdf6-ff90f6d96602, addrs=[127.0.0.1, 192.168.0.161], sockAddrs=[/127.0.0.1:47500, /192.168.0.161:47500], discPort=47500, order=15, intOrder=9, lastExchangeTime=1526060055205, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], type=NODE_JOINED, tstamp=1526060060363], nodeId=830bbef7, evt=NODE_JOINED], added=true, initFut=GridFutureAdapter [ignoreInterrupts=false, state=DONE, res=true, hash=1568621067], init=true, 
lastVer=GridCacheVersion [topVer=0, order=1526059954164, nodeOrder=0], partReleaseFut=PartitionReleaseFuture [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], futures=[]], TxReleaseFuture [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], futures=[]], AtomicUpdateReleaseFuture [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], futures=[]], DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], futures=[]]]], exchActions=null, affChangeMsg=null, initTs=1526060227922, centralizedAff=false, changeGlobalStateE=null, done=false, state=CRD, evtLatch=0, remaining=[830bbef7-0344-4955-bdf6-ff90f6d96602], super=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null, hash=410898272]]
Communication SPI statistics [rmtNode=eec8ea18-ded1-42cd-aec7-2af754644008]
Communication SPI recovery descriptors: 
    [key=ConnectionKey [nodeId=eec8ea18-ded1-42cd-aec7-2af754644008, idx=0, connCnt=0], msgsSent=0, msgsAckedByRmt=0, msgsRcvd=2, lastAcked=0, reserveCnt=1, descIdHash=310748176]
Communication SPI clients: 
    [node=eec8ea18-ded1-42cd-aec7-2af754644008, client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=4, bytesRcvd=961, bytesSent=28, bytesRcvd0=853, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-4, igniteInstanceName=Test Cluster, finished=false, hashCode=474105904, interrupted=false, runner=grid-nio-worker-tcp-comm-4-#125%Test Cluster%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=2, sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], connected=true, connectCnt=0, queueLimit=4096, reserveCnt=1, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=2, sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], connected=true, connectCnt=0, queueLimit=4096, reserveCnt=1, pairedConnections=false], super=GridNioSessionImpl [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:59666, createTime=1526060121775, closeTime=0, bytesSent=28, bytesRcvd=961, bytesSent0=0, bytesRcvd0=853, sndSchedTime=1526060121775, lastSndTime=1526060121786, lastRcvTime=1526060241812, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.apache.ignite.internal.util.nio.GridDirectParser@3f6752aa, directMode=true], GridConnectionBytesVerifyFilter], accepted=true]], super=GridAbstractCommunicationClient [lastUsed=1526060121786, closed=false, connIdx=0]]]
NIO sessions statistics:
>> Selector info [idx=4, keysCnt=1, bytesRcvd=961, bytesRcvd0=853, bytesSent=28, bytesSent0=0]
    Connection info [in=true, rmtAddr=/127.0.0.1:59666, locAddr=/127.0.0.1:47100, msgsSent=0, msgsAckedByRmt=0, descIdHash=310748176, msgsRcvd=2, lastAcked=0, descIdHash=310748176, bytesRcvd=961, bytesRcvd0=853, bytesSent=28, bytesSent0=0, opQueueSize=0]
Exchange future: GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], topVer=16, nodeId8=c74ff028, msg=Node joined: TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], type=NODE_JOINED, tstamp=1526060116548], crd=null, exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], topVer=16, nodeId8=c74ff028, msg=Node joined: TcpDiscoveryNode [id=eec8ea18-ded1-42cd-aec7-2af754644008, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /192.168.122.1:0, centos_node_2/192.168.0.162:0], discPort=0, order=16, intOrder=10, lastExchangeTime=1526060116518, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=true], type=NODE_JOINED, tstamp=1526060116548], nodeId=eec8ea18, evt=NODE_JOINED], added=true, initFut=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null, hash=1818763044], init=false, lastVer=null, partReleaseFut=null, 
exchActions=null, affChangeMsg=null, initTs=0, centralizedAff=false, changeGlobalStateE=null, done=false, state=null, evtLatch=0, remaining=[], super=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null, hash=1650837648]]
Local communication statistics:
Communication SPI statistics [rmtNode=c74ff028-1676-4f1a-8c95-563763ea5875]
Communication SPI recovery descriptors: 
    [key=ConnectionKey [nodeId=c74ff028-1676-4f1a-8c95-563763ea5875, idx=0, connCnt=-1], msgsSent=2, msgsAckedByRmt=0, msgsRcvd=1, lastAcked=0, reserveCnt=1, descIdHash=1306648390]
Communication SPI clients: 
    [node=c74ff028-1676-4f1a-8c95-563763ea5875, client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=0, bytesRcvd=8421, bytesSent=919, bytesRcvd0=8421, bytesSent0=853, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=Test Cluster, finished=false, hashCode=1972519349, interrupted=false, runner=grid-nio-worker-tcp-comm-0-#121%Test Cluster%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=1, sentCnt=2, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=c74ff028-1676-4f1a-8c95-563763ea5875, addrs=[127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/192.168.122.1:47500, /127.0.0.1:47500, centos_node_2/192.168.0.162:47500], discPort=47500, order=7, intOrder=5, lastExchangeTime=1526060116544, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=1, sentCnt=2, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=c74ff028-1676-4f1a-8c95-563763ea5875, addrs=[127.0.0.1, 192.168.0.162, 192.168.122.1], sockAddrs=[/192.168.122.1:47500, /127.0.0.1:47500, centos_node_2/192.168.0.162:47500], discPort=47500, order=7, intOrder=5, lastExchangeTime=1526060116544, loc=false, ver=2.4.0#20180305-sha1:aa342270, isClient=false], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=1, pairedConnections=false], super=GridNioSessionImpl [locAddr=/127.0.0.1:59666, rmtAddr=/127.0.0.1:47100, createTime=1526060121782, closeTime=0, bytesSent=919, bytesRcvd=8421, bytesSent0=853, bytesRcvd0=8421, sndSchedTime=1526060121782, lastSndTime=1526060241815, lastRcvTime=1526060241815, readsPaused=false, 
filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.apache.ignite.internal.util.nio.GridDirectParser@619e0deb, directMode=true], GridConnectionBytesVerifyFilter], accepted=false]], super=GridAbstractCommunicationClient [lastUsed=1526060121792, closed=false, connIdx=0]]]
NIO sessions statistics:
>> Selector info [idx=0, keysCnt=1, bytesRcvd=8421, bytesRcvd0=8421, bytesSent=919, bytesSent0=853]
    Connection info [in=false, rmtAddr=/127.0.0.1:47100, locAddr=/127.0.0.1:59666, msgsSent=2, msgsAckedByRmt=0, descIdHash=1306648390, unackedMsgs=[GridDhtPartitionsSingleMessage, IgniteDiagnosticMessage], msgsRcvd=1, lastAcked=0, descIdHash=1306648390, bytesRcvd=8421, bytesRcvd0=8421, bytesSent=919, bytesSent0=853, opQueueSize=0]
[13:39:21,652][WARNING][main][GridCachePartitionExchangeManager] Failed to wait for initial partition map exchange. Possible reasons are: 
  ^-- Transactions in deadlock.
  ^-- Long running transactions (ignore if this is the case).
  ^-- Unreleased explicit locks.
[13:39:21,817][WARNING][exchange-worker-#157%Test Cluster%][diagnostic] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], node=eec8ea18-ded1-42cd-aec7-2af754644008]. Dumping pending objects that might be the cause: 
[13:40:43,347][INFO][sys-#159%Test Cluster%][GridDhtPartitionsExchangeFuture] Received full message, will finish exchange [node=c74ff028-1676-4f1a-8c95-563763ea5875, resVer=AffinityTopologyVersion [topVer=16, minorTopVer=0]]
[13:40:43,354][INFO][sys-#159%Test Cluster%][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=16, minorTopVer=0], err=null]
[13:40:43,395][INFO][main][IgniteKernal%Test Cluster] Performance suggestions for grid 'Test Cluster' (fix if possible)
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster]   ^-- Enable G1 Garbage Collector (add '-XX:+UseG1GC' to JVM options)
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster]   ^-- Specify JVM heap max size (add '-Xmx<size>[g|G|m|M|k|K]' to JVM options)
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster]   ^-- Set max direct memory size if getting 'OOME: Direct buffer memory' (add '-XX:MaxDirectMemorySize=<size>[g|G|m|M|k|K]' to JVM options)
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster]   ^-- Disable processing of calls to System.gc() (add '-XX:+DisableExplicitGC' to JVM options)
[13:40:43,396][INFO][main][IgniteKernal%Test Cluster]   ^-- Speed up flushing of dirty pages by OS (alter vm.dirty_expire_centisecs parameter by setting to 500)
[13:40:43,397][INFO][main][IgniteKernal%Test Cluster]   ^-- Reduce pages swapping ratio (set vm.swappiness=10)
[13:40:43,397][INFO][main][IgniteKernal%Test Cluster] Refer to this page for more performance suggestions: https://apacheignite.readme.io/docs/jvm-and-system-tuning
[13:40:43,397][INFO][main][IgniteKernal%Test Cluster] 
[13:40:43,397][INFO][main][IgniteKernal%Test Cluster] To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
[13:40:43,398][INFO][main][IgniteKernal%Test Cluster] 
[13:40:43,401][INFO][grid-nio-worker-tcp-comm-1-#122%Test Cluster%][TcpCommunicationSpi] Established outgoing communication connection [locAddr=/192.168.0.162:40742, rmtAddr=/192.168.0.161:47100]
[13:40:43,403][INFO][main][IgniteKernal%Test Cluster] 

>>> +----------------------------------------------------------------------+
>>> Ignite ver. 2.4.0#20180305-sha1:aa342270b13cc1f4713382a8eb23b2eb7edaa3a5
>>> +----------------------------------------------------------------------+
>>> OS name: Linux 3.10.0-693.el7.x86_64 amd64
>>> CPU(s): 56
>>> Heap: 6.9GB
>>> VM name: 78579@centos_node_2
>>> Ignite instance name: Test Cluster
>>> Local node [ID=EEC8EA18-DED1-42CD-AEC7-2AF754644008, order=16, clientMode=true]
>>> Local node addresses: [centos_node_2/0:0:0:0:0:0:0:1%lo, centos_node_2/127.0.0.1, /192.168.0.162, /192.168.122.1]
>>> Local ports: TCP:10801 TCP:47101 

[13:40:43,406][INFO][main][GridDiscoveryManager] Topology snapshot [ver=16, servers=3, clients=1, CPUs=168, offheap=16.0GB, heap=19.0GB]
[13:40:43,406][INFO][main][GridDiscoveryManager] Data Regions Configured:
[13:40:43,406][INFO][main][GridDiscoveryManager]   ^-- default [initSize=4.0 GiB, maxSize=4.0 GiB, persistenceEnabled=false]
[13:40:43,413][INFO][main][GridDeploymentLocalStore] Class locally deployed: class TestCluster$1
[13:40:45,026][INFO][exchange-worker-#157%Test Cluster%][time] Started exchange init [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], crd=false, evt=DISCOVERY_CUSTOM_EVT, evtNode=c74ff028-1676-4f1a-8c95-563763ea5875, customEvt=CacheAffinityChangeMessage [id=c6771405361-ef621a9a-86e4-426a-958d-c53f0d9c0e25, topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], exchId=null, partsMsg=null, exchangeNeeded=true], allowMerge=false]
[13:40:45,028][INFO][exchange-worker-#157%Test Cluster%][time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], crd=false]
[13:40:45,037][INFO][sys-#165%Test Cluster%][GridDhtPartitionsExchangeFuture] Received full message, will finish exchange [node=c74ff028-1676-4f1a-8c95-563763ea5875, resVer=AffinityTopologyVersion [topVer=16, minorTopVer=1]]
[13:40:45,039][INFO][sys-#165%Test Cluster%][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], resVer=AffinityTopologyVersion [topVer=16, minorTopVer=1], err=null]
[13:40:48,545][INFO][grid-nio-worker-tcp-comm-2-#123%Test Cluster%][TcpCommunicationSpi] Established outgoing communication connection [locAddr=/192.168.0.162:35242, rmtAddr=/192.168.0.4:47100]
[13:40:48,597][INFO][main][GridDeploymentLocalStore] Class locally deployed: class TestCluster$2
[13:40:48,676][INFO][main][GridCacheProcessor] Stopped cache [cacheName=ignite-sys-cache]
[13:40:48,678][INFO][main][GridDeploymentLocalStore] Removed undeployed class: GridDeployment [ts=1526060443326, depMode=SHARED, clsLdr=sun.misc.Launcher$AppClassLoader@330bedb4, clsLdrId=85655405361-eec8ea18-ded1-42cd-aec7-2af754644008, userVer=0, loc=true, sampleClsName=org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionFullMap, pendingUndeploy=false, undeployed=true, usage=0]
[13:40:48,684][INFO][main][IgniteKernal%Test Cluster] 

>>> +---------------------------------------------------------------------------------+
>>> Ignite ver. 2.4.0#20180305-sha1:aa342270b13cc1f4713382a8eb23b2eb7edaa3a5 stopped OK
>>> +---------------------------------------------------------------------------------+
>>> Ignite instance name: Test Cluster
>>> Grid uptime: 00:00:05.289

The cluster configuration is this:

<?xml version="1.0" encoding="UTF-8"?>

<!-- This file was generated by Ignite Web Console (05/11/2018, 23:29) -->

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd
                           http://www.springframework.org/schema/util
                           http://www.springframework.org/schema/util/spring-util.xsd">
    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="igniteInstanceName" value="Test Cluster"/>

        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="ipFinder">
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                        <property name="addresses">
                            <list>
                                <value>192.168.0.4:47500..47510</value>
                                <value>192.168.0.161:47500..47510</value>
                                <value>192.168.0.162:47500..47510</value>
                            </list>
                        </property>
                    </bean>
                </property>

                <property name="ackTimeout" value="50000"/>
            </bean>
        </property>

        <property name="communicationSpi">
            <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
                <property name="connectTimeout" value="600000"/>
            </bean>
        </property>

        <property name="networkTimeout" value="60000"/>
        <property name="networkSendRetryCount" value="10"/>

        <property name="dataStorageConfiguration">
            <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
                <property name="defaultDataRegionConfiguration">
                    <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                        <property name="initialSize" value="4294967296"/>
                        <property name="maxSize" value="4294967296"/>
                    </bean>
                </property>
            </bean>
        </property>

        <property name="peerClassLoadingEnabled" value="true"/>
        <property name="eventStorageSpi">
            <bean class="org.apache.ignite.spi.eventstorage.memory.MemoryEventStorageSpi">
            </bean>
        </property>
        <property name="failureDetectionTimeout" value="100000"/>
        <property name="clientFailureDetectionTimeout" value="100000"/>
    </bean>
</beans>

Why does the client node take so long to connect? And why only sometimes?

Thanks for your help.

Edit: warnings during startup:

07:19:46.910 [main][1] WARN  org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi-[warning] Failure detection timeout will be ignored (one of SPI parameters has been set explicitly)
07:20:06.953 [main][1] WARN  org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi-[warning] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
07:20:06.977 [main][1] WARN  org.apache.ignite.spi.checkpoint.noop.NoopCheckpointSpi-[warning] Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
07:20:07.012 [main][1] WARN  org.apache.ignite.internal.managers.collision.GridCollisionManager-[warning] Collision resolution is disabled (all jobs will be activated upon arrival).
07:20:22.373 [main][1] WARN  org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi-[warning] Failure detection timeout will be ignored (one of SPI parameters has been set explicitly)
07:20:47.527 [main][1] WARN  org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi-[warning] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]

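When a new node joins the cluster, it has to wait for the cluster's current operations to complete before the new cluster topology can be registered. Note the following warning: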

[13:39:21,652][WARNING][main][GridCachePartitionExchangeManager] Failed to wait for initial partition map exchange. Possible reasons are: 
  ^-- Transactions in deadlock.
  ^-- Long running transactions (ignore if this is the case).
  ^-- Unreleased explicit locks.

Most likely you have a long-running transaction or an unreleased lock.
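If long-running transactions do turn out to be the cause, one way to keep them from blocking the exchange indefinitely might be a default transaction timeout. The fragment below is only a sketch: it would sit inside the existing `IgniteConfiguration` bean from the question, and the 30-second value is an assumption, not a recommendation.

```xml
<!-- Sketch: place inside the existing IgniteConfiguration bean. -->
<property name="transactionConfiguration">
    <bean class="org.apache.ignite.configuration.TransactionConfiguration">
        <!-- Assumed value (30 s): transactions running longer than this time out
             instead of staying open without a limit. Tune it for your workload. -->
        <property name="defaultTxTimeout" value="30000"/>
    </bean>
</property>
```

Explicit locks are not covered by this setting; those still need to be released in the application code, for example in a `finally` block.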

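If you are not actually doing anything in the cluster, the problem is almost certainly related to network issues and network configuration. I would try reducing the timeouts and see if that helps.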

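For example, you have ackTimeout=50000. This means that after the client sends a message to the server, it waits up to 50 seconds for an acknowledgement. If the message is lost, it is retried only after those 50 seconds, so a single network glitch costs you almost a minute. Reducing the timeouts to lower values should help on a network that is relatively fast but unstable.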
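As a rough sketch of what lower timeouts could look like, the two SPI sections of the configuration from the question might be changed as follows. The 5000 ms values are assumptions chosen for illustration and would need tuning against the actual network. Note also that the startup warnings in the question say "Failure detection timeout will be ignored (one of SPI parameters has been set explicitly)", so while `ackTimeout` and `connectTimeout` are set explicitly, the configured `failureDetectionTimeout` does not apply to these SPIs.

```xml
<!-- Sketch: replaces the discoverySpi and communicationSpi properties
     inside the existing IgniteConfiguration bean. -->
<property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder">
            <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                <property name="addresses">
                    <list>
                        <value>192.168.0.4:47500..47510</value>
                        <value>192.168.0.161:47500..47510</value>
                        <value>192.168.0.162:47500..47510</value>
                    </list>
                </property>
            </bean>
        </property>

        <!-- Was 50000. Assumed value: a lost message is retried after 5 s instead of 50 s. -->
        <property name="ackTimeout" value="5000"/>
    </bean>
</property>

<property name="communicationSpi">
    <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
        <!-- Was 600000 (10 minutes). Assumed value: give up on a connection attempt after 5 s. -->
        <property name="connectTimeout" value="5000"/>
    </bean>
</property>
```

An alternative would be to drop the explicit `ackTimeout` and `connectTimeout` properties altogether and rely on a smaller `failureDetectionTimeout`, since the warning suggests that setting is otherwise ignored.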

