Ignite cluster repeatedly failing

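Our Ignite servers are showing the error below.

We do not run any queries or Ignite tasks; we only use the cache put and get operations.

We do have persistence enabled.

We connected Ignite Visor, and it shows nothing abnormal in the cluster topology or in any of the caches.

Any insight into this error would be much appreciated.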

[15:01:03,943][SEVERE][sys-stripe-4-#5][FailureProcessor] A critical problem with persistence data structures was detected. Please make backup of persistence storage and WAL files for further analysis. Persistence storage path: null WAL path: /ignite/wal WAL archive path: /ignite/walarchive

[15:01:03,946][SEVERE][sys-stripe-4-#5][FailureProcessor] No deadlocked threads detected.
....

[15:01:04,135][SEVERE][sys-stripe-4-#5][] JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[IgniteBiTuple [val1=-2113909708, val2=1125985806188550]], msg=Runtime failure on search row: SearchRow [key=KeyCacheObjectImpl [part=20, val=2AC8D168CC86654C1879E6999D1CCFD7, hasValBytes=true], hash=741925420, cacheId=0]]

Edit: server-side stack trace:

[18:40:50,919][SEVERE][sys-stripe-4-#5][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[IgniteBiTuple [val1=-2113909708, val2=1125985806188550]], msg=Runtime failure on search row: SearchRow [key=KeyCacheObjectImpl [part=20, val=2AC8D168CC86654C1879E6999D1CCFD7, hasValBytes=true], hash=741925420, cacheId=0]]]]
class org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[IgniteBiTuple [val1=-2113909708, val2=1125985806188550]], msg=Runtime failure on search row: SearchRow [key=KeyCacheObjectImpl [part=20, val=2AC8D168CC86654C1879E6999D1CCFD7, hasValBytes=true], hash=741925420, cacheId=0]]
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.corruptedTreeException(BPlusTree.java:6139)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1953)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1765)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1748)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:2794)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:441)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2342)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2589)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:2049)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1866)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1725)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3228)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:143)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:284)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:279)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1151)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:592)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:393)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:319)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:110)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:309)
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1908)
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1529)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$5300(GridIoManager.java:242)
    at org.apache.ignite.internal.managers.communication.GridIoManager$9.execute(GridIoManager.java:1422)
    at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:55)
    at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:569)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Record is too long [capacity=134217728, size=134219738]
    at org.apache.ignite.internal.processors.cache.persistence.wal.SegmentedRingByteBuffer.offer0(SegmentedRingByteBuffer.java:214)
    at org.apache.ignite.internal.processors.cache.persistence.wal.SegmentedRingByteBuffer.offer(SegmentedRingByteBuffer.java:193)
    at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FileWriteHandleImpl.addRecord(FileWriteHandleImpl.java:243)
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:858)
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:811)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.logUpdate(GridCacheMapEntry.java:4338)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.update(GridCacheMapEntry.java:6492)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:6244)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:5928)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:4021)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.access$5700(BPlusTree.java:3915)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:2042)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1920)
    ... 27 more

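The problem is caused by inserting a record that is larger than the actual WAL buffer size.

Ignite uses a WAL buffer to hold serialized WAL records before writing them to the WAL file, so each WAL record must be smaller than the actual WAL buffer size. In the stack trace above, the record is 134219738 bytes while the buffer capacity is 134217728 bytes (128 MiB).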
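To make the numbers in the exception concrete, here is a minimal sketch of the size check in plain Java. It is a simplified model, not Ignite's actual code: the class and method names are hypothetical, and treating a record exactly equal to the capacity as fitting is an assumption.

```java
public class WalRecordCheck {
    /**
     * Simplified model of the buffer check: a record fits only if it is
     * no larger than the buffer capacity. The exact boundary condition
     * used by Ignite is an assumption here.
     */
    public static boolean fits(long bufferCapacity, long recordSize) {
        return recordSize <= bufferCapacity;
    }

    /** Number of bytes by which the record exceeds the buffer (0 if it fits). */
    public static long overflowBytes(long bufferCapacity, long recordSize) {
        return Math.max(0L, recordSize - bufferCapacity);
    }

    public static void main(String[] args) {
        long capacity = 134_217_728L; // "capacity" from the exception, 128 MiB
        long size = 134_219_738L;     // "size" from the exception

        System.out.println("fits: " + fits(capacity, size));                    // fits: false
        System.out.println("overflow: " + overflowBytes(capacity, size) + " B"); // overflow: 2010 B
    }
}
```

The record overshoots the buffer by only 2010 bytes, which suggests a cache value of roughly 128 MiB plus a small amount of record overhead.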

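By default, the WAL buffer size is equal to the configured WAL segment size property. However, if IGNITE_WAL_MMAP is disabled, the WAL buffer size is limited by the WAL buffer size property, whose default value is WAL segment size / 4.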
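The rule in the previous paragraph can be sketched as a small helper. This is an illustration of the described behavior only; the class and method are hypothetical, and using a non-positive value to mean "WAL buffer size not configured" is an assumption of this sketch.

```java
public class WalBufferSize {
    /**
     * Effective WAL buffer size as described above:
     * - with IGNITE_WAL_MMAP enabled, the buffer equals the WAL segment size;
     * - with it disabled, the configured WAL buffer size applies,
     *   defaulting to segment size / 4 when not configured.
     *
     * A configuredBufferSize <= 0 stands for "not configured" (assumption).
     */
    public static long effective(long walSegmentSize, long configuredBufferSize, boolean mmapEnabled) {
        if (mmapEnabled) {
            return walSegmentSize;
        }
        return configuredBufferSize > 0 ? configuredBufferSize : walSegmentSize / 4;
    }

    public static void main(String[] args) {
        long segment = 128L * 1024 * 1024; // 128 MiB, matching the capacity in the exception

        System.out.println(effective(segment, 0, true));  // mmap on: whole segment
        System.out.println(effective(segment, 0, false)); // mmap off, default: segment / 4
    }
}
```

Under this rule, the 128 MiB capacity reported in the exception is consistent with a 128 MiB WAL segment and IGNITE_WAL_MMAP left enabled, although the question does not show the actual configuration.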

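As a workaround, you can try increasing the WAL buffer size using the properties mentioned above. See the link in the original answer for more details.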
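A configuration sketch of that workaround, using the Java `DataStorageConfiguration` API, might look like the following. The 256 MiB values are illustrative assumptions, chosen only because they exceed the 134219738-byte record in the trace. The class name is hypothetical, and the question does not show how the asker's nodes are actually configured. Check the limits your Ignite version places on these properties before relying on it.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class LargerWalBufferNode {
    public static void main(String[] args) {
        int size = 256 * 1024 * 1024; // 256 MiB, an illustrative value (assumption)

        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Persistence is enabled, as stated in the question.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // With IGNITE_WAL_MMAP enabled, the buffer follows the segment size.
        storageCfg.setWalSegmentSize(size);

        // With IGNITE_WAL_MMAP disabled, this property limits the buffer instead.
        storageCfg.setWalBufferSize(size);

        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(storageCfg);

        // Start a server node with the adjusted WAL settings.
        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("Node started: " + ignite.name());
        }
    }
}
```

Setting both properties covers either state of IGNITE_WAL_MMAP. Larger segments and buffers increase disk and memory use per node, so the value is best sized to the largest cache entry you expect to write.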
