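We have been seeing frequent restarts of our Ignite nodes over the last month or so. The relevant Ignite log excerpt is included below. It is evident that the JVM runs out of memory (see "Heap used"), CPU utilization climbs, and JVM pauses get longer, which causes the node to lose connectivity with the other nodes in the cluster. This seems to trigger a restart.
No value has been set explicitly for setOnheapCacheEnabled. As we understand it, the default is false. Despite this, heap usage keeps increasing and the node restarts.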
[11:20:04,287][INFO][grid-timeout-worker-#38%ImportedCluster%][IgniteKernal%ImportedCluster]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Node [id=a66811bb, name=ImportedCluster, uptime=2 days, 20:58:22.594]
^-- Cluster [hosts=19, CPUs=89, servers=3, clients=16, topVer=31, minorTopVer=0]
^-- Network [addrs=[100.96.17.220, 127.0.0.1], discoPort=47500, commPort=47100]
^-- CPU [CPUs=16, curLoad=58.57%, avgLoad=2.84%, GC=5.6%]
^-- Heap [used=7948MB, free=1.11%, comm=8038MB]
^-- Outbound messages queue [size=176631]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=1, idle=15, qSize=0]
^-- Striped thread pool [active=14, idle=2, qSize=1090]
[11:20:04,287][INFO][grid-timeout-worker-#38%ImportedCluster%][IgniteKernal%ImportedCluster] FreeList [name=default##FreeList, buckets=256, dataPages=53711, reusePages=5711]
[11:20:04,287][INFO][grid-timeout-worker-#38%ImportedCluster%][IgniteKernal%ImportedCluster]
^-- sysMemPlc region [type=internal, persistence=false, lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=99.21%, allocRam=40MB]
^-- default region [type=default, persistence=false, lazyAlloc=true,
... initCfg=256MB, maxCfg=6429MB, usedRam=662MB, freeRam=89.7%, allocRam=667MB]
^-- TxLog region [type=internal, persistence=false, lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=40MB]
^-- volatileDsMemPlc region [type=user, persistence=false, lazyAlloc=true,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
[11:20:04,287][INFO][grid-timeout-worker-#38%ImportedCluster%][IgniteKernal%ImportedCluster]
Data storage metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Off-heap memory [used=663MB, free=90.15%, allocated=747MB]
^-- Page memory [pages=168760]
[11:20:04,935][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 600 milliseconds.
[11:20:05,600][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 615 milliseconds.
[11:20:06,313][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 663 milliseconds.
[11:20:06,953][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 590 milliseconds.
[11:20:07,671][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 668 milliseconds.
[11:20:08,406][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 684 milliseconds.
[11:20:09,100][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 645 milliseconds.
[11:20:09,728][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 578 milliseconds.
[11:20:10,374][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 596 milliseconds.
[11:20:11,182][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 757 milliseconds.
[11:20:12,013][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 782 milliseconds.
[11:20:12,664][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 600 milliseconds.
[11:20:13,293][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 580 milliseconds.
[11:20:13,861][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 518 milliseconds.
[11:20:14,419][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 508 milliseconds.
[11:20:15,025][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 556 milliseconds.
[11:20:15,639][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 564 milliseconds.
[11:20:16,242][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 553 milliseconds.
[11:20:16,949][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 657 milliseconds.
[11:20:17,574][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 575 milliseconds.
[11:20:18,181][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 557 milliseconds.
[11:20:19,340][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 578 milliseconds.
[11:20:19,982][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 592 milliseconds.
[11:20:20,602][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 570 milliseconds.
[11:20:21,348][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 696 milliseconds.
[11:20:22,049][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 651 milliseconds.
[11:20:22,698][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 599 milliseconds.
[11:20:23,250][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 502 milliseconds.
[11:20:23,868][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 568 milliseconds.
[11:20:24,536][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 618 milliseconds.
[11:20:25,218][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 632 milliseconds.
[11:20:25,821][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 553 milliseconds.
[11:20:26,432][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 561 milliseconds.
[11:20:27,067][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 581 milliseconds.
[11:20:28,214][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 1101 milliseconds.
[11:20:29,330][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 519 milliseconds.
[11:20:29,970][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 590 milliseconds.
[11:20:30,623][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 603 milliseconds.
[11:20:31,309][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 636 milliseconds.
[11:20:31,863][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 504 milliseconds.
[11:20:32,987][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 553 milliseconds.
[11:20:33,619][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 582 milliseconds.
[11:20:34,196][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 527 milliseconds.
[11:20:35,441][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 584 milliseconds.
[11:20:36,035][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 1155 milliseconds.
[11:20:36,036][SEVERE][sys-stripe-0-#1%ImportedCluster%][G] Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [workerName=ttl-cleanup-worker, threadName=ttl-cleanup-worker-#103%ImportedCluster%, blockedFor=14s]
[11:20:36,037][WARNING][sys-stripe-0-#1%ImportedCluster%][] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=ttl-cleanup-worker, igniteInstanceName=ImportedCluster, finished=false, heartbeatTs=1731304221348]]]
class org.apache.ignite.IgniteException: GridWorker [name=ttl-cleanup-worker, igniteInstanceName=ImportedCluster, finished=false, heartbeatTs=1731304221348]
at app//org.apache.ignite.internal.binary.streams.BinaryAbstractInputStream.readByteArray(BinaryAbstractInputStream.java:44)
at app//org.apache.ignite.internal.binary.BinaryUtils.doReadByteArray(BinaryUtils.java:1241)
at app//org.apache.ignite.internal.binary.BinaryUtils.unmarshal(BinaryUtils.java:1995)
at app//org.apache.ignite.internal.binary.BinaryUtils.unmarshal(BinaryUtils.java:1889)
at app//org.apache.ignite.internal.binary.BinaryUtils.unmarshal(BinaryUtils.java:1880)
at app//org.apache.ignite.internal.binary.BinaryUtils.unmarshal(BinaryUtils.java:1871)
at app//org.apache.ignite.internal.binary.BinaryObjectImpl.fieldByOrder(BinaryObjectImpl.java:572)
at app//org.apache.ignite.internal.binary.BinaryFieldImpl.value(BinaryFieldImpl.java:112)
at app//org.apache.ignite.internal.processors.query.property.QueryBinaryProperty.fieldValue(QueryBinaryProperty.java:229)
at app//org.apache.ignite.internal.processors.query.property.QueryBinaryProperty.value(QueryBinaryProperty.java:121)
at app//org.apache.ignite.internal.cache.query.index.sorted.QueryIndexRowHandler.indexKey(QueryIndexRowHandler.java:93)
at app//org.apache.ignite.internal.cache.query.index.sorted.IndexRowImpl.key(IndexRowImpl.java:67)
at app//org.apache.ignite.internal.cache.query.index.sorted.IndexRowComparatorImpl.compareRow(IndexRowComparatorImpl.java:74)
at app//org.apache.ignite.internal.cache.query.index.sorted.inline.InlineIndexTree.compareFullRows(InlineIndexTree.java:340)
at app//org.apache.ignite.internal.cache.query.index.sorted.inline.InlineIndexTree.compare(InlineIndexTree.java:320)
at app//org.apache.ignite.internal.cache.query.index.sorted.inline.InlineIndexTree.compare(InlineIndexTree.java:80)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.compare(BPlusTree.java:5746)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findInsertionPoint(BPlusTree.java:5666)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$1100(BPlusTree.java:215)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Search.run0(BPlusTree.java:422)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$GetPageHandler.run(BPlusTree.java:6298)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Search.run(BPlusTree.java:402)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$GetPageHandler.run(BPlusTree.java:6284)
at app//org.apache.ignite.internal.cache.query.index.sorted.inline.InlineIndexTree$1$1.run(InlineIndexTree.java:713)
at app//org.apache.ignite.internal.cache.query.index.sorted.inline.InlineIndexTree$1$1.run(InlineIndexTree.java:700)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.readPage(PageHandler.java:174)
at app//org.apache.ignite.internal.processors.cache.persistence.DataStructure.read(DataStructure.java:415)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.read(BPlusTree.java:6503)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.removeDown(BPlusTree.java:2411)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.removeDown(BPlusTree.java:2430)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doRemove(BPlusTree.java:2338)
at app//org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.removex(BPlusTree.java:2145)
at app//org.apache.ignite.internal.cache.query.index.sorted.inline.InlineIndexImpl.remove(InlineIndexImpl.java:406)
at app//org.apache.ignite.internal.cache.query.index.sorted.inline.InlineIndexImpl.onUpdate(InlineIndexImpl.java:359)
at app//org.apache.ignite.internal.cache.query.index.IndexProcessor.updateIndex(IndexProcessor.java:465)
at app//org.apache.ignite.internal.cache.query.index.IndexProcessor.updateIndexes(IndexProcessor.java:308)
at app//org.apache.ignite.internal.cache.query.index.IndexProcessor.remove(IndexProcessor.java:202)
at app//org.apache.ignite.internal.processors.query.GridQueryProcessor.remove(GridQueryProcessor.java:3659)
at app//org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.remove(GridCacheQueryManager.java:453)
at app//org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.finishRemove(IgniteCacheOffheapManagerImpl.java:2693)
at app//org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.remove(IgniteCacheOffheapManagerImpl.java:2670)
at app//org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.remove(IgniteCacheOffheapManagerImpl.java:598)
at app//org.apache.ignite.internal.processors.cache.GridCacheMapEntry.removeValue(GridCacheMapEntry.java:3947)
at app//org.apache.ignite.internal.processors.cache.GridCacheMapEntry.onExpired(GridCacheMapEntry.java:3638)
at app//org.apache.ignite.internal.processors.cache.GridCacheMapEntry.onTtlExpired(GridCacheMapEntry.java:3561)
at app//org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:79)
at app//org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:70)
at app//org.apache.ignite.internal.util.lang.IgniteInClosure2X.apply(IgniteInClosure2X.java:38)
at app//org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.expireInternal(IgniteCacheOffheapManagerImpl.java:1356)
at app//org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.expire(IgniteCacheOffheapManagerImpl.java:1313)
at app//org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:246)
at app//org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.lambda$body$0(GridCacheSharedTtlCleanupManager.java:199)
at app//org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker$$Lambda$1487/0x000000080082f440.apply(Unknown Source)
at [email protected]/java.util.concurrent.ConcurrentHashMap.computeIfPresent(Unknown Source)
at app//org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:198)
at app//org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
at [email protected]/java.lang.Thread.run(Unknown Source)
[11:20:36,633][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 548 milliseconds.
[11:20:37,234][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 551 milliseconds.
[11:20:37,235][WARNING][sys-stripe-0-#1%ImportedCluster%][FailureProcessor] No deadlocked threads detected.
[11:20:38,232][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 604 milliseconds.
[11:20:38,841][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 559 milliseconds.
[11:20:40,058][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 1163 milliseconds.
[11:20:40,638][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 534 milliseconds.
[11:20:41,222][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 534 milliseconds.
[11:20:41,849][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 577 milliseconds.
[11:20:42,467][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 568 milliseconds.
[11:20:43,688][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 1171 milliseconds.
[11:20:44,267][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 529 milliseconds.
[11:20:44,868][WARNING][sys-stripe-0-#1%ImportedCluster%][FailureProcessor] Thread dump at 2024/11/11 11:20:41 IST
...
This is followed by the thread dump
...
[11:22:16,476][INFO][grid-timeout-worker-#38%ImportedCluster%][IgniteKernal%ImportedCluster]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Node [id=a66811bb, name=ImportedCluster, uptime=2 days, 21:00:33.661]
^-- Cluster [hosts=19, CPUs=89, servers=3, clients=16, topVer=31, minorTopVer=0]
^-- Network [addrs=[100.96.17.220, 127.0.0.1], discoPort=47500, commPort=47100]
^-- CPU [CPUs=16, curLoad=100%, avgLoad=3%, GC=44.27%]
^-- Heap [used=7967MB, free=0.88%, comm=8038MB]
^-- Outbound messages queue [size=176559]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=2, idle=14, qSize=0]
^-- Striped thread pool [active=13, idle=3, qSize=993]
[11:22:16,476][INFO][grid-timeout-worker-#38%ImportedCluster%][IgniteKernal%ImportedCluster] FreeList [name=default##FreeList, buckets=256, dataPages=53581, reusePages=7785]
[11:22:17,019][INFO][grid-timeout-worker-#38%ImportedCluster%][IgniteKernal%ImportedCluster]
^-- sysMemPlc region [type=internal, persistence=false, lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=99.21%, allocRam=40MB]
^-- default region [type=default, persistence=false, lazyAlloc=true,
... initCfg=256MB, maxCfg=6429MB, usedRam=662MB, freeRam=89.7%, allocRam=667MB]
^-- TxLog region [type=internal, persistence=false, lazyAlloc=false,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=40MB]
^-- volatileDsMemPlc region [type=user, persistence=false, lazyAlloc=true,
... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
[11:22:17,019][INFO][grid-timeout-worker-#38%ImportedCluster%][IgniteKernal%ImportedCluster]
Data storage metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Off-heap memory [used=663MB, free=90.15%, allocated=747MB]
^-- Page memory [pages=168760]
[11:22:17,591][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 2189 milliseconds.
[11:22:17,596][WARNING][tcp-disco-msg-worker-[3628f955 100.96.11.124:47500 crd]-#2%ImportedCluster%-#68%ImportedCluster%][FailureProcessor] Thread dump at 2024/11/11 11:22:15 IST
This is followed by a thread dump
...
[11:22:19,268][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 1094 milliseconds.
[11:22:21,500][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 2715 milliseconds.
[11:22:22,086][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 536 milliseconds.
[11:22:23,693][WARNING][jvm-pause-detector-worker][IgniteKernal%ImportedCluster] Possible too long JVM pause: 1557 milliseconds.
[11:22:23,694][WARNING][tcp-disco-msg-worker-[3628f955 100.96.11.124:47500 crd]-#2%ImportedCluster%-#68%ImportedCluster%][CacheDiagnosticManager] Page locks dump:
Thread=[name=sys-stripe-0-#1%ImportedCluster%, id=31], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-0-#1%ImportedCluster% time=(1731304339806, 2024-11-11 11:22:19.806)
-> Try Read lock nextOpPageId=844420635164855, nextOpStructureId=AuthDetailsCache##PendingEntries [pageIdHex=0002ffff000000b7, partId=65535, pageIdx=183, flags=00000002]
Thread=[name=sys-stripe-5-#6%ImportedCluster%, id=36], state=RUNNABLE
Locked pages = [5693394349899988095[4f02ffff10013c7f](r=1|w=0)]
Locked pages log: name=sys-stripe-5-#6%ImportedCluster% time=(1731304339806, 2024-11-11 11:22:19.806)
L=1 -> Read lock pageId=5693394349899988095, structureId=-863187587_MESSAGE_DATA_RAW_DATA##H2Tree [pageIdHex=4f02ffff10013c7f, partId=65535, pageIdx=268516479, flags=00000002]
Thread=[name=sys-stripe-6-#7%ImportedCluster%, id=37], state=RUNNABLE
Locked pages = [844420635164855[0002ffff000000b7](r=0|w=1)]
Locked pages log: name=sys-stripe-6-#7%ImportedCluster% time=(1731304339806, 2024-11-11 11:22:19.806)
L=1 -> Write lock pageId=844420635164855, structureId=AuthDetailsCache##PendingEntries [pageIdHex=0002ffff000000b7, partId=65535, pageIdx=183, flags=00000002]
Thread=[name=ttl-cleanup-worker-#103%ImportedCluster%, id=161], state=RUNNABLE
Locked pages = [844420903612501[0002ffff10003055](r=1|w=0)]
Locked pages log: name=ttl-cleanup-worker-#103%ImportedCluster% time=(1731304339806, 2024-11-11 11:22:19.806)
L=1 -> Read lock pageId=844420903612501, structureId=dupCheckCache-p-227##CacheData [pageIdHex=0002ffff10003055, partId=65535, pageIdx=268447829, flags=00000002]
And then the JVM restarts.
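The question is why heap utilization is so high even though all caches are off-heap. The number of entries in the caches has very likely increased significantly over this period. Each entry carries a BLOB of about 6KB. It is hard to say how many entries there are.
No heap dump was generated, because no OutOfMemoryError was thrown and the node restarted before that could happen.
No Xmx is configured for this node.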
While your data is stored off-heap, Ignite is still a Java application and needs heap space to work. 8GB across 19 nodes is certainly not enough for most use cases.
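Since no Xmx is configured, the JVM chooses the maximum heap size itself (on HotSpot this is typically derived from the machine's physical memory). As a rough way to see what limit a JVM is actually running with, here is a minimal sketch that uses only the standard java.lang.management API. The class name HeapReport and the helper freePercent are made up for illustration and are not part of Ignite; the output only loosely imitates the "Heap" line in the metrics above, and the code reports on the JVM it runs in.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapReport {
    /** Percentage of the maximum heap that is still free, given used and max in bytes. */
    public static double freePercent(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        return 100.0 * (maxBytes - usedBytes) / maxBytes;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long mb = 1024L * 1024L;

        // getMax() returns -1 when the maximum is undefined; fall back to Runtime.maxMemory().
        long max = heap.getMax() > 0 ? heap.getMax() : Runtime.getRuntime().maxMemory();

        System.out.printf("Heap [used=%dMB, comm=%dMB, max=%dMB, free=%.2f%%]%n",
            heap.getUsed() / mb, heap.getCommitted() / mb, max / mb,
            freePercent(heap.getUsed(), max));
    }
}
```

If the reported maximum is close to the committed size seen in the logs, the node is running on the JVM's default limit rather than one sized for the workload.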
If you start your server with ignite.sh, you can configure the JVM through the JVM_OPTS environment variable. The defaults suggested in the documentation are:
-server
-Xms10g
-Xmx10g
-XX:+AlwaysPreTouch
-XX:+UseG1GC
-XX:+ScavengeBeforeFullGC
-XX:+DisableExplicitGC
You are not using persistence, so 10GB per server node may be too much. I would suggest starting at around 4GB.
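After changing the options, it can be worth confirming that they actually reached the JVM. The following is a minimal sketch, again using only the standard java.lang.management API; JvmFlagCheck and maxHeapFlag are hypothetical names, and the helper simply returns the last -Xmx value it finds in the argument list.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

class JvmFlagCheck {
    /** Returns the value of the last -Xmx argument (for example "4g"), if one was passed. */
    public static Optional<String> maxHeapFlag(List<String> jvmArgs) {
        Optional<String> result = Optional.empty();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-Xmx")) {
                result = Optional.of(arg.substring("-Xmx".length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Arguments the JVM was started with, excluding the arguments passed to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("-Xmx: " + maxHeapFlag(jvmArgs).orElse("not set (JVM default applies)"));
    }
}
```

Run inside the node's JVM, this prints the startup arguments; if -Xmx is reported as not set, the JVM_OPTS value did not take effect.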