Neo4j import tool - OutOfMemory error: GC overhead limit exceeded

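I am using the neo4j-import tool (on Windows) to import about 1 million nodes with about 20 million relationships, all of which should be unique. The import runs smoothly until it reaches the "Relationship counts" task. There it loads up to 20M (seemingly all of the relationships), then hangs for a while (30 minutes to 1 hour) and finally fails with "java.lang.OutOfMemoryError: GC overhead limit exceeded".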

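I have successfully loaded a larger graph database before (39M nodes, 21M relationships), so I am not sure where the problem lies. Is it because this graph is much more densely connected than the databases I loaded previously?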

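Or could there be a memory leak? In Task Manager, the Java Platform SE Binary process uses more and more memory as the import runs, especially toward the end (up to 12-13 GB of 16 GB RAM). That seems very large, particularly since the 39M-node / 21M-relationship database imported successfully and relatively quickly with the same tool, without hanging on the relationship counts.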

Any ideas about what might be going wrong? Thanks in advance!

If it helps to look at my node/relationship files, here is the link: https://drive.google.com/open?id=0Bw7N-SlJA3ZCei0ycEhoa2YwNUU

Here is the neo4j shell output:

C:\Users\Username\Documents\Neo4j>neo4jImport -into graphDB1.graphdb --nodes D:\concept.csv --relationships D:\predicate.csv --stacktrace --idtype integer
WARNING! This batch script has been deprecated. Please use the provided PowerShell scripts instead: http://neo4j.com/docs/stable/powershell.html
The system cannot find the path specified.
Importing the contents of these files into graphDB1.graphdb:
Nodes:
  D:\concept.csv
Relationships:
      D:\predicate.csv

Available memory:
  Free machine memory: 13.50 GB
  Max heap memory : 12.75 GB

    Nodes
[>:|PR|NOD|*LABEL SCAN---------------------------------|v:6.79 MB/s----------------------------]  1M
Done in 40s 562ms
Prepare node index
[*DETECT:20.37 MB------------------------------------------------------------------------------]  1M
Done in 802ms
Calculate dense nodes
[*>:59.38 MB/s----------------------------------|PREPARE(3)====================================] 20M
Done in 12s 566ms
Relationships
[>:2.01 |PREPARE-----------|P|RELATIONSHI|*v:4.05 MB/s-----------------------------------------] 20M
Done in 6m 3s 655ms
Node --> Relationship
[>:3.19 MB/s--------------------------|L|*v:2.39 MB/s------------------------------------------]  1M
Done in 8s 421ms
Relationship --> Relationship
[*>:6.82 MB/s--------------------------------------|LINK-----------|v:6.82 MB/s----------------] 20M
Done in 1m 36s 849ms
Node counts
[*COUNT:91.55 MB-------------------------------------------------------------------------------]  1M
Done in 3m 35s 21ms
Relationship counts
[*>:8.62 MB/s-----------------------------------------------------------|COUNT-----------------] 20MException in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOf(Unknown Source)
        at java.util.ArrayList.toArray(Unknown Source)
        at java.util.ArrayList.<init>(Unknown Source)
        at org.neo4j.unsafe.impl.batchimport.stats.StepStats.<init>(StepStats.java:39)
        at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.stats(AbstractStep.java:220)
        at org.neo4j.unsafe.impl.batchimport.staging.StageExecution$1.compare(StageExecution.java:123)
        at org.neo4j.unsafe.impl.batchimport.staging.StageExecution$1.compare(StageExecution.java:118)
        at java.util.TimSort.countRunAndMakeAscending(Unknown Source)
        at java.util.TimSort.sort(Unknown Source)
        at java.util.TimSort.sort(Unknown Source)
        at java.util.Arrays.sort(Unknown Source)
        at java.util.Collections.sort(Unknown Source)
        at org.neo4j.unsafe.impl.batchimport.staging.StageExecution.stepsOrderedBy(StageExecution.java:117)
        at org.neo4j.unsafe.impl.batchimport.staging.DynamicProcessorAssigner.assignProcessorsToPotentialBottleNeck(DynamicProcessorAssigner.java:94)
        at org.neo4j.unsafe.impl.batchimport.staging.DynamicProcessorAssigner.check(DynamicProcessorAssigner.java:81)
        at org.neo4j.unsafe.impl.batchimport.staging.MultiExecutionMonitor.check(MultiExecutionMonitor.java:106)
        at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.supervise(ExecutionSupervisor.java:65)
        at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisors.superviseExecution(ExecutionSupervisors.java:80)
        at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.executeStages(ParallelBatchImporter.java:224)
        at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.doImport(ParallelBatchImporter.java:185)
        at org.neo4j.tooling.ImportTool.main(ImportTool.java:363)
        at org.neo4j.tooling.ImportTool.main(ImportTool.java:279)

Update 1:

Here is a thread dump taken at the moment the import hangs on the relationship counts:

2016-02-17 08:28:12
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):

"MuninnPageCache[1]-FlushTask" daemon prio=6 tid=0x0000000026855800 nid=0xfe0 waiting on condition [0x00000000288fe000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000004c0189810> (a org.neo4j.io.pagecache.impl.muninn.MuninnPageCache)
        at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
        at org.neo4j.io.pagecache.impl.muninn.MuninnPageCache.continuouslyFlushPages(MuninnPageCache.java:909)
        at org.neo4j.io.pagecache.impl.muninn.FlushTask.run(FlushTask.java:36)
        at org.neo4j.io.pagecache.impl.muninn.BackgroundTask.run(BackgroundTask.java:45)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

"MuninnPageCache[1]-EvictionTask" daemon prio=6 tid=0x0000000026904000 nid=0x3bd4 runnable [0x00000000287fe000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000004c0189810> (a org.neo4j.io.pagecache.impl.muninn.MuninnPageCache)
        at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
        at org.neo4j.io.pagecache.impl.muninn.MuninnPageCache.parkEvictor(MuninnPageCache.java:697)
        at org.neo4j.io.pagecache.impl.muninn.MuninnPageCache.parkUntilEvictionRequired(MuninnPageCache.java:751)
        at org.neo4j.io.pagecache.impl.muninn.MuninnPageCache.continuouslySweepPages(MuninnPageCache.java:732)
        at org.neo4j.io.pagecache.impl.muninn.EvictionTask.run(EvictionTask.java:39)
        at org.neo4j.io.pagecache.impl.muninn.BackgroundTask.run(BackgroundTask.java:45)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

"Service Thread" daemon prio=6 tid=0x0000000024ee8000 nid=0x301c runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x0000000024ee6000 nid=0x3060 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x0000000024ee2800 nid=0x2198 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Attach Listener" daemon prio=10 tid=0x0000000024ee2000 nid=0x1ae4 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x0000000024ee1000 nid=0x135c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=8 tid=0x0000000024ed9000 nid=0x3480 in Object.wait() [0x00000000278ff000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000004c000d4b0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(Unknown Source)
        - locked <0x00000004c000d4b0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(Unknown Source)
        at java.lang.ref.Finalizer$FinalizerThread.run(Unknown Source)

"Reference Handler" daemon prio=10 tid=0x0000000024ed8000 nid=0x1ae8 in Object.wait() [0x00000000277ff000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000004c000d300> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.ref.Reference$ReferenceHandler.run(Unknown Source)
        - locked <0x00000004c000d300> (a java.lang.ref.Reference$Lock)

"main" prio=6 tid=0x00000000023c2800 nid=0x2e7c waiting on condition [0x00000000023bf000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.neo4j.io.fs.FileUtils.waitAndThenTriggerGC(FileUtils.java:253)
        at org.neo4j.io.fs.FileUtils.deleteFile(FileUtils.java:110)
        at org.neo4j.io.fs.DefaultFileSystemAbstraction.deleteFile(DefaultFileSystemAbstraction.java:127)
        at org.neo4j.kernel.impl.storemigration.FileOperation$3.perform(FileOperation.java:93)
        at org.neo4j.kernel.impl.storemigration.StoreFile.fileOperation(StoreFile.java:267)
        at org.neo4j.tooling.ImportTool.main(ImportTool.java:389)
        at org.neo4j.tooling.ImportTool.main(ImportTool.java:279)

"VM Thread" prio=10 tid=0x0000000024ed1800 nid=0x3058 runnable

"GC task thread#0 (ParallelGC)" prio=6 tid=0x00000000023d7000 nid=0x313c runnable

"GC task thread#1 (ParallelGC)" prio=6 tid=0x00000000023d9000 nid=0x3144 runnable

"GC task thread#2 (ParallelGC)" prio=6 tid=0x00000000023da800 nid=0x974 runnable

"GC task thread#3 (ParallelGC)" prio=6 tid=0x00000000023dc000 nid=0x3a3c runnable

"GC task thread#4 (ParallelGC)" prio=6 tid=0x00000000023de800 nid=0x3684 runnable

"GC task thread#5 (ParallelGC)" prio=6 tid=0x00000000023e1000 nid=0x35b8 runnable

"GC task thread#6 (ParallelGC)" prio=6 tid=0x00000000023e4000 nid=0x3950 runnable

"GC task thread#7 (ParallelGC)" prio=6 tid=0x00000000023e5800 nid=0x318c runnable

"GC task thread#8 (ParallelGC)" prio=6 tid=0x00000000023e8800 nid=0x30b8 runnable

"GC task thread#9 (ParallelGC)" prio=6 tid=0x00000000023e9800 nid=0x32dc runnable

"VM Periodic Task Thread" prio=10 tid=0x0000000024eed800 nid=0x3710 waiting on condition

JNI global references: 377

Heap
 PSYoungGen      total 2071552K, used 0K [0x0000000780000000, 0x0000000800000000, 0x0000000800000000)
  eden space 2043904K, 0% used [0x0000000780000000,0x0000000780000000,0x00000007fcc00000)
  from space 27648K, 0% used [0x00000007fe500000,0x00000007fe500000,0x0000000800000000)
  to   space 25600K, 0% used [0x00000007fcc00000,0x00000007fcc00000,0x00000007fe500000)
 ParOldGen       total 11534336K, used 10982258K [0x00000004c0000000, 0x0000000780000000, 0x0000000780000000)
  object space 11534336K, 95% used [0x00000004c0000000,0x000000075e4dcb50,0x0000000780000000)
 PSPermGen       total 21504K, used 13521K [0x00000004bae00000, 0x00000004bc300000, 0x00000004c0000000)
  object space 21504K, 62% used [0x00000004bae00000,0x00000004bbb34588,0x00000004bc300000)

2016-02-17 08:28:20
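
The heap summary at the end of the thread dump can be turned into percentages with a few lines of Python. This is only a sketch for reading the numbers: the three summary lines are copied from the dump, while the function name `heap_usage` and the regular expression are illustrative assumptions, not part of any Neo4j or JVM tooling.

```python
import re

# Heap summary lines copied from the thread dump above (sizes are in KiB).
HEAP_SUMMARY = """\
 PSYoungGen      total 2071552K, used 0K
 ParOldGen       total 11534336K, used 10982258K
 PSPermGen       total 21504K, used 13521K
"""

# Matches "<generation>   total <n>K, used <m>K" at the start of a line.
LINE = re.compile(r"^\s*(\w+)\s+total (\d+)K, used (\d+)K", re.MULTILINE)


def heap_usage(summary):
    """Return {generation: (used_kib, total_kib, percent_used)}."""
    usage = {}
    for name, total, used in LINE.findall(summary):
        total_kib, used_kib = int(total), int(used)
        usage[name] = (used_kib, total_kib, 100.0 * used_kib / total_kib)
    return usage


if __name__ == "__main__":
    for name, (used, total, pct) in heap_usage(HEAP_SUMMARY).items():
        gib = 1024 ** 2  # KiB per GiB
        print(f"{name}: {used / gib:.2f} of {total / gib:.2f} GiB ({pct:.1f}%)")
```

The old generation is about 95% full while the young generation is empty. That pattern is consistent with a JVM that keeps collecting but reclaims very little, which is the situation the "GC overhead limit exceeded" error reports.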

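That is strange for such a small dataset. How many distinct relationship types and labels do you expect in this dataset? Could you also provide a thread dump taken while the pause is happening?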

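Edit: The problem was that a column containing property values was used as the LABEL column. This mistakenly produced a massive number of labels, and the counts computation does not scale with that.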
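
One way to catch this before running an import is to count the distinct values in the node file's `:LABEL` column. The sketch below assumes the usual neo4j-import CSV header convention (a column whose header ends in `:LABEL`, with multiple labels separated by `;`). The function name `distinct_labels` and the sample rows are hypothetical and are not taken from the asker's files.

```python
import csv
import io


def distinct_labels(csv_file, array_delimiter=";"):
    """Collect the distinct label values found in every :LABEL column."""
    reader = csv.reader(csv_file)
    header = next(reader)
    label_columns = [
        i for i, name in enumerate(header) if name.strip().endswith(":LABEL")
    ]
    labels = set()
    for row in reader:
        for i in label_columns:
            if i < len(row) and row[i]:
                # One cell may hold several labels, e.g. "Concept;Drug".
                labels.update(p for p in row[i].split(array_delimiter) if p)
    return labels


# Hypothetical sample showing the mistake: the name was repeated as the label.
SAMPLE = """conceptId:ID,name,:LABEL
1,aspirin,aspirin
2,ibuprofen,ibuprofen
3,headache,headache
"""

if __name__ == "__main__":
    labels = distinct_labels(io.StringIO(SAMPLE))
    print(f"{len(labels)} distinct labels for 3 nodes")
```

For a real file, pass an open handle instead, for example `open(path, newline="", encoding="utf-8")`. A label count that grows with the number of rows, rather than staying at a handful, suggests that a property column has been marked as `:LABEL`.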
