
Out of memory while indexing with Lucene

I'm using Lucene 4.9.0 to index 23k files, but now I'm receiving a java.lang.OutOfMemoryError: Java heap space message. I don't want to increase the "heap size" because the number of files tends to increase every day. How can I index all files without the OOM problem and increase "heap space"?

Your question is too vague and makes little sense.

First of all, 23k files can be 1 byte each or 1 GB each. How are we supposed to know what's inside and how heavyweight they are?

Secondly, you say

I don't want to increase "heap size" because <...>

and straight after you say

How can I index all files without the OOM problem and increase "heap space"

Can you make up your mind on whether you can increase heap space or not?

There's a certain amount of memory required to index the data, and there's not much you can do about that. That said, the most memory is required during the merging process, and you can play with the merge factor to see if that helps you.
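For illustration, here's a minimal sketch of what tuning those knobs can look like in Lucene 4.9. It uses a LogDocMergePolicy because setMergeFactor lives on the LogMergePolicy family (the 4.x default is TieredMergePolicy, whose closest equivalents are setMaxMergeAtOnce and setSegmentsPerTier). The index path, buffer size, and merge factor below are illustrative assumptions, not tested recommendations.

```java
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LogDocMergePolicy;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class IndexerSetup {
    public static void main(String[] args) throws Exception {
        // "index" is a placeholder path for this sketch.
        FSDirectory dir = FSDirectory.open(new File("index"));

        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_4_9);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_4_9, analyzer);

        // Keep the in-memory indexing buffer modest so documents are
        // flushed to disk early. 16 MB is the Lucene 4.x default;
        // lowering it trades indexing speed for a smaller heap footprint.
        config.setRAMBufferSizeMB(16.0);

        // A log merge policy with a low merge factor merges fewer
        // segments at once, which is where peak memory usage tends to
        // occur. The default factor is 10; 5 is an arbitrary example.
        LogDocMergePolicy mergePolicy = new LogDocMergePolicy();
        mergePolicy.setMergeFactor(5);
        config.setMergePolicy(mergePolicy);

        IndexWriter writer = new IndexWriter(dir, config);
        try {
            // writer.addDocument(...) for each file goes here.
        } finally {
            writer.close();
        }
    }
}
```

Lowering the merge factor mostly shifts work around rather than eliminating it (you end up with more, smaller merges), so it's worth measuring whether the OOM actually happens during merging before relying on it.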
