Forgot to close the Lucene IndexWriter after adding Documents to the index

I had a program running for two days to build a Lucene index over roughly 160 million text files. After it finished, I tried searching the index and found that it had not been built correctly: indexReader.numDocs() returned 0. I checked the index directory and it looked fine. All the index data seemed to be there, and the directory is 1.5 GB in size.

I checked my code and found that I had forgotten to call indexWriter.optimize() and indexWriter.close(). Is it possible to optimize() the existing index so that I don't need to rebuild it from scratch? I don't really want the program to take another two days.

Calling IndexWriter.optimize() is not required, and you can still call it later by reopening the index. It only merges the index's segments for better read performance and doesn't otherwise affect anything.

If you forgot to call IndexWriter.close(), however, your index might not be complete. Since you processed so many documents, the writer likely flushed most of them, so hopefully you only need to re-index the last ones. As suggested, use Luke, a GUI tool for browsing Lucene indexes, to quickly check what state the index is in.
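Before opening the index with Luke, it can help to look at the file names in the index directory. The following is only a rough sketch using the plain JDK, not a Lucene API: the class IndexDirInspector and its Report type are made-up names for illustration. It relies on Lucene's on-disk naming conventions (commit points stored as segments_N with N in base 36, per-segment files starting with an underscore, and a write.lock file), and it reads file names only, so it cannot tell which segments a commit actually references.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: summarises file names in a Lucene index directory. */
final class IndexDirInspector {

    /** Plain result holder; every field is derived from file names and sizes only. */
    static final class Report {
        long latestGeneration = -1;        // highest N among segments_N files, -1 if none
        boolean writeLockPresent = false;  // a write.lock file exists (it may be stale)
        final Set<String> segmentNames = new TreeSet<>(); // e.g. "_0", "_1b"
        long totalBytes = 0;               // combined size of all regular files
    }

    /** Returns the segment prefix of a per-segment file, e.g. "_0" for "_0_1.del". */
    static String segmentNameOf(String fileName) {
        int end = 1; // skip the leading underscore
        while (end < fileName.length()
                && fileName.charAt(end) != '_'
                && fileName.charAt(end) != '.') {
            end++;
        }
        return fileName.substring(0, end);
    }

    static Report inspect(Path indexDir) throws IOException {
        Report report = new Report();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                String name = file.getFileName().toString();
                report.totalBytes += Files.size(file);
                if (name.equals("write.lock")) {
                    report.writeLockPresent = true;
                } else if (name.startsWith("segments_")) {
                    try {
                        // The commit generation is the base-36 suffix after "segments_".
                        long gen = Long.parseLong(name.substring("segments_".length()), 36);
                        report.latestGeneration = Math.max(report.latestGeneration, gen);
                    } catch (NumberFormatException ignored) {
                        // Not a commit file name we recognise; skip it.
                    }
                } else if (name.startsWith("_")) {
                    report.segmentNames.add(segmentNameOf(name));
                }
            }
        }
        return report;
    }

    public static void main(String[] args) throws IOException {
        Report r = inspect(Paths.get(args.length > 0 ? args[0] : "."));
        System.out.println("latest commit generation: " + r.latestGeneration);
        System.out.println("write.lock present:       " + r.writeLockPresent);
        System.out.println("segment names on disk:    " + r.segmentNames);
        System.out.println("total bytes:              " + r.totalBytes);
    }
}
```

Run it with the index directory as the only argument. Many segment names next to a very low commit generation would be consistent with data that was flushed to disk but never committed, which would match numDocs() returning 0. It is probably safer to inspect a copy of the directory, since a newly opened IndexWriter may delete files that no commit point references.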
