
Batch Commit for Lucene Index

I want to index documents in batches. I am calling IndexWriterConfig.setMaxBufferedDocs() to set the number of documents buffered in memory before they are written to the index.

Do I have to keep count of the documents added and explicitly issue writer.commit() for the index to flush the documents in memory, or will the writer take care of this automatically?
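For context, here is a minimal sketch of the setup described above, assuming the Lucene 4.x+ IndexWriterConfig API; the index path and buffer size are placeholders:

```java
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class BatchWriterSetup {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(Paths.get("/path/to/index")); // placeholder path
        IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
        config.setMaxBufferedDocs(1000); // flush a segment after 1000 buffered docs (placeholder)
        // Optionally disable the RAM-size trigger so only the document count governs flushing.
        config.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH);
        try (IndexWriter writer = new IndexWriter(dir, config)) {
            // ... addDocument() calls go here ...
        }
    }
}
```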

Lucene will only flush documents to disk when the RAM buffer size or the maximum buffered docs limit is reached (there is no auto-commit). To make them searchable, you will need to actually call IndexWriter.commit and reopen a searcher.
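A rough sketch of what that batching loop might look like, assuming the standard IndexWriter and DirectoryReader APIs; the batch size and the source of documents are placeholders:

```java
import java.util.List;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;

public class BatchCommitExample {
    // Adds documents in explicit batches; the new documents only become
    // searchable after commit() and a reader (re)open.
    static void indexInBatches(IndexWriter writer, Directory dir,
                               List<Document> docs, int batchSize) throws Exception {
        int pending = 0;
        for (Document doc : docs) {
            writer.addDocument(doc);
            if (++pending >= batchSize) {
                writer.commit();   // durably write and publish this batch
                pending = 0;
            }
        }
        writer.commit();           // commit any final partial batch

        // A searcher only sees committed changes through a reader opened afterwards.
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // ... run queries with 'searcher' ...
        }
    }
}
```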

Of course, Lucene will take care of that by itself.
