Is Lucene updateDocument faster than deleting and then adding the document?

I have a large index (about 100 GB) and I want to update documents in it frequently. I'm undecided between two methods:

1) Updating the document

2) Deleting the document and adding the updated version

Which one would be faster? Are there any other pros and cons?

According to the Lucene API documentation, there should be no difference between updating a document and removing the old one and then adding the new one. Internally, an update performs a remove followed by an add:

"In either case, documents are added with addDocument and removed with deleteDocuments(Term) or deleteDocuments(Query). A document can be updated with updateDocument (which just deletes and then adds the entire document). When finished adding, deleting and updating documents, close should be called." ( http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/index/IndexWriter.html )
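To make the quoted behaviour concrete, here is a minimal sketch of "update = delete, then add" on a toy in-memory index. It does not use Lucene: the ToyIndex class, its list-of-maps storage, and the "id" and "body" field names are all made up for illustration. Only the method names addDocument, deleteDocuments and updateDocument are borrowed from the documentation quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy in-memory stand-in for an index. This is NOT Lucene's implementation;
// it only mirrors the documented "delete, then add" semantics of an update.
class ToyIndex {
    // Each "document" is a map of field name -> value (hypothetical layout).
    private final List<Map<String, String>> docs = new ArrayList<>();

    void addDocument(Map<String, String> doc) {
        docs.add(doc);
    }

    // Remove every document whose field has the given value (like a delete by term).
    void deleteDocuments(String field, String value) {
        docs.removeIf(d -> value.equals(d.get(field)));
    }

    // Update: delete everything matching the term, then add the new document.
    void updateDocument(String field, String value, Map<String, String> doc) {
        deleteDocuments(field, value);
        addDocument(doc);
    }

    // Defensive copy of the current contents.
    List<Map<String, String>> contents() {
        return new ArrayList<>(docs);
    }

    public static void main(String[] args) {
        Map<String, String> oldDoc = Map.of("id", "42", "body", "old text");
        Map<String, String> newDoc = Map.of("id", "42", "body", "new text");

        // Path 1: a single update call.
        ToyIndex viaUpdate = new ToyIndex();
        viaUpdate.addDocument(oldDoc);
        viaUpdate.updateDocument("id", "42", newDoc);

        // Path 2: an explicit delete followed by an add.
        ToyIndex viaDeleteAdd = new ToyIndex();
        viaDeleteAdd.addDocument(oldDoc);
        viaDeleteAdd.deleteDocuments("id", "42");
        viaDeleteAdd.addDocument(newDoc);

        // Both paths leave the toy index in the same state.
        System.out.println(viaUpdate.contents().equals(viaDeleteAdd.contents()));
    }
}
```

The toy only shows that the two paths end in the same state; it says nothing about speed. It also leaves out one thing the real updateDocument Javadoc describes: the delete and the add are atomic as seen by a reader on the same index, which two separate calls do not promise.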

If you can batch your deletes and adds, the best practice is to perform all the deletes first and then all the adds. My tests on large indices bore this out.
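The "all deletes first, then all adds" ordering could be sketched as below. This is again a toy model rather than Lucene code: the BatchedUpdates class, the applyBatch helper and the "id" key field are assumptions made for illustration, and every document is assumed to carry an "id" field. With a real IndexWriter the two phases would presumably correspond to a run of deleteDocuments calls followed by a run of addDocument calls.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of a batched update: phase 1 deletes, phase 2 adds.
// Hypothetical helper; it does not measure or reproduce Lucene's performance.
class BatchedUpdates {
    // Returns a new list in which every document whose "id" appears in
    // updatedDocs has been replaced by its updated version.
    static List<Map<String, String>> applyBatch(List<Map<String, String>> index,
                                                List<Map<String, String>> updatedDocs) {
        // Collect the ids that are about to be replaced.
        Set<String> ids = new HashSet<>();
        for (Map<String, String> doc : updatedDocs) {
            ids.add(doc.get("id"));
        }

        List<Map<String, String>> result = new ArrayList<>(index);
        // Phase 1: perform all deletes.
        result.removeIf(d -> ids.contains(d.get("id")));
        // Phase 2: perform all adds.
        result.addAll(updatedDocs);
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> index = List.of(
                Map.of("id", "1", "body", "old one"),
                Map.of("id", "2", "body", "old two"),
                Map.of("id", "3", "body", "untouched"));
        List<Map<String, String>> updates = List.of(
                Map.of("id", "1", "body", "new one"),
                Map.of("id", "2", "body", "new two"));

        List<Map<String, String>> result = applyBatch(index, updates);
        // The document count is unchanged: two replaced, one untouched.
        System.out.println(result.size());
    }
}
```

Whether this ordering is actually faster on a given index is an empirical question; the sketch shows only the shape of the batch, not the speed-up reported above.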
