
What is the Solr/Lucene process for purging deleted documents from the index?

What is the process for purging the index when it contains deleted documents (after a delete-by-query)?

I'm asking because I'm working on a Solr-based project, I've noticed some strange behavior, and I would like some information about it.

My system has the following characteristics:

  • My documents are indexed continuously (1000 docs per second)

  • A purge runs every couple of seconds with this query:

     <delete><query>timestamp_utc:[ * TO NOW-10MINUTES ]</query></delete> 

So I always have about 600,000 documents visible in my index: 10 minutes × 60 = 600 seconds, and at 1000 docs/s that gives 600 × 1000 = 600,000.
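As a quick check of that arithmetic and of the delete message itself, here is a minimal Python sketch. The function names are hypothetical helpers introduced for illustration; the rate, window and field name are the ones given above.

```python
from xml.sax.saxutils import escape


def expected_visible_docs(docs_per_second: int, retention_minutes: int) -> int:
    """Steady-state number of live documents for a sliding retention window."""
    return docs_per_second * retention_minutes * 60


def delete_older_than(field: str, retention_minutes: int) -> str:
    """Build the XML delete-by-query message for the retention window."""
    query = f"{field}:[* TO NOW-{retention_minutes}MINUTES]"
    # escape() protects the XML body if the query ever contains &, < or >.
    return f"<delete><query>{escape(query)}</query></delete>"


print(expected_visible_docs(1000, 10))
print(delete_older_than("timestamp_utc", 10))
```

The second line printed is the same message as the purge above, with the spaces inside the brackets trimmed.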

But the size of my index increases over time. I know that when you do a delete-by-query, the documents are only flagged as deleted in the index, not physically removed.
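The effect can be pictured with a toy model. This is not the Lucene API, only a sketch under the assumption that segments are write-once, that a delete merely marks documents, and that space is reclaimed only when segments are rewritten by a merge. All class and function names are invented for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    docs: list                                   # ids written to this segment
    deleted: set = field(default_factory=set)    # ids only *marked* as deleted

    def live(self):
        return [d for d in self.docs if d not in self.deleted]

    def size(self):
        # Marked-deleted documents still occupy space in the segment.
        return len(self.docs)


def delete_where(segments, predicate):
    """Mark matching documents as deleted; no segment is rewritten."""
    for seg in segments:
        seg.deleted.update(d for d in seg.docs if predicate(d))


def merge(segments):
    """Rewrite the segments into one, copying only live documents."""
    return [Segment([d for seg in segments for d in seg.live()])]


segments = [Segment(list(range(0, 5))), Segment(list(range(5, 10)))]
delete_where(segments, lambda d: d < 7)

visible_before = sum(len(s.live()) for s in segments)
stored_before = sum(s.size() for s in segments)
segments = merge(segments)
stored_after = sum(s.size() for s in segments)

print(visible_before, stored_before, stored_after)
```

In this model the visible count drops as soon as the delete runs, while the stored size only shrinks after the merge, which matches the growth described above.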

I've seen and tried the attribute "expungeDeletes=true", but I didn't notice a considerable change in my index size.
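For reference, one way to issue such a commit is to POST the XML message `<commit expungeDeletes="true"/>` to the core's `/update` handler. The sketch below assumes a Solr instance at `http://localhost:8983/solr` and a core named `mycore`; both are placeholders, not values from this setup. Note that this kind of commit typically rewrites only segments that contain deletions, and the merge policy may skip segments with few deletes, so a modest change in index size does not necessarily mean the option was ignored.

```python
import urllib.request

# XML update message asking Solr to commit and merge away deleted documents.
COMMIT_XML = b'<commit expungeDeletes="true"/>'


def update_url(base_url: str, core: str) -> str:
    """URL of the update handler for a core (base_url and core are assumptions)."""
    return f"{base_url.rstrip('/')}/{core}/update"


def expunge_deletes(base_url: str, core: str, timeout: float = 30.0) -> int:
    """POST the commit message and return the HTTP status code."""
    request = urllib.request.Request(
        update_url(base_url, core),
        data=COMMIT_XML,
        headers={"Content-Type": "text/xml"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `expunge_deletes("http://localhost:8983/solr", "mycore")` would send the commit; nothing is sent until the function is called.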

Any information about the index purge process would be appreciated.

Thanks.

I know that an optimize can do this job, but it's a long operation and I want to avoid it.

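You could create a new collection/core every 10 minutes, switch to it (together with the previous one), and then drop the oldest collection/core (the one older than 10 minutes).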
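A minimal sketch of how that rotation could be planned, assuming collections are named after the 10-minute bucket they cover. The `docs_` prefix and the function names are hypothetical, and the actual create, search and drop calls against Solr are left out.

```python
WINDOW_SECONDS = 600  # the 10-minute retention window


def bucket(ts: int) -> int:
    """Index of the 10-minute bucket that a Unix timestamp falls into."""
    return ts // WINDOW_SECONDS


def plan(now: int, prefix: str = "docs_") -> dict:
    """Which collection to write to, which to search, and which to drop."""
    current = bucket(now)
    return {
        "write": f"{prefix}{current}",
        # The current and previous buckets together cover the last 10 minutes.
        "search": [f"{prefix}{current - 1}", f"{prefix}{current}"],
        # Everything in this bucket is older than the retention window.
        "drop": f"{prefix}{current - 2}",
    }


print(plan(1800))
```

Dropping a whole collection removes its files outright, so no delete-by-query or merge is needed. Because the two searched collections can span up to 20 minutes, queries would still need a filter such as `timestamp_utc:[NOW-10MINUTES TO *]` to see exactly the last 10 minutes.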


 