
When to do indexing in Lucene

I have a REST service that works with data from a database (MongoDB). I want to add the Apache Lucene library to implement full-text search.

I have never used Lucene before, so I am trying to understand how it works by checking tutorials, but one thing is still unclear to me:

When should the DB data be indexed? Some of the data in my DB is added and removed often, and some is updated rarely. How should I structure things so that search requests always cover all up-to-date data?

Should I update the index on every data update, or is that done automatically, so that indexing once is enough? If reindexing is needed, how often should it run?

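If you want live data to be searchable, then you should add, update and delete documents in the Lucene index at the same time as you add, update and delete the corresponding data in your database. Lucene does not follow your database on its own, so indexing once is not enough.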
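To illustrate the idea of writing to the database and the index together, here is a minimal sketch in plain Java. It assumes nothing about your actual code: `InMemoryTextIndex` and `ArticleService` are hypothetical names, the `HashMap` store stands in for the MongoDB collection, and the tiny inverted index stands in for Lucene. With Lucene itself, the corresponding `IndexWriter` calls would be `addDocument`, `updateDocument` and `deleteDocuments`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for a full-text index; Lucene's IndexWriter would play this role. */
class InMemoryTextIndex {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<String>> postings = new HashMap<>();
    // id -> terms of that document, kept so the document can be removed cleanly
    private final Map<String, Set<String>> termsById = new HashMap<>();

    /** Adds the document, or replaces it if the id is already indexed. */
    void upsert(String id, String text) {
        delete(id); // drop stale terms first, so an update never leaves old hits behind
        Set<String> terms = new HashSet<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        termsById.put(id, terms);
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new HashSet<>()).add(id);
        }
    }

    /** Removes the document from the index; does nothing if the id is unknown. */
    void delete(String id) {
        Set<String> terms = termsById.remove(id);
        if (terms == null) {
            return;
        }
        for (String term : terms) {
            Set<String> ids = postings.get(term);
            ids.remove(id);
            if (ids.isEmpty()) {
                postings.remove(term);
            }
        }
    }

    /** Returns the sorted ids of all documents containing the given single term. */
    Set<String> search(String term) {
        Set<String> ids = postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
        return new TreeSet<>(ids);
    }
}

/** Hypothetical service layer: every write goes to the store and to the index together. */
class ArticleService {
    private final Map<String, String> store = new HashMap<>(); // stands in for the MongoDB collection
    private final InMemoryTextIndex index = new InMemoryTextIndex();

    void save(String id, String text) {
        store.put(id, text);
        index.upsert(id, text);
    }

    void remove(String id) {
        store.remove(id);
        index.delete(id);
    }

    Set<String> search(String term) {
        return index.search(term);
    }
}
```

The point of the design is that callers never touch the store or the index directly, only the service, so the two cannot drift apart through a forgotten index update. This sketch does not handle the case where one of the two writes fails; a real service would need to decide how to retry or reconcile.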

Updating the index on every write is perfectly fine, but do not optimize the index after every operation.

You can optimize the index once a day, or on whatever schedule suits your usage. Optimizing the index helps searches return results faster.
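One way to keep the optimize step off the write path is to run it from a scheduler. The sketch below uses only `java.util.concurrent`; `IndexMaintenance` and the `mergeTask` hook are hypothetical names, and what the task actually does is up to you. In recent Lucene versions the explicit optimize step is, as far as I know, exposed as `IndexWriter.forceMerge`, so the task might wrap a call like that.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Runs an expensive index-maintenance task on a fixed schedule instead of after every write. */
class IndexMaintenance implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /**
     * Schedules mergeTask to run repeatedly, first after one period and then every period.
     * mergeTask is a hypothetical hook; with Lucene it might wrap writer.forceMerge(1).
     */
    void start(Runnable mergeTask, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                mergeTask.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log it and carry on.
                System.err.println("Index maintenance failed: " + e);
            }
        }, period, period, unit);
    }

    /** Stops the scheduler so its thread does not keep the JVM alive. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

For a daily run this would be started once at application startup, for example `maintenance.start(task, 1, TimeUnit.DAYS)`, and closed on shutdown.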

Refer to this tutorial to get started with a basic Lucene application.

You can try MongoDB's own feature for this (see the Mongo docs). It is probably less flexible and less powerful than Lucene, but it comes for free.

You have asked the genuinely difficult question: "When should indexing happen?" The answer depends heavily on your requirements. However, you can look at this post to see how it is technically done: offline, i.e. the index will always lag more or less behind the data.


 