How often to call commit on an offline Solr/Lucene index?

I know there have been some semi-similar questions, but in this case I am building an index that stays offline until the build is complete. I am building two cores from scratch. One has about 300k records with a lot of citation information and large blocks of full text (this is the document index). The other has about 6.6 million records with full text (this is the page index).

Given that this index is being built offline, the only real performance issue is build speed. No one should be querying this data.

The auto-commit would apparently fire if I stopped adding items for 50 seconds? I never do that. I add documents ten at a time, with a batch going in every couple of seconds.

So, should I commit more often? I feel like the longer this runs, the slower it gets, at least in my test case of 6k documents to index.

With no one searching this index, how often would anyone suggest I commit?

I should add that I am using Solr 3.1 and SolrNet.

Although commits are what is taking time for you, you might want to look at tuning options other than commit frequency.

Does the indexing core also serve searches, or is the index replicated somewhere else after indexing concludes? If the latter is the case, then turning off caches might have a very noticeable impact on performance (Solr rebuilds caches every time you commit).

You could also look into using the autoCommit or commitWithin features of Solr.
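To illustrate the commitWithin idea, here is a minimal sketch in Python (not SolrNet) that builds a Solr XML update message with a `commitWithin` attribute on the `<add>` element, which asks Solr to commit the added documents within that many milliseconds instead of the client issuing explicit commits. The helper name `build_add_message`, the field names, and the 60000 ms window are assumptions chosen for illustration, not values from the question.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs, commit_within_ms):
    """Build a Solr XML <add> message carrying a commitWithin attribute.

    docs is a list of dicts mapping field names to values (hypothetical
    field names; use whatever your schema defines).
    """
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # Illustrative batch; the ids and text are made up.
    batch = [
        {"id": "page-1", "text": "first page"},
        {"id": "page-2", "text": "second page"},
    ]
    print(build_add_message(batch, 60000))
```

The resulting string would be posted to the core's update handler. With this approach the client never calls commit during the build, and a single explicit commit at the end makes sure everything is flushed.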
