
Solr full-import performance

I have only a small set of queries and entities, yet full-import performance is pretty bad. What tricks and configuration changes can I use to improve it?

Note: I'm using Solr 4.1.

You should try to minimize the number of commits during your import. Even if you don't commit periodically when adding documents to Solr, Solr will perform auto commits based on the autoCommit settings in solrconfig.xml:

<autoCommit>
   <maxDocs>10000</maxDocs>
   <maxTime>15000</maxTime>
   <openSearcher>false</openSearcher>
</autoCommit>

Increase both maxDocs and maxTime and see if you get better speeds. (maxTime is in milliseconds, so the default setting is only 15 seconds, which is very low for bulk imports.)
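As a rough sketch of what "increase both" could look like, the block below raises the two limits tenfold and more. The specific numbers are assumptions chosen only for illustration, not recommended values; the right settings depend on your heap size and document size, so measure before settling on them.

```xml
<!-- solrconfig.xml: illustrative values only (assumed, not tuned) -->
<autoCommit>
   <!-- hard commit after 100,000 pending documents instead of 10,000 -->
   <maxDocs>100000</maxDocs>
   <!-- or after 10 minutes (600,000 ms) instead of 15 seconds -->
   <maxTime>600000</maxTime>
   <!-- keep the new searcher closed so commits stay cheap during the import -->
   <openSearcher>false</openSearcher>
</autoCommit>
```

A commit fires when either limit is reached first, so raising only one of them may not reduce the number of commits.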

You can even try disabling auto-commit during your bulk import and issuing a single commit after all your documents are added. As long as this does not cause an out-of-memory error in Solr, it is likely the fastest option you have.
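One way to try this, assuming your solrconfig.xml follows the stock layout, is to comment out the autoCommit block for the duration of the import and restore it afterwards. This is a sketch of that idea, not the only way to do it.

```xml
<!-- solrconfig.xml: auto-commit switched off for the bulk import (sketch) -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Temporarily disabled; restore this block once the import is done.
  <autoCommit>
     <maxDocs>10000</maxDocs>
     <maxTime>15000</maxTime>
     <openSearcher>false</openSearcher>
  </autoCommit>
  -->
</updateHandler>
```

With auto-commit off, the single commit at the end can come from the data import handler itself (its full-import command accepts a commit parameter) or from posting a `<commit/>` message to the update handler once all documents are in.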

If you were doing an RDBMS import, I would have suggested capturing as many fields as possible using JOINs and minimizing the number of sub-entities, since each sub-entity opens a separate connection to the DB. Since you are importing from MongoDB, this doesn't apply to you. You can experiment by creating a new MongoDB collection with all the data you need for Solr, keeping a single entity in your data importer, and seeing whether that improves import speed.
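For readers who are importing from an RDBMS, the JOIN advice could look roughly like the data-config.xml below. The database URL, credentials, table names (item, category) and column names are all hypothetical and exist only to show the shape: one entity whose query joins the related table, instead of a parent entity with a nested sub-entity.

```xml
<!-- data-config.xml: single entity with a JOIN (all names are hypothetical) -->
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/shop"
              user="solr"
              password="secret"/>
  <document>
    <!-- one query fetches item and category fields together,
         so no sub-entity is needed for the category lookup -->
    <entity name="item"
            query="SELECT i.id, i.name, c.name AS category
                   FROM item i
                   LEFT JOIN category c ON c.id = i.category_id">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <field column="category" name="category"/>
    </entity>
  </document>
</dataConfig>
```

The same flattening idea is what the pre-built MongoDB collection achieves: the importer reads one already-joined source instead of resolving related data per document.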
