
Solr “real time” indexing

I know there are several questions similar to this one, but they don't provide a simple answer to the problem at hand. Sorry if you feel this is a duplicate, but I think a clear and understandable answer would benefit many. So, to the question.

Can Solr indexing updates be automated? And if they can, what would be the optimal way to do it?

Here is a simple use case to clarify the question: I have a database table with several columns holding different kinds of data. There is a web app which is used to manage the data. I've got a separate Solr server that indexes specified columns of that table. How can I make Solr notice the change and update the index whenever a user adds, removes or modifies data in the table?

It would be necessary for it to be "real time", meaning that the changes should show up in the index within a few seconds. Of course, with a large amount of data it can take longer.

Thanks in advance

There are two questions here:

Can Solr indexing updates be automated?

Yes, they can, and they should always be automated. You don't want to manually launch the indexing process for every change.
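One common way to automate this is to have the web app itself push each change to Solr right after it writes to the database. The following is only a minimal sketch of that idea, not the one true setup: the URL, the core name `mycore`, and the helper names are made up for illustration, and it assumes a Solr version whose `/update` handler accepts JSON and the `commitWithin` request parameter (older releases exposed JSON updates under a different path).

```python
import json
import urllib.request

# Hypothetical Solr location and core name; adjust to your installation.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update"


def build_update_request(payload, commit_within_ms=5000):
    """Build (but do not send) a POST request for Solr's JSON update handler.

    commitWithin asks Solr to make the change searchable within the given
    number of milliseconds, so the app never issues explicit commits.
    """
    url = f"{SOLR_UPDATE_URL}?commitWithin={commit_within_ms}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def upsert_request(row, commit_within_ms=5000):
    """Request that adds a row, or overwrites the document with the same id."""
    return build_update_request([row], commit_within_ms)


def delete_request(row_id, commit_within_ms=5000):
    """Request that removes the document for a deleted row."""
    return build_update_request({"delete": {"id": str(row_id)}}, commit_within_ms)


def send(request):
    """Send a prepared request to Solr and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

The web app would call something like `send(upsert_request({"id": "42", "title": "New title"}))` after a successful insert or update, and `send(delete_request(42))` after a delete. Only the columns you want searchable need to go into the document.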

It would be necessary for it to be "real time".

I already mentioned some ways to reduce the latency between a data change and the corresponding index update in this answer. You could use autoCommit to make sure that your data is committed within x seconds of the update. Depending on the interval, you'd want to reduce autowarming and adjust other settings; see this for more details.
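The autoCommit interval normally lives in the `<autoCommit>` element of `solrconfig.xml`. On newer Solr versions that ship the Config API, it can also be changed over HTTP. Here is a rough sketch of the latter, assuming such a version; the base URL, the collection name, and the function name are placeholders I made up, and the 5000 ms value is only an example.

```python
import json
import urllib.request


def set_property_request(base_url, collection, name, value):
    """Build (but do not send) a Config API request that sets one property."""
    body = {"set-property": {name: value}}
    return urllib.request.Request(
        f"{base_url}/solr/{collection}/config",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example: ask Solr to auto-commit pending updates at most 5 seconds after
# they arrive. "mycore" and the URL are hypothetical.
request = set_property_request(
    "http://localhost:8983", "mycore", "updateHandler.autoCommit.maxTime", 5000
)
# To apply it against a running Solr: urllib.request.urlopen(request, timeout=10)
```

A shorter interval makes changes visible sooner but leaves less time for cache warming between commits, which is why the autowarming settings are worth revisiting at the same time.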

Also keep an eye on the NRT wiki page for related information and solutions.

You may want to take a look at Apache Solr 3.3 with RankingAlgorithm 1.2. It supports NRT (near-real-time) indexing and can update 10,000 docs/sec. You can search concurrently during the updates, and you do not need to commit or close the searchers. More information about NRT in Solr 3.3 with RankingAlgorithm is available here:

http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver_3.x
