Ruby on Rails - Tire - elasticsearch, how to sort data being imported?

We are still using (re)Tire for elasticsearch in our production system.
My question is how to run the import task for an index while feeding it sorted data from Mongoid. For example, I want to index 90 million records while the system stays accessible to users, and I would like the most recently inserted records to be indexed first (data ordered by created_at), so that users can find the newest and most relevant ones right away.
Is there any way to achieve this using:
rake environment tire:import CLASS=Document FORCE=true

I would look at importing them manually. Write your own script; there is an example in the readme, something like:

User.order('created_at DESC').find_in_batches do |batch|
  Tire.index("users").import batch
end

Or use their import tool:

User.index.import User.order('created_at DESC')
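
One caveat with the first snippet: in ActiveRecord, find_in_batches walks the table in primary-key order and ignores a custom order, and the question is about Mongoid anyway. A minimal sketch of a batching helper that keeps whatever order the records arrive in could look like the following. The name import_in_order and the batch size are made up for illustration; the helper only assumes that the records are enumerable and that the index object responds to import, the way Tire.index(...) is used above.

```ruby
# Hypothetical helper (not part of Tire): pushes records to an index in
# fixed-size batches, preserving the order of the incoming enumerable.
def import_in_order(records, index, batch_size: 1000)
  total = 0
  records.each_slice(batch_size) do |batch|
    index.import(batch) # same call as Tire.index("users").import batch
    total += batch.size
  end
  total # number of records handed to the index
end

# With Mongoid and Tire loaded, a call might look like this (assumption,
# using the Document class and created_at field from the question):
#   import_in_order(Document.desc(:created_at), Tire.index('documents'))
```

Sorting 90 million documents by created_at will probably need an index on that field in MongoDB, so it is worth checking that before starting the run.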

If your site is widely used, I would be tempted to spin up a separate elasticsearch index or instance, import everything into it, and then switch your app over to the new one.

Hope that helps.

Edit:

If you are using Rails, make sure you load your Rails environment in the script (config/environment). If the import is going to take a long time, you might want to run the script with 'nohup ruby my_script.rb > script_out.log 2>&1 &' so it isn't bound to your ssh session.
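
A rough skeleton for such a script, as a sketch only: the with_timing helper is hypothetical, and the existence check around the environment file is there just so the sketch also runs outside a Rails app (a real script would require it unconditionally).

```ruby
# my_script.rb -- file name taken from the nohup command above.
# Run it from the Rails root so the relative path below resolves.

# Flush every line immediately so script_out.log stays current under nohup.
$stdout.sync = true

# Load the Rails environment (models, Mongoid config, Tire) when present.
env_file = File.expand_path('config/environment.rb', Dir.pwd)
require env_file if File.exist?(env_file)

# Hypothetical helper: runs a block and logs start, end and duration.
def with_timing(label, out: $stdout)
  started = Time.now
  out.puts "#{started} start #{label}"
  result = yield
  out.puts "#{Time.now} done #{label} (#{(Time.now - started).round(1)}s)"
  result
end

# The import itself would go inside the block, for example:
#   with_timing('Document import') { ... }
```

Since the output is redirected to a file, 'tail -f script_out.log' is a simple way to follow the run from another session.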
