Elasticsearch Rails Bulk indexing

I have around 1.5M records in a Postgres database that I need to reindex. I used ActiveRecord's find_each method in one Sidekiq worker to pass those objects to another worker that reindexes each object individually.

worker1

# Perform in batch of 200 in 2 minutes.
type.find_each(batch_size: 200) do |object|
    Elasticsearch::Worker2.perform_in(2.minutes, :index, type, object.id, "new_index_name")
end

worker2

def index_object(object, index_name)
  object.__elasticsearch__.index_document(index: index_name)
end

But I ran into the following issue:

[429] {"code":429,"message":"Concurrent request limit exceeded. Please consider batching your requests, or contact support@bonsai.io for help."}

Does anyone know how to do batch requests using elasticsearch-rails?

According to the Bonsai FAQ:

We limit the number of concurrent requests. In practice, the actual requests per second this allows is based on the speed of the requests you are executing. Request limits vary at different plan levels. We are still making changes and measuring real-world limits to determine sensible plan defaults. Rate-limited requests will fail with a HTTP 429 error indicating that you contact us so that we can work with you to accommodate your usage. (Bonsai FAQ)

So you can either raise your limit (by paying for a higher plan, I would guess) or batch your requests so that you stay under the concurrent request limit of your plan. Elasticsearch provides a bulk API, which would be a good alternative for you: it indexes many documents in a single request instead of one request per document. Since you are already using the elasticsearch-rails gem, you can take advantage of its integration with the client. This article (bulk_index) has a good example that I have used in the past to index records with elasticsearch-rails.
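As a rough illustration of the batching idea, here is a minimal sketch rather than a drop-in solution. The BulkIndexer module, its method names, and the serializer argument are hypothetical names made up for this example. It assumes only that client responds to bulk(index:, body:), as the Ruby Elasticsearch client does, and that each record responds to id.

```ruby
# Sketch only: BulkIndexer and its argument names are hypothetical.
# `client` must respond to #bulk(index:, body:), as the Ruby Elasticsearch
# client does; `serializer` is any callable that turns a record into a Hash.
module BulkIndexer
  module_function

  # Builds one "index" action per record, in the array-of-hashes form
  # accepted by the Ruby client's bulk method.
  def actions_for(records, serializer)
    records.map do |record|
      { index: { _id: record.id, data: serializer.call(record) } }
    end
  end

  # Sends the records in slices of batch_size, one bulk request per slice,
  # and returns the number of requests made.
  def index_in_batches(records, client:, index_name:, serializer:, batch_size: 500)
    requests = 0
    records.each_slice(batch_size) do |slice|
      client.bulk(index: index_name, body: actions_for(slice, serializer))
      requests += 1
    end
    requests
  end
end
```

With the worker from the question, the inputs would presumably be type.__elasticsearch__.client for client, "new_index_name" for index_name, and ->(r) { r.__elasticsearch__.as_indexed_json } for serializer, with ActiveRecord's find_in_batches supplying each slice so that 1.5M rows are never loaded at once. At 500 records per request that is about 3,000 requests instead of 1.5M. As far as I recall, elasticsearch-model also ships an import class method that takes index: and batch_size: options and does much the same thing, so check that before writing your own.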
