
Elasticsearch Rails Bulk indexing

I have around 1.5M records in a Postgres database that I need to reindex. I used ActiveRecord's find_each method in one Sidekiq worker to pass those objects to another worker that reindexes each object.

worker1

# Fetch records from the database in batches of 200 and enqueue
# one indexing job per record, each delayed by 2 minutes.
type.find_each(batch_size: 200) do |object|
  Elasticsearch::Worker2.perform_in(2.minutes, :index, type, object.id, "new_index_name")
end

worker2

# Index a single record into the given index.
def index_object(object, index_name)
  object.__elasticsearch__.index_document(index: index_name)
end

But I ran into the following issue:

[429] {"code":429,"message":"Concurrent request limit exceeded. Please consider batching your requests, or contact support@bonsai.io for help."}

Does anyone have an idea how to do batched requests using elasticsearch-rails?

According to the Bonsai FAQ:

We limit the number of concurrent requests. In practice, the actual requests per second this allows is based on the speed of the requests you are executing. Request limits vary at different plan levels. We are still making changes and measuring real-world limits to determine sensible plan defaults. Rate-limited requests will fail with a HTTP 429 error indicating that you contact us so that we can work with you to accommodate your usage. (Bonsai FAQ)

So you can either increase your limits (by paying for a higher plan, I would guess), or batch your requests so that you stay below the limit. Elasticsearch provides a Bulk API for exactly this, and since you are already using the elasticsearch-rails gem, you can take advantage of its integration. This article has a good example that I have used in the past to index records with elasticsearch-rails: bulk_index
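
For illustration, here is a minimal sketch of how your two workers could be restructured around the Bulk API. It assumes the standard elasticsearch-model integration (__elasticsearch__.client, as_indexed_json); Elasticsearch::BulkIndexWorker and its arguments are hypothetical names for this example, not part of your existing code:

# Enqueue one job per batch of 200 ids instead of one job per record.
type.find_in_batches(batch_size: 200) do |batch|
  Elasticsearch::BulkIndexWorker.perform_async(type.name, batch.map(&:id), "new_index_name")
end

# Hypothetical worker that sends a whole batch in a single bulk request.
class Elasticsearch::BulkIndexWorker
  include Sidekiq::Worker

  def perform(type_name, ids, index_name)
    klass = type_name.constantize

    # Build one bulk action per record; the `data` key carries the document body.
    body = klass.where(id: ids).map do |record|
      {
        index: {
          _index: index_name,
          _id: record.id,
          data: record.__elasticsearch__.as_indexed_json
        }
      }
    end

    # A single HTTP request indexes the whole batch.
    klass.__elasticsearch__.client.bulk(body: body)
  end
end

Alternatively, elasticsearch-model ships an Importing module, so something like YourModel.import(index: "new_index_name", batch_size: 200) does the batching and bulk calls for you. Either way, one request per 200 records should keep you well under Bonsai's concurrency limit.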
