I understand reindexing using an alias to avoid downtime, as described in "Is there a smarter way to reindex elasticsearch?"
But one problem remains: Say the reindexing takes an hour, while the original DB keeps changing. I would need any updates to go to both indexes.
Is there any way to do that?
If not, I would prefer if updates went to the new index, while queries were still served from the old index. But at least in Tire, I haven't seen a way to use different indices for reading and writing. Can that be done?
Elasticsearch can't update two indices in a single request. You can handle that on your side by sending two index requests to Elasticsearch.
That said, you can probably use an alias here, although I'm pretty sure you can search across more than one index using Tire (but I don't know Tire):
You have an old index1
Push all your content to index2
Add an alias index on top of index1, index2
When indexing is finished, remove index1
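As a rough illustration of those steps, here is a minimal sketch that builds request bodies for Elasticsearch's `POST /_aliases` endpoint, which applies all actions in one request atomically. The helper name `alias_actions` is hypothetical, and the alias and index names are the ones from the steps above. The sketch only builds and prints the JSON; sending it to a cluster is left out.

```ruby
require 'json'

# Hypothetical helper: builds the body for POST /_aliases.
# Removals are listed before additions; the whole request is applied atomically.
def alias_actions(alias_name, add: [], remove: [])
  actions = remove.map { |idx| { remove: { index: idx, alias: alias_name } } } +
            add.map { |idx| { add: { index: idx, alias: alias_name } } }
  { actions: actions }
end

# While index2 is being filled, the alias covers both indices.
during_rebuild = alias_actions('index', add: %w[index1 index2])

# Once index2 is complete, drop index1 from the alias.
after_rebuild = alias_actions('index', remove: %w[index1])

puts JSON.pretty_generate(during_rebuild)
puts JSON.pretty_generate(after_rebuild)
```

Note that while the alias covers both indices, a search through it can return the same document once from each index.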
To allow for zero-downtime index changes, even while the search system is being updated with new user-generated content, you can use the following strategy:
Define aliases for both read and write actions that will point to an ES index. When a Model is updated, look up the model_write alias and use it to write to all tracked indices, which will include both the currently active ones and any that are being built in the background.
require 'timeout'

class User < ActiveRecord::Base
  # Index (or remove) one user in every index behind the "users_write" alias.
  def self.index_for_search(user_id)
    Timeout::timeout(5) do
      user = User.find_by_id(user_id)
      write_alias = Tire::Alias.find("users_write")
      if write_alias
        write_alias.indices.each do |index_name|
          index = Tire::Index.new(index_name)
          if user
            index.store user
          else
            # The record no longer exists, so drop it from the index.
            index.remove 'user', user_id
          end
        end
      else
        raise "Cannot index without existence of 'users_write' alias."
      end
    end
  end
end
Now, when you want to do a full index rebuild (or initial index creation), add a new index, add it to the alias, and start building it knowing that any active users will be adding their data to both indices simultaneously. Continue to read from the old index until the new one is built, then switch the read alias.
class SearchHelper
  def self.set_alias_to_index(alias_name, index_name, clear_aliases = true)
    tire_alias = Tire::Alias.find(alias_name)
    if tire_alias
      tire_alias.indices.clear if clear_aliases
      tire_alias.indices.add index_name
    else
      tire_alias = Tire::Alias.new(:name => alias_name)
      tire_alias.index index_name
    end
    tire_alias.save
  end
end
def self.reindex_users_index(options = {})
  finished = false
  read_alias_name = "users"
  write_alias_name = "users_write"
  new_index_name = "#{read_alias_name}_#{Time.now.to_i}"

  # Make new index for re-indexing.
  index = Tire::Index.new(new_index_name)
  index.create :settings => analyzer_configuration,
               :mappings => { :user => user_mapping }
  index.refresh

  # Add the new index to the write alias so that any system changes while we're re-indexing will be reflected.
  SearchHelper.set_alias_to_index(write_alias_name, new_index_name, false)

  # Reindex all users.
  User.find_in_batches do |batch|
    index.import batch.map { |m| m.to_elasticsearch_json }
  end
  index.refresh
  finished = true

  # Update the read and write aliases to only point at the newly re-indexed data.
  SearchHelper.set_alias_to_index read_alias_name, new_index_name
  SearchHelper.set_alias_to_index write_alias_name, new_index_name
ensure
  # Clean up a half-built index. `defined?(index)` is always truthy for a local
  # variable, so check for nil instead.
  index.delete if index && !finished
end
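To see the overall flow without a running cluster, here is a minimal in-memory sketch of the same idea. It does not use Tire or Elasticsearch: `FakeCluster`, its methods, and the `database` hash are hypothetical stand-ins, and the index names `users_1` and `users_2` are made up for the example. It only illustrates that an update arriving mid-rebuild lands in both indices while reads keep hitting the old one.

```ruby
# Hypothetical in-memory stand-in for a cluster:
#   indices: index name => { document id => document }
#   aliases: alias name => [index names]
class FakeCluster
  attr_reader :indices, :aliases

  def initialize
    @indices = Hash.new { |hash, name| hash[name] = {} }
    @aliases = Hash.new { |hash, name| hash[name] = [] }
  end

  # Store the document (or delete it when doc is nil) in every index
  # behind the write alias.
  def write(write_alias, id, doc)
    targets = @aliases[write_alias]
    raise "Cannot index without '#{write_alias}' alias." if targets.empty?

    targets.each do |name|
      if doc
        @indices[name][id] = doc
      else
        @indices[name].delete(id)
      end
    end
  end

  # Read from the first index behind the read alias.
  def read(read_alias, id)
    @indices[@aliases[read_alias].first][id]
  end
end

database = { 1 => { name: 'Ann' } }
cluster = FakeCluster.new
cluster.aliases['users'] = ['users_1']
cluster.aliases['users_write'] = ['users_1']
database.each { |id, doc| cluster.write('users_write', id, doc) }

# Start a rebuild: the new index joins the write alias only.
cluster.aliases['users_write'] << 'users_2'

# A live update arrives mid-rebuild and is written to both indices.
database[2] = { name: 'Bob' }
cluster.write('users_write', 2, database[2])

# The bulk import fills the new index from the database.
database.each { |id, doc| cluster.indices['users_2'][id] = doc }

# Switch both aliases to the new index only.
cluster.aliases['users'] = ['users_2']
cluster.aliases['users_write'] = ['users_2']

puts cluster.indices.inspect
```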
A post describing this strategy can be found here: http://www.mavengineering.com/blog/2014/02/12/seamless-elasticsearch-reindexing/