
How to reindex ElasticSearch quickly?

I have an ElasticSearch index with around 200M documents and a total index size of 90 GB.

I changed the mapping, so I would like ElasticSearch to re-index all the documents.

I wrote a script that creates a new index (with the new mapping), then goes over all the documents in the old index and puts them into the new one.

It seems to work, but the problem is that it is extremely slow. It started at 300 documents/minute two days ago, and now the speed is 150 documents/minute.

The script runs on a machine within the same network as the ElasticSearch machines.

At this speed the re-index will take a month to finish.

Does anybody know of a faster technique to re-index an ElasticSearch index?

Answered in the Google Groups:

Option A: Use bulk index operations.

Option B: Use the re-index plug-in that runs inside the ES machine: https://github.com/karussell/elasticsearch-reindex
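
To illustrate Option A, here is a minimal sketch of how documents could be grouped into request bodies for the `_bulk` endpoint, so that one HTTP request carries many documents instead of one. The helper name `bulk_payloads`, the index name, and the batch size are assumptions made for illustration, not part of the original answer. Older ElasticSearch versions may also expect a `_type` in the action line or in the URL.

```python
import json
from itertools import islice


def bulk_payloads(docs, index_name, batch_size=1000):
    """Yield newline-delimited JSON bodies for the _bulk endpoint.

    docs: iterable of (doc_id, source_dict) pairs (hypothetical input shape).
    """
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        lines = []
        for doc_id, source in batch:
            # Each document takes two lines: an action line, then its source.
            lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
            lines.append(json.dumps(source))
        # The bulk body must be terminated by a newline.
        yield "\n".join(lines) + "\n"
```

Each yielded string would then be sent as the body of a single POST to `/_bulk`. A suitable batch size depends on document size and cluster capacity, so it is worth measuring a few values.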

The proper way to reindex with Elasticsearch is to use the scan and scroll APIs, which should be supported by Pyes.

It seems like the Pyes library has a reindex method, but I don't have experience with it.

(If you can get over using Ruby instead of Python :), the Tire Ruby client has an Index#reindex method: https://github.com/karmi/tire/blob/master/test/integration/reindex_test.rb . It should be fast enough for your data.)
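
The scan-and-scroll approach mentioned above can be sketched in Python as a simple loop. This is only an outline under stated assumptions: `first_page`, `next_page`, and `send_batch` are hypothetical callables standing in for whatever client is used (they are not Pyes or Tire functions), and each page is assumed to be a dict shaped like a search response, with a `_scroll_id` and a `hits.hits` list.

```python
def reindex(first_page, next_page, send_batch):
    """Copy every hit of a scrolled search into a new index.

    first_page():         hypothetical callable that opens the scroll
    next_page(scroll_id): hypothetical callable that fetches the next page
    send_batch(hits):     hypothetical callable that bulk-indexes one page
    Returns the number of documents handed to send_batch.
    """
    copied = 0
    first = True
    page = first_page()
    while True:
        hits = page["hits"]["hits"]
        if hits:
            send_batch(hits)
            copied += len(hits)
        elif not first:
            # An empty page after the first one means the scroll is exhausted.
            break
        # With a scan-type search the first response may carry only a
        # scroll id and no hits, so an empty first page does not end the loop.
        first = False
        page = next_page(page["_scroll_id"])
    return copied
```

Passing each page of hits straight to a bulk request keeps the number of round trips low, which is the main difference from copying documents one at a time.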
