Reindexing documents in Elasticsearch with proper updates

I need to reindex all my documents into a new index with updated mappings and different index settings, such as the number of shards.

The events are published to a Kafka topic and then consumed by a service that pushes each event to Elasticsearch. So I don't want to stop consuming events while reindexing.

To achieve this, I keep primaryIndex (the name of the old index) and secondaryIndex (the name of the new index) in the application.properties of a Spring app. While indexing a document, the application writes the event to both indices (primary and secondary) and reads from the primary index only. Then I will run the _reindex API to copy documents from the old index to the new index. As reindexing will take about 4-5 days, a newer event already written to the new index may get overwritten by the _reindex API with the older copy from the old index, which I want to avoid.

How can I ensure that my documents are not overwritten by the _reindex API?

Once reindexing is done, I can remove secondaryIndex from my application properties and replace primaryIndex with the new index name, so that reads are also served from the new index.

Or is there a better approach to achieve the same?

You can instruct the _reindex API to copy a document to the new index only when it is not already present there. If a document is already present in the new index, it came from either a new event or an update event, which you don't want overwritten.
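To make the reasoning concrete, here is a small in-memory sketch of the "create only if absent" behaviour. It is not Elasticsearch code: the indices are plain dicts, and `reindex_create_only` and the sample documents are hypothetical names invented for this illustration.

```python
import copy


def reindex_create_only(source, dest):
    """Copy documents from source to dest, skipping IDs dest already has.

    Returns the skipped IDs, which stand in for the conflicts that a
    create-only reindex would report.
    """
    skipped = []
    for doc_id, doc in source.items():
        if doc_id in dest:
            skipped.append(doc_id)  # the newer copy in dest is left alone
        else:
            dest[doc_id] = doc
    return skipped


# Old index holds two documents; the new index starts empty.
old_index = {"1": {"status": "created"}, "2": {"status": "created"}}
new_index = {}

# The reindex works from the state of the old index when it started.
snapshot = copy.deepcopy(old_index)

# While the reindex is running, a live update for document "2" arrives
# and the application writes it to both indices.
for index in (old_index, new_index):
    index["2"] = {"status": "shipped"}

# The create-only copy fills in document "1" but does not touch "2".
skipped = reindex_create_only(snapshot, new_index)
print(new_index)
print(skipped)
```

Document "2" keeps the value written by the live event, while document "1", which received no event during the run, is filled in from the old index.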

You can set op_type: 'create' in the dest section of the _reindex request. For more info, see https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
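As a rough sketch, such a request could be built like this using only the Python standard library. The host `http://localhost:9200`, the index names `old-index` and `new-index`, and the helper `build_reindex_request` are placeholders, not values from the question. Two settings go beyond the advice above and are assumptions about what you would want here: `"conflicts": "proceed"`, so the run continues past documents that already exist instead of aborting on the first version conflict, and `wait_for_completion=false`, so a multi-day run executes as a background task.

```python
import json
import urllib.request


def build_reindex_request(es_url, source_index, dest_index):
    """Build (but do not send) a create-only _reindex request."""
    body = {
        # Assumption: keep going when a document already exists in dest.
        "conflicts": "proceed",
        "source": {"index": source_index},
        # op_type "create" only adds documents that are missing in dest.
        "dest": {"index": dest_index, "op_type": "create"},
    }
    return urllib.request.Request(
        url=f"{es_url}/_reindex?wait_for_completion=false",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and index names; replace with your own.
request = build_reindex_request("http://localhost:9200", "old-index", "new-index")
print(request.full_url)
print(request.data.decode("utf-8"))

# To actually start the reindex against a running cluster:
# urllib.request.urlopen(request)
```

With `conflicts` set to `proceed`, documents that already exist in the new index are counted as version conflicts in the response rather than stopping the reindex.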

Hope this answers your question :)
