
Logstash: Migrating from one Elasticsearch to another Elasticsearch results in some additional properties

I have been migrating one of our indexes from a self-hosted Elasticsearch to Amazon Elasticsearch using Logstash. After a successful migration, we found that some additional fields are being added to the documents. How can we prevent them from being added?

Our Logstash config file:

input {
  elasticsearch {
    hosts => ["https://staing-example.com:443"]
    user => "userName"
    password => "password"
    index => "testingindex"
    size => 100
    scroll => "1m"
  }
}

filter {

}

output {
  amazon_es {
    hosts => ["https://example.us-east-1.es.amazonaws.com:443"]
    region => "us-east-1"
    aws_access_key_id => "access_key_id"
    aws_secret_access_key => "access_key_id"
    index => "testingindex"
  }
  stdout {
    codec => rubydebug
  }
}

The document in our self-hosted Elasticsearch:

{
  "_index": "testingindex",
  "_type": "interaction-3",
  "_id": "38b23e7a-eafd-4163-a9f0-e2d9ffd5d2cf",
  "_score": 1,
  "_source": {
    "customerId": [
      "e177c1f8-1fbd-4b2e-82b8-760536e42742"
    ],
    "customProperty": {
      "messageFrom": [
        "BOT"
      ]
    },
    "userId": [
      "e177c1f8-1fbd-4b2e-82b8-760536e42742"
    ],
    "uniqueIdentifier": "2b027fc0-a517-49a7-a71f-8732044cb249",
    "accountId": "724bee3e-38f8-4538-b944-f3e21c518437"
  }
}

The same document in our Amazon Elasticsearch:

{
  "_index": "testingindex",
  "_type": "doc",
  "_id": "B-hP020Bd2lcvg9lTyBH",
  "_score": 1.0,
  "_source": {
    "customerId": [
      "e177c1f8-1fbd-4b2e-82b8-760536e42742"
    ],
    "customProperty": {
      "messageFrom": [
        "BOT"
      ]
    },
    "@version": "1",
    "userId": [
      "e177c1f8-1fbd-4b2e-82b8-760536e42742"
    ],
    "@timestamp": "2019-10-16T06:44:13.154Z",
    "uniqueIdentifier": "2b027fc0-a517-49a7-a71f-8732044cb249",
    "accountId": "724bee3e-38f8-4538-b944-f3e21c518437"
  }
}

@version and @timestamp are the two new fields being added to the documents.

Can anyone explain why they are being added, and is there a way to prevent this? Also, comparing the two documents, the _type and _id have changed as well. We need both _type and _id to be the same as in the documents in our self-hosted Elasticsearch.

The fields @version and @timestamp are generated by Logstash. If you don't want them, you will need to use a mutate filter inside your filter block to remove them.

mutate {
    remove_field => ["@version", "@timestamp"]
}

To keep the _type and _id of your original documents, you will need to change your input and add the option docinfo => true. This puts those fields into the @metadata field, where you can reference them in your output. The documentation has an example for this.

input {
    elasticsearch {
        ...
        docinfo => true
    }
}

output {
    elasticsearch {
        ...
        document_type => "%{[@metadata][_type]}"
        document_id => "%{[@metadata][_id]}"
    }
}
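
Putting both changes together with the config from the question, the whole pipeline might look roughly like the sketch below. It assumes the amazon_es output accepts the same document_type and document_id options as the standard elasticsearch output (it is derived from that plugin, but check the documentation for your plugin version). The "secret_access_key" value is only a placeholder.

```
input {
  elasticsearch {
    hosts => ["https://staing-example.com:443"]
    user => "userName"
    password => "password"
    index => "testingindex"
    size => 100
    scroll => "1m"
    # Copy _index, _type and _id of each source document into [@metadata]
    docinfo => true
  }
}

filter {
  mutate {
    # Drop the fields that Logstash adds to every event
    remove_field => ["@version", "@timestamp"]
  }
}

output {
  amazon_es {
    hosts => ["https://example.us-east-1.es.amazonaws.com:443"]
    region => "us-east-1"
    aws_access_key_id => "access_key_id"
    aws_secret_access_key => "secret_access_key"
    index => "testingindex"
    # Assumption: these options behave as in the elasticsearch output
    document_type => "%{[@metadata][_type]}"
    document_id => "%{[@metadata][_id]}"
  }
}
```

Fields under @metadata are not sent to the output, so the original _type and _id do not end up inside _source. Removing @timestamp is safe here because the index name is static. A date-based index pattern such as "testingindex-%{+YYYY.MM.dd}" would need that field.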

Note that if your Amazon Elasticsearch is version 6.X or higher, you can only have one document type per index, and version 7.X is typeless. Also, Logstash version 7.X no longer has the document_type option.
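
If the target cluster is typeless, a minimal sketch of the output would keep only the original _id and leave the type to the cluster. As above, this assumes amazon_es supports document_id and that docinfo => true is set on the input. The credential values are placeholders.

```
output {
  amazon_es {
    hosts => ["https://example.us-east-1.es.amazonaws.com:443"]
    region => "us-east-1"
    aws_access_key_id => "access_key_id"
    aws_secret_access_key => "secret_access_key"
    index => "testingindex"
    # Reuse the source document's _id; no document_type is set
    document_id => "%{[@metadata][_id]}"
  }
}
```

With this variant the migrated documents keep their IDs, but the original _type value (for example "interaction-3") is not preserved.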
