Elasticsearch - Any way to find out all the documents with field value as text

In my Elasticsearch cluster, I accidentally indexed some text into a field that should be a number. Later, I fixed that and started pushing numeric values. Now I want to replace all the old text values with a number, and for that I need to find every document that still has text in this field.

Is there an Elasticsearch query that I can use to get this information?

I think this is possible using nested aggregations.

At the top level, use a terms aggregation to list the distinct values of the field; at the sub-level, use a top_hits aggregation to get the documents that contain each value.

For instance:

GET example_index/_search
{
  "size": 0,
  "aggs": {
    "NAME": {
      "terms": {
        "field": "example_field.keyword",
        "size": 10
      },
      "aggs": {
        "documents": {
          "top_hits": {
            "size": 10
          }
        }
      }
    }
  }
}

This query returns the distinct values of the field, with the related documents under each value, something like:

{
  "aggregations": {
    "NAME": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "mistake",
          "doc_count": 2,
          "documents": {
            "hits": {
              "total": 2,
              "max_score": 1,
              "hits": [
                {
                  "_index": "example_index",
                  "_type": "example_index",
                  "_id": "2QoDoXEBOCkJkkpwq5P0",
                  "_score": 1,
                  "_source": {
                    "example_field": "mistake"
                  }
                },
                {
                  "_index": "example_index",
                  "_type": "example_index",
                  "_id": "qAoDoXEBOCkJkkpwq5T0",
                  "_score": 1,
                  "_source": {
                    "example_field": "mistake"
                  }
                }
              ]
            }
          }
        },
        {
          "key": "520",
          "doc_count": 1,
          "documents": {
            "hits": {
              "total": 1,
              "max_score": 1,
              "hits": [
                {
                  "_index": "example_index",
                  "_type": "example_index",
                  "_id": "5goDoXEBOCkJkkpwq5P0",
                  "_score": 1,
                  "_source": {
                    "example_field": "520"
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}

In the example above, the documents that need to be deleted are the ones with the value mistake; you can simply delete them by id.
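When there are more than a handful of ids, deleting them one request at a time gets tedious. As a rough sketch, the ids could be turned into a newline-delimited body for the Elasticsearch _bulk endpoint. The function name build_bulk_delete_body is hypothetical, and the index name and ids are taken from the sample response above; sending the body (for example with POST _bulk and Content-Type application/x-ndjson) is left to whatever client you already use.

```python
import json
from typing import Iterable


def build_bulk_delete_body(index: str, doc_ids: Iterable[str]) -> str:
    """Build an NDJSON body with one bulk "delete" action per document id.

    Hypothetical helper: it only formats the request body and does not
    talk to Elasticsearch.
    """
    lines = [
        json.dumps({"delete": {"_index": index, "_id": doc_id}})
        for doc_id in doc_ids
    ]
    if not lines:
        return ""
    # The bulk API expects every action line, including the last one,
    # to be terminated by a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_delete_body(
        "example_index",
        ["2QoDoXEBOCkJkkpwq5P0", "qAoDoXEBOCkJkkpwq5T0"],
    )
    print(body, end="")
```

It is worth reviewing the generated body before sending it, since a delete cannot be undone.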

NOTE: if you have a big index, it is better to write a function in your code that builds the aggregation, gets the response, skips the values that can be parsed as a number, and then deletes the remaining documents by id.
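The filtering step of that function might look something like the sketch below. It assumes the response has already been fetched and parsed into a dict shaped like the sample above, with the aggregation named NAME and the sub-aggregation named documents. The helper names is_number and ids_with_text_values are made up for illustration.

```python
import math
from typing import Any, Dict, List


def is_number(value: Any) -> bool:
    """Return True if the bucket key parses as a finite number.

    Assumption: Python's float() stands in for "can be parsed to a number".
    It is more lenient than Elasticsearch in some cases (for example it
    accepts "1_000"), so adjust the check to your data if needed.
    """
    try:
        number = float(str(value))
    except ValueError:
        return False
    # Treat "nan" and "inf" as text rather than as valid numbers.
    return math.isfinite(number)


def ids_with_text_values(response: Dict[str, Any], agg_name: str = "NAME") -> List[str]:
    """Collect the ids of documents whose field value is not numeric."""
    ids: List[str] = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        if is_number(bucket["key"]):
            continue  # numeric value, leave these documents alone
        for hit in bucket["documents"]["hits"]["hits"]:
            ids.append(hit["_id"])
    return ids
```

Note that the example query caps both the terms aggregation and top_hits at a size of 10, so a single pass only sees up to 10 distinct values and up to 10 documents per value. On a large index you would need to raise those sizes or repeat the query and cleanup until no text values come back.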
