
ElasticSearch term aggregation

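I am trying to run a terms aggregation in Elasticsearch on the data below, using the following query, but the output breaks the names up into tokens (see the output below). So I tried mapping os_name as a multi_field, and now I cannot query by it. Is it possible to index the field without tokenizing it, so that a value such as "Fedora Core" is kept whole?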

Query:

GET /temp/example/_search
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name"
       }
     }
  }
}

Data:

...
    {
        "_index": "temp",
        "_type": "example",
        "_id": "3",
        "_score": 1,
        "_source": {
           "title": "system3",
           "os_name": "Fedora Core",
           "os_version": 18
        }
     },
     {
        "_index": "temp",
        "_type": "example",
        "_id": "1",
        "_score": 1,
        "_source": {
           "title": "system1",
           "os_name": "Fedora Core",
           "os_version": 20
        }
     },
     {
        "_index": "temp",
        "_type": "example",
        "_id": "2",
        "_score": 1,
        "_source": {
           "title": "backup",
           "os_name": "Yellow Dog",
           "os_version": 6
        }
     }
...

Output:

       ...
        {
           "key": "core",
           "doc_count": 2
        },
        {
           "key": "fedora",
           "doc_count": 2
        },
        {
           "key": "dog",
           "doc_count": 1
        },
        {
           "key": "yellow",
           "doc_count": 1
        }
       ...

Mapping:

PUT /temp
{
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string"
        },
        "os_version": {
          "type": "long"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}

You should actually change your mapping like this:

"os_name": {
  "type": "string",
  "fields": {
     "raw": {
        "type": "string",
        "index": "not_analyzed"
     }
  }
},

And your aggs should be changed to:

GET /temp/example/_search
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name.raw"
       }
     }
  }
}
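For context, the mapping fragment above could sit in a complete index definition roughly as follows. This is only a sketch: it assumes the index can be dropped and recreated, and that the node listens on localhost:9200 and accepts the pre-5.x "string" mapping syntax used throughout this thread.

```shell
# Assumption: it is acceptable to delete the existing index and all its documents.
curl -XDELETE localhost:9200/temp

# Recreate the index with os_name analyzed as before, plus an untouched
# os_name.raw sub-field that keeps each value as a single term.
curl -XPUT localhost:9200/temp -d '
{
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        "os_version": {
          "type": "long"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}'
```

After recreating the index, the documents have to be indexed again so that os_name.raw is populated. Full-text queries can keep using os_name, while the aggregation targets os_name.raw.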

One workable solution is to set the field to not_analyzed (read more about it in the documentation for the "index" attribute).

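This solution does not analyze the input at all. Depending on your requirements, you may want to set up a custom analyzer instead, for example one that does not split the words but lowercases them, to get case-insensitive results.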

curl -XDELETE localhost:9200/temp
curl -XPUT localhost:9200/temp -d '
{
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string",
          "index" : "not_analyzed"
        },
        "os_version": {
          "type": "long"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}'

curl -XPUT localhost:9200/temp/example/1 -d '
{
    "title": "system3",
    "os_name": "Fedora Core",
    "os_version": 18
}'

curl -XPUT localhost:9200/temp/example/2 -d '
{
    "title": "system1",
    "os_name": "Fedora Core",
    "os_version": 20
}'

curl -XPUT localhost:9200/temp/example/3 -d '
{
    "title": "backup",
    "os_name": "Yellow Dog",
    "os_version": 6
}'

curl -XGET 'localhost:9200/temp/example/_search?pretty=true' -d '
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name"
       }
     }
  }
}'

Output:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "OS" : {
      "buckets" : [ {
        "key" : "Fedora Core",
        "doc_count" : 2
      }, {
        "key" : "Yellow Dog",
        "doc_count" : 1
      } ]
    }
  }
}
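The custom-analyzer alternative mentioned above (keep the whole value as one term, but lowercase it) might look roughly like this. It is a sketch, not part of the original answer: the analyzer name lowercase_keyword is made up for illustration, and the request assumes the same local node and pre-5.x mapping syntax as the rest of the thread.

```shell
# Assumption: the index is recreated from scratch, as in the example above.
curl -XDELETE localhost:9200/temp

# "lowercase_keyword" is a hypothetical name. The analyzer combines the built-in
# keyword tokenizer (emits the whole input as one token) with the lowercase filter.
curl -XPUT localhost:9200/temp -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string",
          "analyzer": "lowercase_keyword"
        },
        "os_version": {
          "type": "long"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}'
```

With this mapping, the same terms aggregation on os_name should group "Fedora Core" and "fedora core" into a single lowercased bucket instead of splitting them into separate words.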
