Aggregation on Elasticsearch considering the score

I have a document with these fields:

  • a full-text field called 'content'
  • employees (one or more)
  • ...

I ran the query "Michael Seam vacation", and Elasticsearch returned about a thousand results.

The results of the query are fine. First I get the vacations of Michael Seam, and then the vacation results for the other employees.

The results contain documents with the term "vacation" for dozens of employees, such as:

  • Michael Seam Porter (1 hit)
  • Michael Seam Carl (3 hits)
  • Lucas (30 hits)
  • Maria Fuch (27 hits)
  • Jose White (15 hits)
  • ...

When I add an aggregation on the employee field, I get Lucas, Maria and others before Michael Seam Porter and Michael Seam Carl, and sometimes those two do not even appear because of the aggregation size.
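To make the cause concrete: a terms aggregation orders its buckets by doc_count (descending) by default and then cuts the list at `size`. Below is a minimal Python sketch that mimics this locally, using the hit counts listed above. The function name `default_terms_buckets` and the `size` of 3 are made up for illustration; this is a simulation, not Elasticsearch code.

```python
# Hit counts per employee, taken from the list above.
hits_per_employee = {
    "Michael Seam Porter": 1,
    "Michael Seam Carl": 3,
    "Lucas": 30,
    "Maria Fuch": 27,
    "Jose White": 15,
}


def default_terms_buckets(doc_counts, size):
    """Mimic the default terms ordering: largest doc_count first, cut at size."""
    ordered = sorted(doc_counts.items(), key=lambda kv: kv[1], reverse=True)
    return ordered[:size]


# Hypothetical size, chosen only to show the cutoff effect.
buckets = default_terms_buckets(hits_per_employee, size=3)
print(buckets)
```

With a small size, both Michael Seam entries fall off the end of the list, even though their documents are the most relevant ones for the query.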

How can I show (in the aggregation) only the employees that are contained in the query? Is it possible?

PS: I'm working with ES 1.7.5.

I found one way to do it.

"aggregatePerEmployee" : {
    "terms" : {
        "field" : "employee.raw",
        "order": {
            "top_hit": "desc"
        },
        "size" : 4
    },
    "aggs": {
        "top_hit" : {
            "max": {
                "script": "_score"
            }
        }
    }
}

With this, the aggregation order considers the top score of each employee.
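The ordering rule can be sketched in plain Python: group the hits per employee, keep the maximum score in each group, sort the groups by that maximum, and cut at `size`. The scores in `hits` are hypothetical values invented for this illustration (they are not real Elasticsearch output), and `buckets_by_top_score` is a made-up helper name.

```python
# Hypothetical (employee, _score) pairs, for illustration only.
hits = [
    ("Lucas", 1.2), ("Lucas", 2.0), ("Lucas", 0.9),
    ("Michael Seam Carl", 2.1), ("Michael Seam Carl", 1.7),
    ("Michael Seam Porter", 2.04),
    ("Jose White", 1.6), ("Jose White", 0.8),
    ("Maria Fuch", 1.1), ("Maria Fuch", 1.0),
]


def buckets_by_top_score(scored_hits, size):
    """Group hits per employee, keep the max score, order by it descending."""
    top = {}
    counts = {}
    for employee, score in scored_hits:
        top[employee] = max(top.get(employee, score), score)
        counts[employee] = counts.get(employee, 0) + 1
    ordered = sorted(top, key=lambda name: top[name], reverse=True)
    return [
        {"key": name, "doc_count": counts[name], "top_hit": top[name]}
        for name in ordered[:size]
    ]


buckets = buckets_by_top_score(hits, size=4)
for bucket in buckets:
    print(bucket)
```

Note that doc_count no longer influences the order: an employee with a single high-scoring document is listed before one with many lower-scoring documents.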

See the results:

"aggregatePerEmployee": {
  "doc_count_error_upper_bound": -1,
  "sum_other_doc_count": 1145,
  "buckets": [
    {
      "key": "Michael Seam Carl",
      "doc_count": 3,
      "top_hit": {
        "value": 2.097010612487793
      }
    },
    {
      "key": "Michael Seam Porter ",
      "doc_count": 1,
      "top_hit": {
        "value": 2.0433993339538574
      }
    },
    {
      "key": "Lucas",
      "doc_count": 30,
      "top_hit": {
        "value": 2.0033993339538574
      }
    },
    {
      "key": "Jose White ",
      "doc_count": 15,
      "top_hit": {
        "value": 1.5995635986328125
      }
    }
  ]
}
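For anyone consuming this from client code, here is a small Python sketch that reads the buckets in the order returned. The JSON is copied from the results above and trimmed to two buckets; `employees_by_score` is a hypothetical helper, and the `strip()` call is an assumption on my part to deal with the trailing spaces visible in some keys.

```python
import json

# Fragment copied from the results above, trimmed to two buckets.
response_fragment = json.loads("""
{
  "aggregatePerEmployee": {
    "doc_count_error_upper_bound": -1,
    "sum_other_doc_count": 1145,
    "buckets": [
      {"key": "Michael Seam Carl", "doc_count": 3,
       "top_hit": {"value": 2.097010612487793}},
      {"key": "Michael Seam Porter ", "doc_count": 1,
       "top_hit": {"value": 2.0433993339538574}}
    ]
  }
}
""")


def employees_by_score(aggregations, name="aggregatePerEmployee"):
    """Return (employee, doc_count, top score) tuples in bucket order."""
    return [
        (bucket["key"].strip(), bucket["doc_count"], bucket["top_hit"]["value"])
        for bucket in aggregations[name]["buckets"]
    ]


rows = employees_by_score(response_fragment)
for row in rows:
    print(row)
```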

PS: to enable this Groovy script, you need to edit elasticsearch.yml and add this line:

script.engine.groovy.inline.aggs: on

After that, restart your Elasticsearch node.
