Aggregate and sort by a text field, and concatenate other text fields in Elasticsearch

Question

In Elasticsearch, how does one aggregate and sort by a text field and concatenate the values of other text fields, joined by e.g. `; `?

By concatenating I mean joining the values of the same field across all the aggregated documents, not the values of different fields within the same document.

Details

I have small documents with the fields gene, tag, and annotation, described as

{
  "mappings": {
    "annotations": {
      "properties": {
        "species": {
          "type": "text"
        },
        "gene": {
          "type": "text",
          "fields": {
            "keyword": { 
              "type": "keyword"
            }  
          }
        },
        "tag": {
          "type": "text"
        },
        "annotation": {
          "type": "text"
        }
      }
    }
  }
}

There are many entries per gene. That is, I have:

Gene  Tag   Annotation
----- ----- ---------------
A1BG  tag1  first gene
A2M   tag1  a-macroglobulin
A2M   tag2  second gene
BRCA1 tag1  breast cancer 1
BRCA1 tag3  important gene

I want to query these data, aggregate and sort by gene, and get something like this:

Gene   Tags        Annotations
------ ----------- -------------------------------
A1BG   tag1        first gene
A2M    tag1; tag2  a-macroglobulin; second gene
BRCA1  tag1; tag3  breast cancer 1; important gene

I could not find anything meaningful after googling for more than a day. Elasticsearch examples mostly show statistics, e.g. counts, and a few show how to concatenate fields from the same document, but I could not find a way to concatenate the values of the same field across documents. I tried to use map as well as something like this:

{
    "aggs" : {
        "genes_agg" : {
            "terms" : {
                "script" : {
                    "source": "doc['tag'].join('; ')",
                    "lang": "painless"
                }
            }
        }
    }
}

but nothing works.

Answer 1

I think you can't find anything because you're approaching this from a relational database perspective. Elasticsearch is built like a document store, so you would basically put all the tags, annotations, etc. for BRCA1 in one document. I think you need to rethink your indexing strategy, not your querying strategy.
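To make the "one document per gene" idea concrete, here is a minimal Python sketch that collapses the flat rows from the question into per-gene documents before indexing. The document layout (`tags` and `annotations` as lists) and the helper names `group_by_gene` and `as_table_row` are assumptions for illustration, not something taken from the question or from Elasticsearch itself.

```python
# Flat rows as in the question: one (gene, tag, annotation) entry per row.
rows = [
    ("A1BG", "tag1", "first gene"),
    ("A2M", "tag1", "a-macroglobulin"),
    ("A2M", "tag2", "second gene"),
    ("BRCA1", "tag1", "breast cancer 1"),
    ("BRCA1", "tag3", "important gene"),
]


def group_by_gene(rows):
    """Collapse flat rows into one document per gene (hypothetical layout)."""
    docs = {}
    for gene, tag, annotation in rows:
        # Create the per-gene document on first sight, then append to its lists.
        doc = docs.setdefault(gene, {"gene": gene, "tags": [], "annotations": []})
        doc["tags"].append(tag)
        doc["annotations"].append(annotation)
    # Sort by gene so the output order matches the desired table.
    return [docs[gene] for gene in sorted(docs)]


def as_table_row(doc, sep="; "):
    """Join the per-gene lists into the concatenated strings asked for."""
    return (doc["gene"], sep.join(doc["tags"]), sep.join(doc["annotations"]))


gene_docs = group_by_gene(rows)
for doc in gene_docs:
    print(as_table_row(doc))
```

Each dictionary in `gene_docs` could then be indexed as a single document, since Elasticsearch fields accept arrays of values. Sorting by gene at query time would go through the `gene.keyword` sub-field from the mapping, and the `; ` join would happen on the client when rendering results.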

0 2017-12-26 09:33:08
