Aggregate and Sort by a text field and concatenate other text fields in Elasticsearch
In Elasticsearch, how does one aggregate and sort by a text field and concatenate the values of other text fields, joined by e.g. ";"?
By concatenating I mean joining the values of the same field from all the aggregated documents, not the values of different fields from the same document.
Details
I have small documents with the fields gene, tag and annotation, mapped as
{
  "mappings": {
    "annotations": {
      "properties": {
        "species": {
          "type": "text"
        },
        "gene": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        },
        "tag": {
          "type": "text"
        },
        "annotation": {
          "type": "text"
        }
      }
    }
  }
}
There are many entries per gene. That is, I have:
Gene    Tag    Annotation
-----   -----  ---------------
A1BG    tag1   first gene
A2M     tag1   a-macroglobulin
A2M     tag2   second gene
BRCA1   tag1   breast cancer 1
BRCA1   tag3   important gene
I want to query these data, aggregate and sort by gene, and get something like this:
Gene    Tags         Annotations
------  -----------  -------------------------------
A1BG    tag1         first gene
A2M     tag1; tag2   a-macroglobulin; second gene
BRCA1   tag1; tag3   breast cancer 1; important gene
I could not find anything meaningful after googling for more than a day. Elasticsearch examples mostly show statistics, e.g. counts, and a few show how to concatenate fields from the same document, but I could not find a way to concatenate the values of the same field across documents. I tried to use map as well as something like this:
{
  "aggs": {
    "genes_agg": {
      "terms": {
        "script": {
          "source": "doc['tag'].join('; ')",
          "lang": "painless"
        }
      }
    }
  }
}
but nothing works.
I think you can't find anything because you're approaching this from a relational database perspective. Elasticsearch is built like a document store, so you would basically put all the tags, annotations, etc. for BRCA1 in one document. I think you need to rethink your indexing strategy, not your querying strategy.
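As a rough illustration of that indexing strategy, the rows from the question could be merged into one document per gene before they are sent to Elasticsearch. This is only a sketch in plain Python: the helper name group_by_gene and the plural field names tags and annotations are assumptions made for the example, not anything from the question's mapping, and the actual indexing call is left out.

```python
from collections import defaultdict

# Rows as described in the question: one small document per (gene, tag, annotation).
rows = [
    {"gene": "A1BG", "tag": "tag1", "annotation": "first gene"},
    {"gene": "A2M", "tag": "tag1", "annotation": "a-macroglobulin"},
    {"gene": "A2M", "tag": "tag2", "annotation": "second gene"},
    {"gene": "BRCA1", "tag": "tag1", "annotation": "breast cancer 1"},
    {"gene": "BRCA1", "tag": "tag3", "annotation": "important gene"},
]


def group_by_gene(rows):
    """Merge per-annotation rows into one document per gene (hypothetical helper)."""
    grouped = defaultdict(lambda: {"tags": [], "annotations": []})
    for row in rows:
        entry = grouped[row["gene"]]
        entry["tags"].append(row["tag"])
        entry["annotations"].append(row["annotation"])
    # Sort by gene name so the documents come out in the order of the desired table.
    return [
        {"gene": gene, "tags": fields["tags"], "annotations": fields["annotations"]}
        for gene, fields in sorted(grouped.items())
    ]


docs = group_by_gene(rows)

# Print each merged document in the "joined by ;" layout the question asks for.
for doc in docs:
    print(doc["gene"], "; ".join(doc["tags"]), "; ".join(doc["annotations"]), sep="\t")
```

With one document per gene, the concatenation becomes a display concern: a plain search sorted on gene.keyword returns the lists, and the client joins them with "; ".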