I have an Elasticsearch mapping that represents students, with a property holding their marks as an array of objects:
properties: {
name: { type: "text" },
/* ... */
marks: {
properties: {
value: { type: "float" }
}
}
}
Based on this mapping, documents are stored in this form:
"hits" : [{
"_index" : "students",
"_type" : "_doc",
"_id" : "...",
"_score" : 1.0,
"_source" : {
"name" : "John Doe",
"marks" : [
{
"_id" : "...",
"value" : 4
},
{
"_id" : "...",
"value" : 0
}
]
}
},
{
"_index" : "students",
"_type" : "_doc",
"_id" : "...",
"_score" : 1.0,
"_source" : {
"name" : "Jane Doe",
"marks" : [
{
"_id" : "...",
"value" : 5
},
{
"_id" : "...",
"value" : 4
}
]
}
}, /* ... */]
Each student has a lot of marks. I would like the Elasticsearch result to include the average mark value per student (that is, per document indexed in Elasticsearch).
I tried an aggregation:
"aggs": {
"avg_mark": {
"avg": { "field": "marks.value" }
}
}
But I get a single average across all students:
aggregations: { avg_mark: { value: 3.25 } }
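To see where 3.25 comes from, the arithmetic can be reproduced outside Elasticsearch. This is only an illustrative Python sketch, assuming the index holds just the two sample documents shown above; the `students` dictionary is a stand-in for those documents, not an Elasticsearch structure.

```python
from statistics import mean

# Marks copied from the two sample documents in the question.
students = {
    "John Doe": [4, 0],
    "Jane Doe": [5, 4],
}

# A top-level avg aggregation pools every marks.value from all matching
# documents into one set of numbers before averaging.
all_values = [value for marks in students.values() for value in marks]
overall_avg = mean(all_values)

# What the question asks for instead: one average per document.
per_student_avg = {name: mean(marks) for name, marks in students.items()}

print(overall_avg)
print(per_student_avg)
```

The pooled average matches the 3.25 returned by the aggregation, while the per-student values match the 2.0 and 4.5 that appear in the sort arrays further down.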
I then tried with sort:
"sort": [{
"marks.value": {
"order": "desc",
"mode": "avg"
}
}]
This does compute an average per student, but the value only shows up in the sort array of each hit:
"hits" : [{
"_index" : "students",
"_type" : "_doc",
"_id" : "...",
"_score" : 1.0,
"_source" : {
"name" : "John Doe",
"marks" : [
{
"_id" : "...",
"value" : 4
},
{
"_id" : "...",
"value" : 0
}
]
},
"sort" : [ 2.0 ]
},
{
"_index" : "students",
"_type" : "_doc",
"_id" : "...",
"_score" : 1.0,
"_source" : {
"name" : "Jane Doe",
"marks" : [
{
"_id" : "...",
"value" : 5
},
{
"_id" : "...",
"value" : 4
}
]
},
"sort" : [ 4.5 ]
}, /* ... */]
The position of the average in this sort array depends on the sort clauses in the search request: it could be [ 4.5, value_b, value_c, ... ] or [ value_b, value_c, 4.5 ].
I also tried to work around this with the nested type, without success.
How can I get an average per document / student without sorting my results, in a form that is easy to retrieve?
Thank you in advance.
Your first try was a step in the right direction -- just gotta make sure you group by the student names before you calculate the avg mark:
GET students/_search
{
"size": 0,
"aggs": {
"by_student": {
"terms": {
"field": "name.keyword",
"size": 10
},
"aggs": {
"avg_mark": {
"avg": {
"field": "marks.value"
}
}
}
}
}
}
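For retrieving the result on the client side, here is a minimal Python sketch that pulls the per-student averages out of the response body. The `sample_response` dictionary is hand-written from the two sample documents in the question to follow the usual shape of a terms aggregation response; it is not captured output, and `averages_by_student` is a hypothetical helper name.

```python
def averages_by_student(response: dict) -> dict:
    """Map each bucket key (the student name) to its avg_mark value."""
    buckets = response["aggregations"]["by_student"]["buckets"]
    return {bucket["key"]: bucket["avg_mark"]["value"] for bucket in buckets}


# Illustrative response, assuming only the two sample students are indexed.
sample_response = {
    "aggregations": {
        "by_student": {
            "buckets": [
                {"key": "Jane Doe", "doc_count": 1, "avg_mark": {"value": 4.5}},
                {"key": "John Doe", "doc_count": 1, "avg_mark": {"value": 2.0}},
            ]
        }
    }
}

averages = averages_by_student(sample_response)
print(averages)
```

Two things to keep in mind with this approach: students who share the same name land in the same bucket, and "size": 10 in the terms aggregation limits the response to ten buckets, so it may need raising for larger result sets.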
The .keyword field suffix comes from this slightly adjusted mapping, which adds a keyword sub-field to name:
PUT students
{
"mappings": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"marks": {
"properties": {
"value": {
"type": "float"
}
}
}
}
}
}
BTW -- if you want to narrow the search to only a few students, simply include a top-level query along the lines of:
{
"query": {
"bool": {
"filter": [
{
"terms": {
"name.keyword": [
"John Doe",
"Jane Doe"
]
}
}
]
}
},
"aggs": { ... }
}
The aggregations will then take into consideration only the filtered set of documents.
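If the request is assembled in code, the filter and the aggregation from earlier can be combined into a single body. The following is a rough Python sketch; `build_request` and its parameters are hypothetical names, and it only builds and prints the JSON rather than sending it to a cluster.

```python
import json


def build_request(names: list, bucket_size: int = 10) -> dict:
    """Build a search body that filters by name and averages marks per student."""
    return {
        "size": 0,  # skip the hits; only the aggregations are needed
        "query": {
            "bool": {
                "filter": [
                    {"terms": {"name.keyword": names}}
                ]
            }
        },
        "aggs": {
            "by_student": {
                "terms": {"field": "name.keyword", "size": bucket_size},
                "aggs": {
                    "avg_mark": {"avg": {"field": "marks.value"}}
                },
            }
        },
    }


body = build_request(["John Doe", "Jane Doe"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after GET students/_search, or the dictionary can be passed to whichever client is in use.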