ElasticSearch. Total number of unique terms in an index
Is there a way to access the total number of terms in an index through the ES API? I need to estimate the prior probability of a term occurring in the index:
total_term_frequency/total_terms_in_index
I can access ttf, but not the total number of terms stored in the index.
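A minimal Python sketch of that estimate follows. The function name `term_prior` and the sample numbers are hypothetical and do not come from a real index; the sketch only shows the ratio once both counts are known.

```python
def term_prior(total_term_frequency: int, total_terms_in_index: int) -> float:
    """Estimate the prior probability of a term as ttf / total terms."""
    if total_terms_in_index <= 0:
        raise ValueError("total_terms_in_index must be positive")
    return total_term_frequency / total_terms_in_index


# Made-up numbers for illustration only: ttf of 3, a total of 150 terms.
example_prior = term_prior(3, 150)
print(example_prior)
```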
I think the cardinality aggregation is what you're looking for.
For example:
POST /test_index/_search
{
   "size": 0,
   "aggs": {
      "term_count": {
         "cardinality": {
            "field": "doc_text"
         }
      }
   }
}
...
{
   "took": 7,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 4,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "term_count": {
         "value": 161
      }
   }
}
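If you want to do the same from a script, the request body and the response handling might look roughly like this in Python. This is only a sketch: the helper names `cardinality_request` and `unique_term_count` are made up for illustration, no HTTP call is made, and the sample response is a trimmed copy of the one above. Keep in mind that Elasticsearch documents the cardinality aggregation as an approximate count, so treat the value as an estimate on large indices.

```python
import json


def cardinality_request(field: str, agg_name: str = "term_count") -> dict:
    """Build a search body that returns only a cardinality aggregation."""
    return {
        "size": 0,  # skip the hits, only the aggregation is needed
        "aggs": {agg_name: {"cardinality": {"field": field}}},
    }


def unique_term_count(response: dict, agg_name: str = "term_count") -> int:
    """Read the cardinality value out of a parsed search response."""
    return response["aggregations"][agg_name]["value"]


# Trimmed copy of the response shown above.
sample_response = json.loads(
    '{"hits": {"total": 4}, "aggregations": {"term_count": {"value": 161}}}'
)

body = cardinality_request("doc_text")
print(json.dumps(body, indent=2))
print(unique_term_count(sample_response))
```

The body returned by `cardinality_request` is what would be sent to `POST /test_index/_search` with whatever HTTP client you already use.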
Here is some code I used to play around with it:
http://sense.qbox.io/gist/d5625c80946f332718b0fa166bba27efd264b76e