How to disable auto-generation of default fields in ElasticSearch Query with TermsAggregationBuilder and QueryBuilders
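I am migrating from ElasticSearch v1.0.0 to v7.13.1. I know that ElasticSearch versions after 7.0.0 have dropped support for specifying types. ElasticSearch has also reworked some of its classes; for example, TermsAggregationBuilder replaces TermsBuilder.
But when I build a query with QueryBuilders and AggregationBuilders, I see some extra generated fields that I do not want.
Is there a way to avoid them programmatically?
Before: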
private TermsBuilder createAggreationsUriDetails() {
return AggregationBuilders
.terms(xxxxxxxx)...
After:
private TermsAggregationBuilder createAggreationsUriDetails() {
return AggregationBuilders
.terms(ElasticConstants.URI)...
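I also use matchQuery() to build the match query for the upgraded ES version, and I still see some extra fields. The same goes for order.
Comparison of the queries from the old and new Elasticsearch versions
Before: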
{
"size": 0,
"query": {
"bool": {
"must": [
{
"match": {
"uri.raw": {
"query": "sample_uri",
"type": "boolean"
}
}
},
{
"range": {
"@timestamp": {
"from": 1655145000000,
"to": 1655231400000,
"include_lower": "true",
"include_upper": "false",
"format": "epoch_millis"
}
}
}
]
}
},
"aggs": {
"uri": {
"terms": {
"field": "uri.raw",
"size": 1,
"order": {
"_count": "desc"
}
},
"aggregations": {
"client_id": {
"terms": {
"field": "client_id",
"size": 10000,
"order": {
"_count": "desc"
}
},
"aggregations": {
"response_code": {
"terms": {
"field": "response_code.raw",
"size": 8,
"order": {
"_count": "desc"
}
},
"aggregations": {
"datetime": {
"date_histogram": {
"field": "@timestamp",
"interval": "1m",
"min_doc_count": 1
}
}
}
}
}
}
}
}
}
}
Query built with the QueryBuilders of the new ES version:
{
"size": 0,
"query": {
"bool": {
"must": [
{
"match": {
"uri.raw": {
"query": "sample_url",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": "true",
"lenient": "false",
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": "true",
"boost": 1
}
}
},
{
"range": {
"@timestamp": {
"from": 1655145000000,
"to": 1655231400000,
"include_lower": "true",
"include_upper": "false",
"format": "epoch_millis",
"boost": 1
}
}
}
],
"adjust_pure_negative": "true",
"boost": 1
}
},
"aggs": {
"uri": {
"terms": {
"field": "uri.raw",
"size": 1,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": "false",
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"client_id": {
"terms": {
"field": "client_id",
"size": 10000,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": "false",
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"response_code": {
"terms": {
"field": "response_code.raw",
"size": 8,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": "false",
"order": [
{
"_count": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"datetime": {
"date_histogram": {
"field": "@timestamp",
"interval": "60000ms",
"offset": 0,
"order": {
"_key": "asc"
},
"keyed": "false",
"min_doc_count": 1
}
}
}
}
}
}
}
}
}
}
The extra fields you are seeing are actually parameters of the query. For example, the match query in 7.X looks like this:
"match": {
"uri.raw": {
"query": "sample_url",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": "true",
"lenient": "false",
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": "true",
"boost": 1
}
}
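operator, prefix_length and lenient are all parameters of the match query. Even if you do not provide them, they are added with their default values. When you send the query as plain JSON without these parameters, the same defaults are applied on the Elasticsearch side, so there is no need to worry about them. If you like, you can change some of these values to see how the results are affected; for example, changing operator to AND generally reduces the number of results for multi-term searches.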
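To make the operator remark concrete, here is a toy model in plain Java. It is not Elasticsearch code: the OperatorDemo class, its whitespace tokenizer and the sample documents are all assumptions made up for illustration. It only shows why requiring every term (AND) can never match more documents than requiring any term (OR).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Toy model of match-query operator semantics; this is NOT Elasticsearch code.
class OperatorDemo {

    // Returns the documents matching the query under the given operator.
    // "AND": every query term must occur; anything else is treated as "OR".
    static List<String> search(List<String> docs, String query, String operator) {
        List<String> terms = tokenize(query);
        return docs.stream()
                .filter(doc -> {
                    List<String> docTerms = tokenize(doc);
                    return "AND".equals(operator)
                            ? docTerms.containsAll(terms)
                            : terms.stream().anyMatch(docTerms::contains);
                })
                .collect(Collectors.toList());
    }

    // Naive lowercase whitespace tokenizer, a stand-in for a real analyzer.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("this is a test", "a test message", "something else");
        System.out.println("OR  hits: " + search(docs, "this test", "OR").size());
        System.out.println("AND hits: " + search(docs, "this test", "AND").size());
    }
}
```

With these sample documents, the two-term query matches two documents under OR and only one under AND.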
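Note: you can also look at the MatchQueryBuilder code in the Elasticsearch code base to see that it uses the builder design pattern and how it supplies default values for its parameters.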
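As a rough illustration of that pattern, the sketch below uses a hypothetical SimpleMatchQuery class. It is not the real MatchQueryBuilder, and the three parameters and their defaults are simply taken from the output shown in this thread. The point is that fields are initialized to defaults up front, so serialization emits them whether or not the caller set anything.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified builder; it mimics the idea only, not MatchQueryBuilder.
class SimpleMatchQuery {
    private final String field;
    private final String query;
    // Defaults are assigned here, so they exist even if the caller never sets them.
    private String operator = "OR";
    private int maxExpansions = 50;
    private float boost = 1.0f;

    SimpleMatchQuery(String field, String query) {
        this.field = field;
        this.query = query;
    }

    // Fluent setters return this so calls can be chained.
    SimpleMatchQuery operator(String operator) {
        this.operator = operator;
        return this;
    }

    SimpleMatchQuery maxExpansions(int maxExpansions) {
        this.maxExpansions = maxExpansions;
        return this;
    }

    // Serializes every parameter, defaults included.
    Map<String, Object> toMap() {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("query", query);
        params.put("operator", operator);
        params.put("max_expansions", maxExpansions);
        params.put("boost", boost);

        Map<String, Object> byField = new LinkedHashMap<>();
        byField.put(field, params);

        Map<String, Object> match = new LinkedHashMap<>();
        match.put("match", byField);
        return match;
    }

    public static void main(String[] args) {
        // Only field and query are supplied, yet the defaults are printed too.
        System.out.println(new SimpleMatchQuery("uri.raw", "sample_uri").toMap());
    }
}
```

Calling operator("AND") before toMap() changes only that one entry; the remaining defaults are still written out.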
Hope this helps.
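You cannot disable this, because these are default values generated by Elasticsearch.
You will also see them when you print the query to the console, but that is expected behavior.
Below is a simple match query with only the required parameter: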
GET /_search
{
"query": {
"match": {
"message": {
"query": "this is a test"
}
}
}
}
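However, when you create the same query with the Java client and print it to the console, it looks like the following, because the default parameter values are printed as well and are sent to Elasticsearch when the query is executed.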
GET /_search
{
"query": {
"match": {
"message": {
"query": "this is a test",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": "true",
"lenient": "false",
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": "true",
"boost": 1
}
}
}
}
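If the only goal is a shorter printout for logs, one possible workaround is to post-process a copy of the parameters before printing. The sketch below is an assumption, not a client feature: DefaultStripper and its MATCH_DEFAULTS table are hypothetical, the default values are copied from the output above, and real boolean and integer types are assumed. It does not change what the Java client sends to Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Display-only helper: hides parameters whose value equals an assumed default.
// It does not alter the request the Java client sends.
class DefaultStripper {

    // Assumed defaults, copied from the match output shown in this thread.
    // Values must have the same Java type as the parsed query to compare equal.
    static final Map<String, Object> MATCH_DEFAULTS = new LinkedHashMap<>();
    static {
        MATCH_DEFAULTS.put("operator", "OR");
        MATCH_DEFAULTS.put("prefix_length", 0);
        MATCH_DEFAULTS.put("max_expansions", 50);
        MATCH_DEFAULTS.put("fuzzy_transpositions", true);
        MATCH_DEFAULTS.put("lenient", false);
        MATCH_DEFAULTS.put("zero_terms_query", "NONE");
        MATCH_DEFAULTS.put("auto_generate_synonyms_phrase_query", true);
        MATCH_DEFAULTS.put("boost", 1);
    }

    // Returns a copy of params without the entries that equal their default.
    static Map<String, Object> stripDefaults(Map<String, Object> params,
                                             Map<String, Object> defaults) {
        Map<String, Object> cleaned = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : params.entrySet()) {
            boolean isDefault = defaults.containsKey(e.getKey())
                    && Objects.equals(defaults.get(e.getKey()), e.getValue());
            if (!isDefault) {
                cleaned.put(e.getKey(), e.getValue());
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        Map<String, Object> params = new LinkedHashMap<>(MATCH_DEFAULTS);
        params.put("query", "this is a test");
        params.put("operator", "AND"); // non-default, so it is kept
        System.out.println(stripDefaults(params, MATCH_DEFAULTS));
    }
}
```

In this example only query and the non-default operator survive; every entry that still equals its assumed default is dropped from the printed copy.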