Elasticsearch multiple values match without analyzer
Pardon my limited knowledge of Elasticsearch. I have an Elasticsearch index that contains documents like these:
{
"date": "2013-12-30T00:00:00.000Z",
"value": 2,
"dimensions": {
"region": "Coimbra District"
}
}
{
"date": "2013-12-30T00:00:00.000Z",
"value": 1,
"dimensions": {
"region": "Federal District"
}
}
{
"date": "2013-12-30T00:00:00.000Z",
"value": 1,
"dimensions": {
"region": "Masovian Voivodeship"
}
}
These 3 JSON documents are indexed in the ES server. I haven't specified any analyzer (and don't know how to specify one either :)). I am using Spring Data Elasticsearch and executing the following query to search for the docs with region 'Masovian Voivodeship' or 'Federal District':
{
"query_string" : {
"query" : "Masovian Voivodeship OR Federal District",
"fields" : [ "dimensions.region" ]
}
}
I am expecting it to return 2 hits. However, it returns all 3 docs (probably because the 'Coimbra District' document also has "District" in it). How can I modify the query so that it performs an EXACT match and returns only 2 documents? I am using the following method:
QueryBuilders.queryString(<OR string>).field("dimensions.region")
I have tried QueryBuilders.termsQuery, QueryBuilders.inQuery and QueryBuilders.matchQuery (with an array) but no luck.
Can anyone please help? Thanks in advance.
There are a couple of things you can do here.
To start, I set up an index without any explicit mapping or analysis, which means the standard analyzer will be used. That's important since it determines how we can query against the text fields.
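To make the effect of the standard analyzer concrete, here is a rough Python sketch of what it does to the three region values. It only approximates the analyzer (lowercasing and splitting on non-word characters); the real analyzer follows Unicode text segmentation rules, and the function name approx_standard_analyze is made up for this illustration.

```python
import re

def approx_standard_analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-word chars."""
    return [tok for tok in re.split(r"\W+", text.lower()) if tok]

regions = ["Coimbra District", "Federal District", "Masovian Voivodeship"]

# Map each original region string to the tokens that would be indexed for it.
tokens_by_region = {region: approx_standard_analyze(region) for region in regions}

for region, tokens in tokens_by_region.items():
    print(region, "->", tokens)
```

Both "Coimbra District" and "Federal District" end up with the token district, so any query that treats district as an optional term on its own can match both documents.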
So I started with:
DELETE /test_index
PUT /test_index
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
}
}
PUT /test_index/doc/1
{
"date": "2013-12-30T00:00:00.000Z",
"value": 2,
"dimensions": {
"region": "Coimbra District"
}
}
PUT /test_index/doc/2
{
"date": "2013-12-30T00:00:00.000Z",
"value": 1,
"dimensions": {
"region": "Federal District"
}
}
PUT /test_index/doc/3
{
"date": "2013-12-30T00:00:00.000Z",
"value": 1,
"dimensions": {
"region": "Masovian Voivodeship"
}
}
Then I tried your query and got no hits. I don't understand why you have "dimensions.ga:region" in your fields parameter, but when I changed it to "dimensions.region" I got some results:
POST /test_index/doc/_search
{
"query": {
"query_string": {
"query": "Masovian Voivodeship OR Federal District",
"fields": [
"dimensions.region"
]
}
}
}
...
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0.46911472,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "3",
"_score": 0.46911472,
"_source": {
"date": "2013-12-30T00:00:00.000Z",
"value": 1,
"dimensions": {
"region": "Masovian Voivodeship"
}
}
},
{
"_index": "test_index",
"_type": "doc",
"_id": "2",
"_score": 0.3533006,
"_source": {
"date": "2013-12-30T00:00:00.000Z",
"value": 1,
"dimensions": {
"region": "Federal District"
}
}
},
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 0.05937162,
"_source": {
"date": "2013-12-30T00:00:00.000Z",
"value": 2,
"dimensions": {
"region": "Coimbra District"
}
}
}
]
}
}
However, this returns a result you don't want (document 1, "Coimbra District"). One way to fix that is as follows:
POST /test_index/doc/_search
{
"query": {
"query_string": {
"query": "(Masovian AND Voivodeship) OR (Federal AND District)",
"fields": [
"dimensions.region"
]
}
}
}
...
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.46911472,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "3",
"_score": 0.46911472,
"_source": {
"date": "2013-12-30T00:00:00.000Z",
"value": 1,
"dimensions": {
"region": "Masovian Voivodeship"
}
}
},
{
"_index": "test_index",
"_type": "doc",
"_id": "2",
"_score": 0.3533006,
"_source": {
"date": "2013-12-30T00:00:00.000Z",
"value": 1,
"dimensions": {
"region": "Federal District"
}
}
}
]
}
}
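The difference between the two query strings can be mimicked in a few lines of Python. This is only a toy model, assuming the field is analyzed into lowercase word tokens and that ungrouped terms are combined with OR (the query_string default operator); it does no scoring, and the helper names are hypothetical.

```python
import re

DOCS = {
    "1": "Coimbra District",
    "2": "Federal District",
    "3": "Masovian Voivodeship",
}

def tokens(text):
    # Approximation of the standard analyzer: a set of lowercase word tokens.
    return set(re.findall(r"\w+", text.lower()))

def match_any(terms):
    """IDs of docs containing at least one of the terms (all terms OR-ed)."""
    wanted = {t.lower() for t in terms}
    return sorted(doc_id for doc_id, region in DOCS.items() if tokens(region) & wanted)

def match_any_group(groups):
    """IDs of docs containing every term of at least one group: (a AND b) OR (c AND d)."""
    lowered = [{t.lower() for t in group} for group in groups]
    return sorted(
        doc_id for doc_id, region in DOCS.items()
        if any(group <= tokens(region) for group in lowered)
    )

# Models "Masovian Voivodeship OR Federal District": every term is optional.
loose_hits = match_any(["Masovian", "Voivodeship", "Federal", "District"])
# Models "(Masovian AND Voivodeship) OR (Federal AND District)".
strict_hits = match_any_group([["Masovian", "Voivodeship"], ["Federal", "District"]])

print(loose_hits)
print(strict_hits)
```

In this model the loose form matches document 1 purely through the shared token district, while the grouped form requires both words of a region to be present.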
Another way to do it (I like this one better), which gives the same results, is to use a combination of a match query and a bool should:
POST /test_index/doc/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"dimensions.region": {
"query": "Masovian Voivodeship",
"operator": "and"
}
}
},
{
"match": {
"dimensions.region": {
"query": "Federal District",
"operator": "and"
}
}
}
]
}
}
}
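If the list of regions is only known at run time, the same bool/should body can be generated rather than written by hand. The Python sketch below only builds the JSON request body shown above; it does not talk to a cluster, and build_region_query is a name invented for this example.

```python
import json

def build_region_query(regions, field="dimensions.region"):
    """Build a bool/should query with one AND-operator match clause per region."""
    should = [
        {"match": {field: {"query": region, "operator": "and"}}}
        for region in regions
    ]
    return {"query": {"bool": {"should": should}}}

body = build_region_query(["Masovian Voivodeship", "Federal District"])

# Pretty-print the body so it can be compared with the hand-written request.
print(json.dumps(body, indent=2))
```

Note that a match with the and operator still works on analyzed tokens: it requires every queried word to be present, so a document whose region contains extra words besides the queried ones would also match.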
Here is the code I used:
http://sense.qbox.io/gist/bb5062a635c4f9519a411fdd3c8540eae8bdfd51