
Why does fuzzyQuery not work in Elasticsearch's high level REST client for Java?

I am trying to create a function that runs a fuzzy search on an Elasticsearch index. I only get a match if I specify the term exactly as it is spelled in the index. If I intentionally misspell a single letter in that term, like "Boc", I imagine the fuzzy search should still return that same match, but instead it returns nothing. Similarly, if I replace fuzzyQuery with prefixQuery or termQuery, the search only returns a result if given the exact spelling "Bob".

Why is this? How do I fix this? And where is there documentation explaining these methods?

Here is my code...

public void searchResults(@PathParam("index_name") String index_name) throws IOException {
    RestHighLevelClient client = createHighLevelRestClient();
    int numberOfSearchHitsToReturn = 100; // defaults to 10
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.fuzzyQuery("firstname", "Bob"));
    sourceBuilder.from(0);
    sourceBuilder.size(numberOfSearchHitsToReturn);
    sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
    SearchRequest searchRequest = new SearchRequest(index_name).source(sourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    System.out.print(searchResponse);
    client.close();
}

Here is the result of GET /index/_search in Postman...

{
    "took": 0,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 3,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "contacts",
                "_type": "_doc",
                "_id": "J1NDonABNQ4iHt4UOM4u",
                "_score": 1.0,
                "_source": {}
            },
            {
                "_index": "contacts",
                "_type": "_doc",
                "_id": "153",
                "_score": 1.0,
                "_source": {
                    "firstname": "Bob",
                    "home_city": "San Diego",
                    "home_address": "1029 Loring Street",
                    "home_zip": "92109",
                    "contact_id": "153",
                    "email": "bsmith@gmail.com",
                    "lastname": "Smith",
                    "home_state": "California",
                    "cell_phone": "6192542981"
                }
            },
            {
                "_index": "contacts",
                "_type": "_doc",
                "_id": "154",
                "_score": 1.0,
                "_source": {
                    "firstname": "Alice",
                    "home_city": "Paia",
                    "home_address": "581 Pili Loko Street",
                    "home_zip": "00012",
                    "contact_id": "154",
                    "email": "aHernes@gmail.com",
                    "lastname": "Hernes",
                    "home_state": "Hawaii",
                    "cell_phone": "8083829103"
                }
            }
        ]
    }
}

I believe Elasticsearch is confusing you a bit here.

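The fuzziness for a 3-letter term is 1 (one edit allowed), so it is fair to expect "Bob" to be returned. However, I assume you are using the standard analyzer, which applies the "lowercase" token filter by default. That means the term stored in the index is "bob", while the fuzzy query, being a term-level query, does not analyze your input and compares "Boc" as typed.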
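To make the numbers concrete, here is a minimal sketch of the two rules in play. It assumes the documented default AUTO fuzziness (0 to 2 characters: exact match, 3 to 5: one edit, more than 5: two edits) and uses a plain toLowerCase as a rough stand-in for what the lowercase filter does to a single-word value. The class and method names are made up for illustration and are not part of the Elasticsearch client.

```java
import java.util.Locale;

// Hypothetical helper, not Elasticsearch code: mirrors the documented AUTO
// fuzziness thresholds and the effect of the "lowercase" token filter.
final class AutoFuzzinessSketch {

    // Number of edits allowed for a term under the default AUTO fuzziness.
    static int allowedEdits(String term) {
        int length = term.length();
        if (length < 3) {
            return 0; // 0..2 characters: must match exactly
        }
        if (length < 6) {
            return 1; // 3..5 characters: one edit allowed
        }
        return 2;     // more than 5 characters: two edits allowed
    }

    // Assumption: for a single-word value, the standard analyzer ends up
    // storing the lowercased word as the indexed term.
    static String indexedForm(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(allowedEdits("Boc")); // edits allowed for the query term
        System.out.println(indexedForm("Bob"));  // term actually stored in the index
    }
}
```

So the query term "Boc" gets a budget of one edit, and it is compared against the stored term "bob", not against "Bob" as it appears in _source.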

So the Levenshtein distance that gets calculated is between "Boc" and "bob", which is 2 (B to b, c to b). That exceeds the one allowed edit, which is why it is not returned.
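If you want to check the distances yourself, here is a small standalone sketch of the classic Levenshtein distance. It is not what Elasticsearch runs internally (the fuzzy query by default also counts a transposition of two adjacent characters as one edit), but for these strings the result is the same. The class name is made up for illustration.

```java
// Hypothetical helper, not Elasticsearch code: plain Levenshtein distance
// where each insertion, deletion, or substitution costs 1.
final class EditDistanceSketch {

    static int levenshtein(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // delete all i characters of a's prefix
            for (int j = 1; j <= b.length(); j++) {
                int substitution = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(
                        Math.min(current[j - 1] + 1, previous[j] + 1),
                        previous[j - 1] + substitution);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("Boc", "bob")); // prints 2
        System.out.println(levenshtein("boc", "bob")); // prints 1
        System.out.println(levenshtein("Bob", "bob")); // prints 1
    }
}
```

The last line also shows why your original query with "Bob" matched: it is one edit away from the indexed "bob", which is within the limit for a 3-letter term.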

Try a lowercase input term and I'm betting "Bob" will be returned.

// no results
{
    "query": {
       "fuzzy" : { "firstname" : "Boc" }
    }
}
// "Bob" returned
{
    "query": {
       "fuzzy" : { "firstname" : "boc" }
    }
}

Does this make sense?

Regarding your code:

public void searchResults(@PathParam("index_name") String index_name) throws IOException {
    RestHighLevelClient client = createHighLevelRestClient();
    int numberOfSearchHitsToReturn = 100; // defaults to 10
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    // "Boc".toLowerCase() or simply "boc"
    sourceBuilder.query(QueryBuilders.fuzzyQuery("firstname", "Boc".toLowerCase()));
    sourceBuilder.from(0);
    sourceBuilder.size(numberOfSearchHitsToReturn);
    sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
    SearchRequest searchRequest = new SearchRequest(index_name).source(sourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    System.out.print(searchResponse);
    client.close();
}
