Java elasticsearch, find by part of the word

I have a small problem with the Elasticsearch Java client (2.3.3):

TransportClient client = TransportClient.builder().build()
        .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"), 9300));

QueryBuilder qb = multiMatchQuery(
        "org", // George
        "firstname", "lastname"
).fuzziness(Fuzziness.build(2));

SearchResponse response = client.prepareSearch("user")
        .setQuery(qb)
        .execute()
        .get();

for (SearchHit hit : response.getHits()) {
    System.out.println(hit.getSource());
}

With fuzziness, I can still find a user when up to two letters are missing or mistyped.

I want to find a user by firstname or lastname after typing three or more letters of the name. I have been looking for a way to do this for the last few hours.

In this case I need to find "George Michel" by typing just "org", but no luck. Also, if someone types "Gegorge Jackson", I should find both "George Michel" and "Michael Jackson".

Thanks for help.
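To illustrate why fuzziness alone does not cover the "org" case, here is a minimal plain-Java sketch of edit distance. It assumes that a fuzziness of 2 roughly means "at most two single-character edits"; the class and method names are made up for this example and are not part of the Elasticsearch API.

```java
public class EditDistanceDemo {
    // Classic dynamic-programming Levenshtein distance
    // (insertions, deletions, substitutions each cost 1).
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("gegorge", "george")); // 1
        System.out.println(levenshtein("org", "george"));     // 3
    }
}
```

"gegorge" is one edit away from "george", so a fuzziness of 2 can absorb the typo, while "org" is three edits away, which is more than 2.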

You can use the NGram tokenizer in Elasticsearch. What does the NGram tokenizer do? Given the string "day", it splits it into "d", "a", "y", "da", "ay", "day", which lets a query match on parts of a word. The gram lengths are bounded by the min_gram and max_gram settings.
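The splitting described above can be sketched in plain Java. This is only an illustration of the idea, not the tokenizer's actual implementation; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class NGramDemo {
    // Returns every substring of text whose length lies between minGram and maxGram.
    public static List<String> ngrams(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= text.length(); len++) {
                grams.add(text.substring(start, start + len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("day", 1, 3));    // [d, da, day, a, ay, y]
        System.out.println(ngrams("george", 3, 3)); // [geo, eor, org, rge]
    }
}
```

With grams of length 3, "org" is one of the tokens produced for "george", which is what makes a search on part of the word possible.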

For more details, see: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-ngram-tokenizer.html

To search, code like the following can be used.

For example, suppose the field name is Address and the following values are present:

  1. 123, Springfield, 68 Main Street, IL
  2. 248, Spring, 104 Street, MA

Search String : "spring"

QueryBuilders.boolQuery().should(QueryBuilders.queryStringQuery("*spring*").lenient(true).field("Address"))

The query above returns both rows, while the query below returns only one (row 2).

QueryBuilders.boolQuery().should(QueryBuilders.queryStringQuery("spring").lenient(true).field("Address"))

Note the '*' wildcards around the search string in the first query builder.
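The difference between the two queries can be mimicked in plain Java. This is a rough sketch that assumes the field is analyzed into lowercase word tokens; the tokens helper is a simplified stand-in for an analyzer, and none of these names belong to Elasticsearch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WildcardDemo {
    // Simplified stand-in for an analyzer: lowercase, then split on non-alphanumerics.
    public static List<String> tokens(String value) {
        List<String> out = new ArrayList<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Plain term query: some token must equal the search term.
    public static boolean matchesTerm(String value, String term) {
        return tokens(value).contains(term.toLowerCase(Locale.ROOT));
    }

    // "*term*" query: some token must contain the search term.
    public static boolean matchesWildcard(String value, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        for (String t : tokens(value)) {
            if (t.contains(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String row1 = "123, Springfield, 68 Main Street, IL";
        String row2 = "248, Spring, 104 Street, MA";
        System.out.println(matchesWildcard(row1, "spring")); // true
        System.out.println(matchesWildcard(row2, "spring")); // true
        System.out.println(matchesTerm(row1, "spring"));     // false
        System.out.println(matchesTerm(row2, "spring"));     // true
    }
}
```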

I searched further and found something like this:

XContentBuilder settingsBuilder = XContentFactory.jsonBuilder()
        .startObject()
            .startObject("analysis")
                .startObject("tokenizer")
                    .startObject("my_ngram_tokenizer")
                        .field("type", "nGram")
                        .field("min_gram", 1)
                        .field("max_gram", 1)
                    .endObject()
                .endObject()
                .startObject("analyzer")
                    .startObject("ShingleAnalyzer")
                        .field("tokenizer", "my_ngram_tokenizer")
                        .array("filter", "standard", "lowercase")
                    .endObject()
                .endObject()
            .endObject()
        .endObject();

this.client.admin().indices()
        .prepareCreate("user").setSettings(settingsBuilder).get();

But nothing changed. What did I do wrong?

EDIT: It works only for "geo", but not without .fuzziness(Fuzziness.build(2));

QueryBuilder qb = multiMatchQuery(
        search,
        "firstname", "lastname"
).analyzer("ShingleAnalyzer");
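One thing that may be worth checking in the settings above is that min_gram and max_gram are both 1, so every token is a single character. The plain-Java sketch below illustrates the effect under the assumption that a document matches as soon as it shares one gram with the query; the names are hypothetical and this is not how Elasticsearch scores results.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class GramSizeDemo {
    // Lowercased character n-grams of one fixed size, collected as a set.
    public static Set<String> grams(String text, int size) {
        Set<String> out = new LinkedHashSet<>();
        String lower = text.toLowerCase(Locale.ROOT);
        for (int i = 0; i + size <= lower.length(); i++) {
            out.add(lower.substring(i, i + size));
        }
        return out;
    }

    // True when the query and the indexed value share at least one gram.
    public static boolean sharesGram(String query, String indexed, int size) {
        Set<String> common = grams(query, size);
        common.retainAll(grams(indexed, size));
        return !common.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(sharesGram("org", "George", 1));  // true
        System.out.println(sharesGram("org", "Jackson", 1)); // true, both contain "o"
        System.out.println(sharesGram("org", "George", 3));  // true
        System.out.println(sharesGram("org", "Jackson", 3)); // false
    }
}
```

With single-character grams almost any name shares a letter with the query, whereas grams of length 3 only match names that actually contain "org".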
