Perform Lucene wildcard search

How does Lucene support wildcard searches?

I want to find words that start with ox, so I am searching with ox*. However, the search also returns unexpected results such as anti-oxide, which I don't want.

This has to do with how your data is analyzed. The StandardAnalyzer separates terms on spaces and punctuation (among many other rules). StandardAnalyzer is usually well suited to full text. If it doesn't suit your particular needs, many other analyzers are available. Without more information on what you intend to accomplish, I can't really recommend a particular one.

According to the Lucene FAQ, your query ox* should match only terms that begin with ox.

Because the StandardTokenizer treats the hyphen as a delimiter, a word like anti-oxide is split into two terms, anti and oxide. The term oxide begins with ox, which is why text containing anti-oxide matches when you search for ox*.
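To make that concrete, here is a minimal sketch in plain Java. It does not call Lucene: the `tokenize` method is a hypothetical, much simplified stand-in for the StandardTokenizer (it splits on anything that is not a letter or digit and lowercases), and `matchesPrefix` imitates what a prefix query such as ox* does against the resulting terms. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class HyphenSplitDemo {

    // Simplified stand-in for StandardTokenizer (an assumption, not Lucene code):
    // lowercase the text and split on every run of non-letter, non-digit characters.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                terms.add(part);
            }
        }
        return terms;
    }

    // Imitates a prefix query: the text matches if any indexed term starts with the prefix.
    static boolean matchesPrefix(String text, String prefix) {
        for (String term : tokenize(text)) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("anti-oxide"));            // the hyphen splits the word into two terms
        System.out.println(matchesPrefix("anti-oxide", "ox")); // the term "oxide" starts with "ox"
        System.out.println(matchesPrefix("antibody", "ox"));   // no term starts with "ox"
    }
}
```

The point is that the wildcard is applied to each indexed term, not to the original word, so the behavior is decided at analysis time rather than at query time.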

You have two options to change this behavior:

  1. Override the default Tokenizer and write your own to suit your needs.
  2. Pre-process your text to replace or remove such delimiters before indexing. This is ugly and may not be an ideal solution.
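The effect of both options can be sketched in plain Java, again without calling Lucene. The names below are hypothetical: `whitespaceTerms` imitates what a tokenizer that splits only on whitespace would produce (in Lucene, an Analyzer built around something like WhitespaceTokenizer would be the likely starting point, though that is an assumption about your setup), and `joinHyphenated` shows one possible pre-processing step that removes hyphens between letters before the text is indexed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class DelimiterOptionsDemo {

    // Option 1 (effect only): split on whitespace alone, so "anti-oxide" stays one term.
    static List<String> whitespaceTerms(String text) {
        List<String> terms = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!part.isEmpty()) {
                terms.add(part);
            }
        }
        return terms;
    }

    // Option 2: remove hyphens that sit between two letters before the text is analyzed.
    static String joinHyphenated(String text) {
        return text.replaceAll("(?<=\\p{L})-(?=\\p{L})", "");
    }

    // Imitates a prefix query against a list of already-tokenized terms.
    static boolean anyTermStartsWith(List<String> terms, String prefix) {
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> terms = whitespaceTerms("anti-oxide and oxygen");
        System.out.println(terms);                                                       // "anti-oxide" is kept whole
        System.out.println(anyTermStartsWith(whitespaceTerms("anti-oxide"), "ox"));      // no longer matches ox*
        System.out.println(joinHyphenated("anti-oxide"));                                // hyphen removed
    }
}
```

Either way, the trade-off is the same: once anti-oxide is indexed as a single term, a search for oxide alone will no longer find it, so the choice depends on which queries matter more to you.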
