How to index a String in Lucene?

I'm using Lucene to index strings that I read from a document. I'm not using the Reader class, since I need to index the strings into different fields.

document.add(new Field("FIELD1", "string1", Field.Store.YES, Field.Index.UN_TOKENIZED));
document.add(new Field("FIELD2", "string2", Field.Store.YES, Field.Index.UN_TOKENIZED));

This builds the index without problems, but searching with

QueryParser queryParser = new QueryParser("FIELD1", new StandardAnalyzer());
Query query = queryParser.parse(searchString);
Hits hits = indexSearcher.search(query);
System.out.println("Number of hits: " + hits.length());

doesn't return any results.

But when I index a sentence like this:

document.add(new Field("FIELD1","This is sentence to be indexed", Field.Store.YES, Field.Index.TOKENIZED));

searching works fine.

Thanks.

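You need to index these fields with Field.Index.TOKENIZED as well. An untokenized field is indexed as a single term holding the exact, unanalyzed value, whereas QueryParser runs the search string through the StandardAnalyzer, which splits it into words and lowercases them. Unless the value happens to be a single lowercase word, the analyzed query terms will not match the stored term, so the search finds nothing. With tokenization, the value goes through analysis at index time too: the word "string1" is indexed as the term "string1", and the query terms line up with the indexed terms, provided the IndexWriter uses the same analyzer as the QueryParser.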

Use this:

document.add(new Field("FIELD1","string1", Field.Store.YES, Field.Index.TOKENIZED));
document.add(new Field("FIELD2","string2", Field.Store.YES, Field.Index.TOKENIZED));

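When you want to index a string containing multiple words, e.g. "two words", as one searchable element without splitting it into two words, you can either use the KeywordAnalyzer during indexing, which takes the whole string as a single token, or use the StringField class in newer versions of Lucene (4.0 and later). The query then has to supply the exact string, for example through a TermQuery, because an analyzed query would be split into separate words again.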
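
To make the difference between the two indexing modes concrete, here is a minimal sketch in plain Java. It is a toy model and not Lucene code: the class ToyIndex, its methods, and the simplified analyze step (lowercase, split on non-alphanumeric characters) are assumptions made up for illustration, and they only roughly approximate what StandardAnalyzer does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy inverted index; a hypothetical illustration, not part of Lucene. */
public class ToyIndex {
    // term -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Rough stand-in for an analyzer: lowercase, then split on non-alphanumerics. */
    public static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    /** Tokenized indexing: every analyzed token becomes its own term. */
    public void addTokenized(int docId, String value) {
        for (String term : analyze(value)) {
            addTerm(docId, term);
        }
    }

    /** Untokenized indexing: the whole value becomes one exact term. */
    public void addUntokenized(int docId, String value) {
        addTerm(docId, value);
    }

    private void addTerm(int docId, String term) {
        postings.computeIfAbsent(term, k -> new HashSet<>()).add(docId);
    }

    /** Exact term lookup; the query string is not analyzed. */
    public Set<Integer> searchExact(String term) {
        return new HashSet<>(postings.getOrDefault(term, new HashSet<>()));
    }

    /** Analyzed lookup: documents matching any of the analyzed query tokens. */
    public Set<Integer> searchAnalyzed(String query) {
        Set<Integer> hits = new HashSet<>();
        for (String term : analyze(query)) {
            hits.addAll(searchExact(term));
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addUntokenized(1, "Two Words"); // stored as the single term "Two Words"
        index.addTokenized(2, "Two Words");   // stored as the terms "two" and "words"

        // The analyzed query becomes [two, words], so only document 2 matches.
        System.out.println(index.searchAnalyzed("Two Words"));
        // The exact lookup matches only the untokenized document 1.
        System.out.println(index.searchExact("Two Words"));
    }
}
```

In this model the untokenized document is in the index, but it can only be reached with the exact original string, which mirrors why a query parsed through an analyzer misses it.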
