How to properly escape a Lucene query?

Question

I'm quite new to Lucene and recently ran into a problem. I have a Lucene document that looks like this:

--- type ---
gene
--- id ---
xla:379474
--- alt_id ---
emb:BC054227
gb:BC054227
ncbi-geneid:379474
ncbi-gi:148230166
rs:NM_001086315
rs:NP_001079784
unigene:Xl.24622
xla:379474

I created the query below to retrieve that document. It works fine for altId = 379474 but not for altId = ncbi-geneid:379474 or Xl.24622. I guessed that altId must be escaped and tried String altId = QueryParser.escape(altId), with no luck. Is that the expected behavior of the query? Am I missing something?

Query query1 = new TermQuery(new Term("type", "gene"));
Query query2 = new TermQuery(new Term("alt_id", altId));

BooleanQuery query = new BooleanQuery();
query.add(query1, BooleanClause.Occur.MUST);
query.add(query2, BooleanClause.Occur.MUST);

By the way, I'm running Lucene v3.0.

Answer 1

This should help you. Try it and let me know: http://www.strongd.net/?p=44

Answer 2

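It turns out the problem was not related to escaping but to the way alt_id was indexed and to the use of TermQuery. TermQuery does not run its text through an analyzer, so it only matches when the term is identical to a token stored in the index. There are two possible solutions:

Replace TermQuery with the query returned by QueryParser.parse, using a QueryParser created with StandardAnalyzer (altId still needs QueryParser.escape here, because ':' is query syntax).
Or index alt_id as Index.NOT_ANALYZED and stick with TermQuery.

I implemented the second option and it worked well.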
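
To make the mismatch concrete, here is a minimal sketch in plain Java that does not use Lucene at all. The class AnalyzedVsExactDemo and all of its methods are hypothetical, and toyAnalyze is only a crude stand-in (lowercase, then split on ':' and '-'); StandardAnalyzer applies its own, more involved rules. The point is only to show why an exact lookup can find 379474 but not ncbi-geneid:379474 or Xl.24622 when the field was analyzed at index time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AnalyzedVsExactDemo {

    // Hypothetical stand-in for an analyzer: lowercases, then splits on ':' and '-'.
    // This is NOT StandardAnalyzer's real rule set, only enough to show the effect.
    public static List<String> toyAnalyze(String value) {
        List<String> tokens = new ArrayList<String>();
        for (String part : value.toLowerCase(Locale.ROOT).split("[:\\-]")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // "Analyzed" indexing: only the individual tokens become searchable terms.
    public static Set<String> indexAnalyzed(List<String> values) {
        Set<String> terms = new HashSet<String>();
        for (String value : values) {
            terms.addAll(toyAnalyze(value));
        }
        return terms;
    }

    // "Not analyzed" indexing: each value is stored as one untouched term.
    public static Set<String> indexVerbatim(List<String> values) {
        return new HashSet<String>(values);
    }

    // Exact lookup, analogous to TermQuery: the input is compared as-is.
    public static boolean termLookup(Set<String> terms, String term) {
        return terms.contains(term);
    }

    // Lookup that analyzes the input first, loosely analogous to the first option.
    public static boolean analyzedLookup(Set<String> terms, String input) {
        return terms.containsAll(toyAnalyze(input));
    }

    public static void main(String[] args) {
        List<String> altIds = Arrays.asList(
                "ncbi-geneid:379474", "unigene:Xl.24622", "xla:379474");
        Set<String> analyzed = indexAnalyzed(altIds);
        Set<String> verbatim = indexVerbatim(altIds);

        for (String altId : Arrays.asList("379474", "ncbi-geneid:379474", "Xl.24622")) {
            System.out.println(altId
                    + " | exact on analyzed: " + termLookup(analyzed, altId)
                    + " | analyzed on analyzed: " + analyzedLookup(analyzed, altId)
                    + " | exact on verbatim: " + termLookup(verbatim, altId));
        }
    }
}
```

One trade-off the sketch also shows: with the verbatim (not analyzed) index, only the full stored value such as ncbi-geneid:379474 matches, so a bare 379474 no longer finds the document.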

2 answers

solution1
1 2012-06-22 09:26:04

solution2
0 ACCEPTED 2012-06-26 09:23:29
