How to properly escape a Lucene query?
I'm quite new to Lucene and recently ran into a problem. I have a Lucene document that looks like this:
--- type ---
gene
--- id ---
xla:379474
--- alt_id ---
emb:BC054227
gb:BC054227
ncbi-geneid:379474
ncbi-gi:148230166
rs:NM_001086315
rs:NP_001079784
unigene:Xl.24622
xla:379474
I created the query below in order to retrieve that document. It works fine for altId = 379474 but not for altId = ncbi-geneid:379474 or Xl.24622. I guessed altId must be escaped and tried String altId = QueryParser.escape(altId), with no luck. Is that the expected behavior of the query? Am I missing something?
Query query1 = new TermQuery(new Term("type", "gene"));
Query query2 = new TermQuery(new Term("alt_Id", altId));
BooleanQuery query = new BooleanQuery();
query.add(query1, BooleanClause.Occur.MUST);
query.add(query2, BooleanClause.Occur.MUST);
By the way, I'm running Lucene v3.0.
This should help you. Try it and let me know: http://www.strongd.net/?p=44
It turns out the problem was not related to escaping but to the way alt_id was indexed and to the use of TermQuery. A TermQuery is not run through an analyzer, so its term has to match an indexed token exactly. There are two possible solutions:
1. Replace TermQuery with the output of QueryParser.parse, using a QueryParser created with StandardAnalyzer.
2. Index alt_id as Index.NOT_ANALYZED and stick with TermQuery.
I implemented the last one and it worked well.
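To illustrate why the exact term lookup failed, here is a minimal, Lucene-free sketch in plain Java. Everything in it is hypothetical and only for illustration: toyAnalyze is a crude stand-in for an analyzer (it lowercases and splits on anything that is not a letter or digit) and does not reproduce the real StandardAnalyzer rules; indexAnalyzed, indexNotAnalyzed, termLookup and analyzedLookup are made-up helpers that roughly model an analyzed field, a NOT_ANALYZED field, a TermQuery, and a query whose text goes through the same analyzer as the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class AnalysisSketch {

    // Crude stand-in for an analyzer: lowercase, then split on every
    // character that is not a letter or digit. NOT the real StandardAnalyzer.
    static List<String> toyAnalyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Models an analyzed field: the index holds the tokens, not the raw values.
    static Set<String> indexAnalyzed(List<String> values) {
        Set<String> terms = new HashSet<String>();
        for (String value : values) {
            terms.addAll(toyAnalyze(value));
        }
        return terms;
    }

    // Models a NOT_ANALYZED field: each raw value is stored as a single term.
    static Set<String> indexNotAnalyzed(List<String> values) {
        return new HashSet<String>(values);
    }

    // Models a TermQuery: the term is compared as-is, with no analysis.
    static boolean termLookup(Set<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    // Rough model of solution 1: analyze the query text the same way as the
    // indexed text, then require every resulting token to be present.
    static boolean analyzedLookup(Set<String> indexedTerms, String queryText) {
        List<String> tokens = toyAnalyze(queryText);
        return !tokens.isEmpty() && indexedTerms.containsAll(tokens);
    }

    public static void main(String[] args) {
        List<String> altIds = Arrays.asList(
                "emb:BC054227", "ncbi-geneid:379474", "unigene:Xl.24622", "xla:379474");
        Set<String> analyzed = indexAnalyzed(altIds);
        Set<String> raw = indexNotAnalyzed(altIds);

        System.out.println(termLookup(analyzed, "379474"));                 // true
        System.out.println(termLookup(analyzed, "ncbi-geneid:379474"));     // false
        System.out.println(termLookup(analyzed, "Xl.24622"));               // false
        System.out.println(analyzedLookup(analyzed, "ncbi-geneid:379474")); // true
        System.out.println(termLookup(raw, "ncbi-geneid:379474"));          // true
        System.out.println(termLookup(raw, "379474"));                      // false
    }
}
```

Note the trade-off in the last two lines: in this toy model, the not-analyzed variant matches only the complete stored value, so a bare 379474 or Xl.24622 no longer matches and the full identifier (for example unigene:Xl.24622) has to be supplied.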