
How to escape the plus sign in Lucene 4.1.0

The query I have is cs_contents:(canal+) OR cs_docs:(canal+) OR cs_annots:(canal+)

When it is passed to Lucene, the query becomes +((cs_contents:canal cs_contents:canal) (cs_docs:canal cs_docs:canal) (cs_annots:canal cs_annots:canal)) +DBName:dPortal +TableName:CASE_ACTION

Even if I escape the plus with a backslash, it doesn't work, because the backslash is a special character in this library too.

Then I suggest you try escaping the backslash as well: \\\\ +

I think you can also put the text in quotes, like this:

cs_contents:"(canal+)" OR cs_docs:"(canal+)" OR cs_annots:"(canal+)"

A double backslash might do the trick (that's how org.apache.lucene.queryparser.flexible.standard.QueryParserUtil does it), but it will work if and only if the + is actually present in the indexed field value!

If you tokenized the field during indexing, the + character is unlikely to be part of the indexed value. And if you use the same tokenizing analyzer at query parsing, you are not going to search for the + either, regardless of escaping (the query string is passed through the analyzer).

One workaround is not to tokenize fields that contain relevant special characters, and to use a non-tokenizing analyzer (e.g. a KeywordAnalyzer) at query parsing, provided you can distinguish the queries with special characters from the ones without.

Usually the values used by Lucene (in the index, in queries) are NOT the exact values passed to Lucene. For example, strings are usually lowercased and tokenized (split into words, stripped of special characters). This depends on the Analyzer used, the field types, etc.

Indexing:

Analyzer(field value) = value stored in index

Query time:

QueryParser(Analyzer(query string)) = query passed to Lucene
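
To make the two formulas above concrete, here is a minimal sketch in plain Java. ToyAnalyzer and analyze are hypothetical names, not Lucene classes; the code only imitates what a typical tokenizing analyzer does (lowercase, split on anything that is not a letter or digit). It suggests why the + is gone both from the index and from the query, whether or not it was escaped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy stand-in for a tokenizing analyzer (assumption: NOT a Lucene class).
final class ToyAnalyzer {
    // Lowercases the input and splits it on every run of characters
    // that are neither letters nor digits.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Index time: the field value loses its '+'.
        System.out.println(analyze("Canal+"));   // prints [canal]
        // Query time: the escaped text canal\+ ends up as the same token.
        System.out.println(analyze("canal\\+")); // prints [canal]
    }
}
```

With an analyzer like this on both sides, a search for canal+ can only ever match the token canal, which is consistent with the rewritten query shown in the question.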

The QueryParser documentation explains what should be escaped. The programmatic way to perform such escaping is to use QueryParserBase.escape(String).
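
As a rough illustration of what such escaping does, here is a hand-rolled sketch in plain Java. EscapeSketch is a hypothetical class, not Lucene's code, and its list of special characters is an assumption based on the characters the query syntax documentation names. In a real project, calling QueryParserBase.escape(String) directly is the safer choice.

```java
// Hand-rolled sketch of query-syntax escaping (assumption: NOT Lucene's own code).
final class EscapeSketch {
    // Characters treated as query syntax in this sketch (assumed set).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Puts a backslash in front of every character found in SPECIAL.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Only the user-supplied term is escaped, not the field name or parentheses.
        System.out.println(escape("canal+"));                          // prints canal\+
        System.out.println("cs_contents:(" + escape("canal+") + ")"); // prints cs_contents:(canal\+)
    }
}
```

Note that in Java source the single backslash of the query text canal\+ has to be written as "canal\\+", because the backslash is also the escape character of Java string literals. Escaping only gets the + past the query parser; whether it then matches anything still depends on the analyzer, as described above.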
