
How to escape the plus sign in Lucene library 4.1.0

The query I have is cs_contents:(canal+) OR cs_docs:(canal+) OR cs_annots:(canal+)

When it is passed to Lucene, the query becomes +((cs_contents:canal cs_contents:canal) (cs_docs:canal cs_docs:canal) (cs_annots:canal cs_annots:canal)) +DBName:dPortal +TableName:CASE_ACTION

Even if I escape the plus with a backslash, it doesn't work, because the backslash is a special character in this library too.

Then I suggest you try escaping the backslash as well: \\\\ +

I think you can write the text inside quotes, like this:

cs_contents:"(canal+)" OR cs_docs:"(canal+)" OR cs_annots:"(canal+)"

A double backslash might do the trick (that's how org.apache.lucene.queryparser.flexible.standard.QueryParserUtil does it), but this will work if and only if the + is in the field index!
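If the query string is written as a literal in Java source (an assumption, the question does not say where the string comes from), the doubling is simply Java's own escaping: two backslashes in source become one backslash at runtime, and that single backslash is what the query parser sees in front of the +. A minimal sketch, with a hypothetical class name:

```java
// Hypothetical demo class: shows what text the query parser would actually receive.
class BackslashLiteralDemo {
    // In Java source, "\\" denotes ONE backslash character at runtime.
    static String escapedQuery() {
        return "cs_contents:(canal\\+)";
    }

    public static void main(String[] args) {
        String q = escapedQuery();
        System.out.println(q);          // the text handed to the query parser
        System.out.println(q.length()); // the backslash counts as a single character
    }
}
```

This only gets the escaped + through to the parser; whether it matches anything still depends on how the field was indexed.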

If you tokenized the field during indexing, the + character is unlikely to be part of the indexed value, and if you use the same tokenizing analyzer at query parsing, you will not be searching for the + either, regardless of escaping (the query string is passed through the analyzer).

One workaround is not to tokenize fields that contain relevant special characters, and to use a non-tokenizing analyzer (e.g. a KeywordAnalyzer) at query parsing, provided you can distinguish the queries with special characters from the ones without.

Usually the values used by Lucene (in the index, in queries) are NOT the exact values passed to Lucene; e.g. strings are usually lowercased (self-explanatory) and tokenized (split into words, stripped of special characters). This depends on the Analyzer used, field types, etc.

Indexing:

Analyzer(field value) = value stored in index

Query time:

QueryParser(Analyzer(query string)) = query passed to Lucene
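The two steps above can be illustrated with toy stand-ins for analyzers. This is a rough sketch in plain Java, not the Lucene API: the class and method names are hypothetical, and the tokenizing rule (lowercase, split on anything that is not a letter or digit) is an assumed simplification of what a typical tokenizing analyzer does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy stand-ins for Lucene analyzers; NOT the Lucene API.
class AnalyzerSketch {
    // Tokenizing analysis: lowercase, then split on anything that is not a letter or digit.
    static List<String> tokenizing(String text) {
        List<String> terms = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                terms.add(part);
            }
        }
        return terms;
    }

    // Keyword-style analysis: the whole value is kept as a single, unchanged term.
    static List<String> keyword(String text) {
        List<String> terms = new ArrayList<>();
        terms.add(text);
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(tokenizing("Canal+ live")); // index time: the + is dropped
        System.out.println(tokenizing("canal+"));      // query time: the + is dropped again
        System.out.println(keyword("canal+"));         // keyword-style: the + survives
    }
}
```

With the tokenizing variant on both sides, the + never reaches the index or the query, so escaping cannot help; with the keyword-style variant the + survives, but the term then has to match the stored value exactly.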

The QueryParser documentation explains what should be escaped. The programmatic way to perform such escaping is QueryParserBase.escape(String).
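To show what such escaping amounts to, here is a hand-written sketch in plain Java that prefixes each query-syntax character with a backslash. It is an illustration only: the class name is hypothetical and the character list is my assumption of the classic query syntax characters, so check it against the QueryParser documentation for your version. In real code, calling QueryParserBase.escape(String) is the safer choice.

```java
// Hand-written illustration; in real code prefer Lucene's own QueryParserBase.escape(String).
class EscapeSketch {
    // Characters treated as query syntax in this sketch (assumed list, verify against the docs).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    static String escape(String raw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\'); // prefix each special character with a backslash
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Escape only the user-supplied term, not the field name or the grouping parentheses.
        System.out.println("cs_contents:(" + escape("canal+") + ")");
    }
}
```

Note that only the search term is escaped here. Escaping the whole query string would also escape the colon and parentheses that are meant as syntax.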


 