
cfsearch - Error executing query : org.apache.lucene.queryParser.ParseException: Cannot parse : Lexical error

Original post: 2016-01-08 13:35:28 9 1 coldfusion, lucene, cfsearch

I've got a basic cfsearch that works fine, but occasionally it can be broken by search strings like the following:

my search string]
"my search string
my search string[
my search: string

Any of the above will result in an error like:

Error executing query : org.apache.lucene.queryParser.ParseException: Cannot parse '"my search string': Lexical error at line 1, column 32. Encountered: after : "\\"my search string"

I was thinking I could strip out those characters, but you might have a working search term with, say, two double quotes, i.e. "my search string", which is valid. Is there a preferable way to prepare a string for cfsearch?

So, in the example of:

"my search string

it would strip out the first ". But if the search term was:

"my search string"

all good, leave it alone. Any ideas?! Are there any other characters that can cause an error? For example, a hacker tried this:

XyOk,'.](.]]]'

Which caused an error.

1 Answer

Use the VerityClean UDF from CFLib to sanitize the Verity/Lucene search parameter. (NOTE: Add :, ^ and * to the pipe-delimited reBadChars variable so they will be stripped for Lucene.)
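
To illustrate the idea (this is not the VerityClean UDF itself), here is a minimal Java sketch of the same kind of clean-up: it replaces characters that the classic Lucene query syntax treats as operators with spaces, and drops a double quote only when the quotes are unbalanced, so a valid phrase such as "my search string" is left alone. The class name `SearchCriteriaCleaner`, the method name `clean`, and the exact character list are assumptions made for this example; the list may need adjusting for the Lucene version bundled with your ColdFusion release, and the same logic could be ported to a CFML function.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of ColdFusion, Lucene, or CFLib.
final class SearchCriteriaCleaner {

    // Assumed list of query-syntax operators to strip: \ + - ! ( ) { } [ ] ^ ~ * ? : / && ||
    // The double quote is handled separately so balanced phrases survive.
    private static final Pattern SPECIAL =
            Pattern.compile("[\\\\+\\-!(){}\\[\\]^~*?:/]|&&|\\|\\|");

    private SearchCriteriaCleaner() {
    }

    static String clean(String raw) {
        if (raw == null) {
            return "";
        }

        // Replace each operator with a space so neighbouring words stay separated.
        String s = SPECIAL.matcher(raw).replaceAll(" ");

        // An odd number of double quotes means one is unmatched: drop the last one.
        long quotes = s.chars().filter(c -> c == '"').count();
        if (quotes % 2 == 1) {
            int i = s.lastIndexOf('"');
            s = s.substring(0, i) + s.substring(i + 1);
        }

        // Tidy up the whitespace left behind by the replacements.
        return s.trim().replaceAll("\\s+", " ");
    }
}
```

With this sketch, each of the failing strings from the question (`my search string]`, `"my search string`, `my search string[`, `my search: string`) is reduced to `my search string`, while `"my search string"` passes through unchanged. Note the trade-off: stripping operators such as `+`, `-` and `*` also removes the ability to use them deliberately in a search.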