Lucene query in C# not finding results with punctuation

I have a search bar that executes a Lucene query on the "description" field, but it doesn't return results when the search term contains an apostrophe. For example, I have a product whose description is Herter's® EZ-Load 200lb Feeder - 99018. When I search for "Herter", I get results, but I get no results if I search for "Herter's" or "Herters". This is my search code:

var query = Request.QueryString["q"];
var search = HttpContext.Current.Server.UrlDecode(query);

var rewardProductLookup = new RewardCatalogDataHelper();
RewardProductSearchCriteria criteria = new RewardProductSearchCriteria()
{
    keywords = search,
    pageSize = 1000,
    sortDirection = "desc"
};

IEnumerable<SkinnyItem> foundProducts = rewardProductLookup.FindByKeywordQuery(criteria);

public IEnumerable<SkinnyItem> FindByKeywordQuery(RewardProductSearchCriteria query)
{
    var luceneIndexDataContext = new LuceneDataContext("rewardproducts", _dbName);
    string fieldToQuery = "rpdescription";
    bool sortDirection = query.sortDirection.ToLower().Equals("desc");

    MultiPhraseQuery multiPhraseQuery = new MultiPhraseQuery();
    var keywords = query.keywords.ToLower().Split(',');
    foreach (var keyword in keywords)
    {
        if (!String.IsNullOrEmpty(keyword))
        {
            var term = new Term(fieldToQuery, keyword);
            multiPhraseQuery.Add(term);
        }
    }

    var booleanQuery = new BooleanQuery();
    booleanQuery.Add(multiPhraseQuery, BooleanClause.Occur.MUST);

    return
        luceneIndexDataContext.BooleanQuerySearch(booleanQuery, fieldToQuery, sortDirection)
            .Where(i => i.Fields["eligibleforpurchase"] == "1");
}

The problem here is analysis. You haven't specified the analyzer being used in this case, so I'll assume it's StandardAnalyzer.

When analyzed at index time, the term "Herter's" is translated to "herter". However, no analyzer is being applied in your FindByKeywordQuery method, so the raw search string is compared directly against the indexed tokens: looking for "herter" works, but "herter's" doesn't.
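To check what the analyzer actually does to the description, it can help to print the tokens it produces. The following is a minimal sketch, assuming Lucene.Net 3.0.3 (the `LUCENE_30` constant, the generic `AddAttribute<ITermAttribute>()` call and the `Term` property belong to that release; other releases expose a different token-stream API) and assuming the index really was built with StandardAnalyzer. `AnalyzerDemo` and `Tokenize` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Tokenattributes;

public static class AnalyzerDemo
{
    // Returns the tokens that the given analyzer emits for the given text.
    public static List<string> Tokenize(Analyzer analyzer, string text)
    {
        var tokens = new List<string>();
        // The field name only matters for per-field analyzers; "rpdescription" comes from the question.
        TokenStream stream = analyzer.TokenStream("rpdescription", new StringReader(text));
        var termAttribute = stream.AddAttribute<ITermAttribute>();
        stream.Reset();
        while (stream.IncrementToken())
        {
            tokens.Add(termAttribute.Term);
        }
        stream.End();
        stream.Dispose();
        return tokens;
    }

    public static void Main()
    {
        // Assumption: the same version and analyzer that were used to build the index.
        var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);
        foreach (string token in Tokenize(analyzer, "Herter's® EZ-Load 200lb Feeder - 99018"))
        {
            Console.WriteLine(token);
        }
    }
}
```

Comparing the printed tokens with the raw search string shows whether the value passed to `new Term(...)` can match anything in the index.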

One solution would be to use the QueryParser instead of manually constructing a MultiPhraseQuery. The QueryParser will handle tokenizing, lowercasing, and so on. Something like:

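// VERSION and "text" are placeholders: use your index's Lucene version and field name (e.g. fieldToQuery).
QueryParser parser = new QueryParser(VERSION, "text", new StandardAnalyzer(VERSION));
// Named parsedQuery so it does not clash with the "query" parameter of FindByKeywordQuery.
Query parsedQuery = parser.Parse("\"" + query.keywords + "\"");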
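
Applied to the method from the question, the QueryParser approach might look roughly like the sketch below. It assumes Lucene.Net 3.0.3 and an index built with StandardAnalyzer; `KeywordQueryBuilder` and `BuildDescriptionQuery` are hypothetical names. `QueryParser.Escape` is added so that characters with special meaning in the query syntax (such as `-` or `:`) are treated literally when they appear in user input.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

public static class KeywordQueryBuilder
{
    // Assumption: this must match the version and analyzer used when the index was built.
    private static readonly Lucene.Net.Util.Version LuceneVersion =
        Lucene.Net.Util.Version.LUCENE_30;

    // Builds an analyzed phrase query for the given field from raw user input.
    public static Query BuildDescriptionQuery(string fieldToQuery, string keywords)
    {
        var analyzer = new StandardAnalyzer(LuceneVersion);
        var parser = new QueryParser(LuceneVersion, fieldToQuery, analyzer);

        // Escape query-syntax characters, then wrap in quotes so the input is parsed as one phrase.
        string phrase = "\"" + QueryParser.Escape(keywords) + "\"";
        return parser.Parse(phrase);
    }
}
```

The returned Query could then be added to the BooleanQuery in FindByKeywordQuery with a MUST clause, in place of the hand-built MultiPhraseQuery.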

In a SQL query, the single quote is the delimiter for text values.

Select * FROM Product where Description = 'foo' 

You will need to escape or double any single quote in your query. Try this in the loop:

foreach (var keyword in keywords)
{
    if (!String.IsNullOrEmpty(keyword))
    {
        // Term has no Replace method, so double the quote on the keyword string itself.
        var term = new Term(fieldToQuery, keyword.Replace("'", "''"));
        multiPhraseQuery.Add(term);
    }
}

You could also create an extension method:

    [DebuggerStepThrough]
    public static string SanitizeSQL(this string value)
    {
        return value.Replace("'", "''").Replace("\\", "\\\\");
    }

in which case you could then do this in the loop:

foreach (var keyword in keywords)
{
    if (!String.IsNullOrEmpty(keyword))
    {
        var term = new Term(fieldToQuery, keyword.SanitizeSQL());
        multiPhraseQuery.Add(term);
    }
}

Hope this helps.
