How to boost relevance for exact match results when using wildcard in sphinx?
This is my search request:
(new SphinxSearch())
->search((new SphinxClient())->escapeString($query) . '*', 'services')
->setMatchMode(SphinxClient::SPH_MATCH_EXTENDED)
->setFieldWeights([
'name' => 10,
'legal_name' => 10,
'description' => 10,
'keywords' => 10,
'category_name' => 3,
'categories' => 3,
])
->setSortMode(SphinxClient::SPH_SORT_EXTENDED, "@weight DESC")
->setRankingMode(SphinxClient::SPH_RANK_SPH04)
->get(true);
And here is the index config:
index services
{
source = services
path = /var/lib/sphinxsearch/data/services
docinfo = extern
morphology = stem_enru
min_stemming_len = 1
min_word_len = 1
min_infix_len = 1
html_strip = 1
index_exact_words = 1
expand_keywords = 1
mlock = 0
charset_table = 0..9, A..Z->a..z, _, *, -, a..z, \
U+2C->U+2E, U+2E, U+0044, U+0046, U+0130, U+0401->U+0435, U+0451->U+0435, U+410..U+42F->U+430..U+44F, U+430..U+44F
}
For a query "school №4" it returns all relevant results, but with something like "school №42" at the top, while the exact match "school №4" is close to the bottom of the result set.
Well, it's not actually an exact match: the service name may contain other words and symbols. But it's the closest to what the user entered in the search field, so I believe it should be more relevant than results matched through the wildcard.
How can I move the "exact" match to the top of the set?
PS: I'm using this Laravel-specific wrapper for SphinxClient, though I don't think that matters.
One option: try the expand_keywords option (http://sphinxsearch.com/docs/current.html#conf-expand-keywords).
It can possibly improve the search quality, as documents with exact form matches should generally be ranked higher than documents with stemmed or infix matches.
Alas, it's an index-level config rather than a query option. You can then remove the * from the query.
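As a rough sketch of what that relies on: per the Sphinx documentation, expand_keywords rewrites each plain keyword into its stemmed, infix and exact forms, and the exact and infix forms are only produced when index_exact_words and min_infix_len (or min_prefix_len) are enabled. The fragment below just restates the settings already present in the question's config, with comments added; it is not a complete index definition.

```
index services
{
    # ... other settings as in the question ...
    min_infix_len     = 1   # needed for the *keyword* (infix) expansion
    index_exact_words = 1   # needed for the =keyword (exact form) expansion
    expand_keywords   = 1   # school -> ( school | *school* | =school )
}
```

With this in place the query can be sent without the trailing `*`, i.e. by passing only the escaped `$query` to `search()`.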
The current solution that I have: make two queries, with and without the wildcard, and then merge the results with the exact matches on top. It works, but it's obviously not ideal.
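The merge step can be sketched roughly like this. It is only an illustration: the function name `mergeExactFirst`, the sample document IDs and the weights are all hypothetical, and it assumes that each result set is an array keyed by document ID and already sorted by weight.

```php
<?php
/**
 * Merge two ranked result sets so that hits from the non-wildcard query
 * come first, followed by hits found only by the wildcard query.
 * Assumption: both arrays are keyed by document ID and already ranked.
 */
function mergeExactFirst(array $exact, array $wildcard): array
{
    // Array union keeps every entry of the left operand, in order, and
    // appends only those right-hand entries whose keys are not present yet.
    return $exact + $wildcard;
}

// Hypothetical sample data: document 4 matches "school №4" without the wildcard.
$exact    = [4  => ['weight' => 2500]];
$wildcard = [42 => ['weight' => 2600], 4 => ['weight' => 2500], 7 => ['weight' => 2400]];

$merged = mergeExactFirst($exact, $wildcard);
```

Using the array union operator rather than `array_merge` matters here, because `array_merge` renumbers integer keys and would keep the duplicate entry for document 4.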