Elasticsearch wildcard search and relevance
I am trying to implement wildcard search for a suggestion dropdown. I have been trying to figure this out for a few days now. :(
I have a list of restaurants (4000-7000). I want to run a wildcard search on the restaurant names and display first the results where the search term is at the start of the name.
I tried indexing the name field without an analyzer, with an ngram analyzer, and with many other solutions I found on the net, but without luck.
The best results so far come from this setup:
settings:
  analysis: {
    analyzer: {
      default: {
        tokenizer: :keyword,
        filter: [:lowercase]
      }
    }
  }
And I index the name field like this:
indexes :name, type: :string, analyzer: :default
Search: query: {wildcard: {name: '*le*'}}
Result: Mr. Beef on Orleans, Miller's Pub, Merlo on Maple, Le Bouchon, Les Nomades, Leonardo's Ristorante, Lem's Bar-BQ House, Le Petit Paris, Joy Yee's Noodles - Chinatown, J. Alexander's (Lincoln Park), Indian Garden - Streeterville, Goose Island Brewpub - Wrigleyville, Tweet ... Let's Eat!, Arco de Cuchilleros, Al's #1 Italian Beef - Little Italy
I want the results that start with 'le' to come first, with a higher score, because people usually search for the beginning of a restaurant's name. But I cannot drop the leading *, because I also want the results that merely contain the term, just with a lower score. In the example above, 'Le Colonial', 'Le Petit Paris', and 'Les Nomades' should be in front.
How can I accomplish this?
My other concern is performance. I know that a wildcard on both ends is the worst possible case, but I could not find any solution with ngram or shingle that gives me acceptable results.
Use boost to put the names that start with the search term on top.
Using two wildcard queries:
curl -XPOST "http://hostname:9200/index/type/_search" -d'
{
  "size": 2000,
  "query": {
    "bool": {
      "should": [
        {
          "wildcard": {
            "name": {
              "value": "*le*"
            }
          }
        },
        {
          "wildcard": {
            "name": {
              "value": "le*",
              "boost": 5
            }
          }
        }
      ]
    }
  }
}'
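To see why the boosted clause pushes the names starting with the term to the front, here is a rough local simulation in Ruby. It does not talk to Elasticsearch. It assumes that every matching wildcard clause contributes a constant score equal to its boost (1 when no boost is given) and that the bool query adds up the scores of the matching should clauses, which is an approximation of the real scoring. Lowercasing the whole name stands in for the keyword tokenizer plus lowercase filter. The method name rank_names is made up for this illustration.

```ruby
# Local simulation only: nothing here calls Elasticsearch.
# Assumption: each matching should clause adds its boost to the score.
def rank_names(names, term, prefix_boost: 5)
  needle = term.downcase
  scored = names.map do |name|
    token = name.downcase               # keyword tokenizer + lowercase filter
    next unless token.include?(needle)  # the "*le*" clause, default boost 1
    score = 1
    score += prefix_boost if token.start_with?(needle)  # the "le*" clause
    [name, score]
  end.compact
  # Highest score first; the index keeps the input order for equal scores
  scored.sort_by.with_index { |(_name, score), i| [-score, i] }
end

names = ["Mr. Beef on Orleans", "Miller's Pub", "Le Bouchon",
         "Les Nomades", "Arco de Cuchilleros"]
rank_names(names, "le").each { |name, score| puts "#{score}  #{name}" }
```

Names that match both clauses collect both scores, so they sort ahead of names that only contain the term somewhere in the middle.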
Using one wildcard query and one prefix query:
curl -XPOST "http://hostname:9200/index/type/_search" -d'
{
  "size": 2000,
  "query": {
    "bool": {
      "should": [
        {
          "wildcard": {
            "name": {
              "value": "*le*"
            }
          }
        },
        {
          "prefix": {
            "name": {
              "value": "le",
              "boost": 2
            }
          }
        }
      ]
    }
  }
}'
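If the request body is built from what the user types, the text probably needs to be lowercased first: the index stores each name as one lowercased token, and in the older Elasticsearch versions that this mapping targets, wildcard and prefix patterns are not run through the analyzer (this is an assumption worth checking against your version). A minimal Ruby sketch that builds the wildcard-plus-prefix body above from raw input follows. The method name suggestion_query and its parameters are hypothetical, and the escaping only covers the wildcard characters * and ? plus the backslash.

```ruby
require 'json'

# Hypothetical helper: builds a bool/should body with a wildcard clause
# (matches anywhere in the name) and a boosted prefix clause (matches at
# the start of the name).
def suggestion_query(input, field: 'name', prefix_boost: 2, size: 2000)
  term = input.to_s.strip.downcase
  # Escape the wildcard metacharacters and the escape character itself,
  # so typed "*" or "?" are treated literally in the wildcard clause.
  escaped = term.gsub(/[\\*?]/) { |ch| "\\#{ch}" }
  {
    size: size,
    query: {
      bool: {
        should: [
          { wildcard: { field => { value: "*#{escaped}*" } } },
          { prefix:   { field => { value: term, boost: prefix_boost } } }
        ]
      }
    }
  }
end

puts JSON.pretty_generate(suggestion_query('Le'))
```

The printed JSON can be sent as the body of the same _search request shown above.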