Using django-haystack + Elasticsearch, how can I search for substrings of a word?
If I enter the query "apple", I want results like "xyzapplexyz", "apple", and "applexyz", and NOT results like "app" or "appl". What I am actually getting is "applexyz", "app", etc.
I have used an EdgeNgram field and tried querying with each of the following:
1. SearchQuerySet().all().autocomplete(authors=query)
2. SearchQuerySet().all().filter(authors=query)
3. SearchQuerySet().all().filter(content=query)
4. SearchQuerySet().all().autocomplete(content=query)
But none of them gives the required results. How can I resolve this issue?
If you want results like "xyzapplexyz", then you need to use an ngram analyzer instead of EdgeNGram, or you could use both depending on your requirements. EdgeNGram generates tokens only from the beginning of a term, so a match in the middle of a word is never indexed.
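To make the difference concrete, here is a minimal pure-Python sketch of the two tokenizing strategies. It does not call Elasticsearch or django-haystack; the function names and the `min_gram`/`max_gram` defaults are illustrative assumptions, not the values of any particular backend.

```python
def edge_ngrams(term, min_gram=2, max_gram=15):
    """Prefixes of `term` with a length from min_gram to max_gram."""
    longest = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, longest + 1)]


def ngrams(term, min_gram=2, max_gram=15):
    """Every substring of `term` with a length from min_gram to max_gram."""
    longest = min(max_gram, len(term))
    return [
        term[start:start + n]
        for n in range(min_gram, longest + 1)
        for start in range(len(term) - n + 1)
    ]


# Edge ngrams only ever start at the first character, so "apple" is missing.
print("apple" in edge_ngrams("xyzapplexyz"))         # False
# Plain ngrams slide over the whole term, so the inner "apple" is produced.
print("apple" in ngrams("xyzapplexyz"))              # True
# If max_gram is smaller than the query length, the token is never generated.
print("apple" in ngrams("xyzapplexyz", max_gram=4))  # False
```

In django-haystack the corresponding field types are `EdgeNgramField` and `NgramField`, so switching the field declaration in the `SearchIndex` (and rebuilding the index) is the likely place to apply this.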
With NGram, apple will be one of the tokens generated for the term xyzapplexyz, assuming max_gram >= 5 (and min_gram <= 5), so you will get the expected results. The search_analyzer also needs to be different from the index-time analyzer, or you will get weird results: if the query itself is split into ngrams, its shorter pieces such as "app" can match terms you did not ask for.
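The effect of the search analyzer can be simulated without a search server. The sketch below is a simplified model, assuming that a document matches when it shares at least one token with the analyzed query; the helper names (`build_index`, `search`, `whole_term`) are hypothetical and the `min_gram=3`, `max_gram=15` values are only example settings.

```python
from collections import defaultdict


def ngrams(term, min_gram=3, max_gram=15):
    """Every substring of `term` with a length from min_gram to max_gram."""
    longest = min(max_gram, len(term))
    return {
        term[start:start + n]
        for n in range(min_gram, longest + 1)
        for start in range(len(term) - n + 1)
    }


def whole_term(term):
    """Search-time analysis that only lowercases and keeps the term intact."""
    return {term.lower()}


def build_index(values):
    """Index-time analysis: map every ngram to the values that contain it."""
    index = defaultdict(set)
    for value in values:
        for token in ngrams(value.lower()):
            index[token].add(value)
    return index


def search(index, query, search_analyzer):
    """Return the values that share at least one token with the analyzed query."""
    hits = set()
    for token in search_analyzer(query):
        hits |= index.get(token, set())
    return hits


index = build_index(["xyzapplexyz", "apple", "applexyz", "app", "appl"])

# Same ngram analysis at search time: "apple" is split into "app", "appl", ...
same_analyzer = search(index, "apple", ngrams)
# Separate search analysis: the query stays a single token, "apple".
separate_analyzer = search(index, "apple", whole_term)

print(sorted(same_analyzer))      # ['app', 'appl', 'apple', 'applexyz', 'xyzapplexyz']
print(sorted(separate_analyzer))  # ['apple', 'applexyz', 'xyzapplexyz']
```

The first result reproduces the unwanted "app" and "appl" hits from the question; the second is the desired set. How a separate search analyzer is configured for a real index depends on the haystack backend and Elasticsearch version in use.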
Also, the index size might get pretty big with ngram if you are indexing a huge chunk of text.
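As a rough illustration of why the index grows, the sketch below counts how many tokens a single word produces under each strategy. The functions and the `min_gram=3`, `max_gram=15` settings are assumptions chosen for the example, and the counts ignore any deduplication the search engine may apply.

```python
def ngram_count(length, min_gram=3, max_gram=15):
    """Number of ngrams produced by a word with `length` characters."""
    longest = min(max_gram, length)
    return sum(length - n + 1 for n in range(min_gram, longest + 1))


def edge_ngram_count(length, min_gram=3, max_gram=15):
    """Number of edge ngrams (prefixes) produced by the same word."""
    return max(0, min(max_gram, length) - min_gram + 1)


for word_length in (5, 11, 20):
    print(word_length, edge_ngram_count(word_length), ngram_count(word_length))
```

For an 11-character term such as "xyzapplexyz" this gives 9 edge ngrams against 45 ngrams, and the gap widens as words get longer, which is why narrowing the `min_gram`/`max_gram` range or limiting ngram analysis to short fields can help.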