Django Haystack substring search

I recently added search to my Django-powered site so that employers can search for employees by keyword. When a user first uploads their resume, I convert it to text, strip the stop words, and store the result in a TextField for that user. I'm using Django-Haystack with the Whoosh search backend.

Three things:

1) Aside from extra features that I probably won't use, is there any concrete advantage to switching to Solr or Xapian?

2) By turning the resume into text, I'm essentially indexing the PDF myself. I know both Xapian and Solr support .pdf indexing, but from the looks of it Haystack does not. Any tips on how to get around this? Or should I keep indexing it myself? If so, should I be doing more than simply providing a text file of keywords?

3) Whoosh only returns a result if the search term matches a keyword exactly. If a user has 'mathematics' as a keyword and I search for 'math', I want that user to appear. I couldn't tell definitively whether Xapian or Solr supports this. Thoughts?

Thanks for any suggestions. I'm going to keep digging into this myself for the time being.

Unfortunately, I don't know enough to answer your other questions, but on point 3, Whoosh actually does support this.

You would have to use the autocomplete function of SearchQuerySet.

It is detailed here: http://docs.haystacksearch.org/dev/autocomplete.html

I'm currently using Whoosh myself and matching on partial terms.
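For a rough idea of why this works: the autocomplete documentation linked above has you index the field as an EdgeNgramField and query it with SearchQuerySet().autocomplete(). The sketch below is not Haystack or Whoosh code. It is a plain-Python illustration of the edge n-gram idea, where every leading prefix of each keyword is indexed so that 'math' finds 'mathematics'. The function names, the sample resumes, and the prefix length limits are all assumptions made up for the example.

```python
def edge_ngrams(word, min_len=2, max_len=15):
    """Return the leading prefixes of a word, e.g. 'ma', 'mat', 'math', ..."""
    word = word.lower()
    longest = min(len(word), max_len)
    return {word[:n] for n in range(min_len, longest + 1)}


def build_index(docs):
    """Map each edge n-gram to the set of document ids whose text contains it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            for gram in edge_ngrams(token):
                index.setdefault(gram, set()).add(doc_id)
    return index


def search(index, query):
    """Look up the query as a prefix; returns an empty set when nothing matches."""
    return index.get(query.lower(), set())


# Hypothetical keyword text, standing in for the stop-word-stripped resumes.
resumes = {
    "alice": "mathematics statistics",
    "bob": "carpentry plumbing",
}
index = build_index(resumes)
print(search(index, "math"))
```

Note that this only matches from the start of a word: 'math' finds 'mathematics', but a fragment from the middle of a word such as 'ematics' does not.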
