
Django Haystack substring search

I recently added search capabilities to my Django-powered site so that employers can search for employees by keyword. When a user first uploads their resume, I convert it to text, strip the stop words, and store the result in a TextField for that user. I am using Django-Haystack with the Whoosh search backend.

Three questions:

1) Aside from extra features which I'll probably not use, is there any concrete advantage to switching to Solr or Xapian?

2) In turning the resume into text, I essentially index the PDF myself. I know both Xapian and Solr support PDF indexing; from the looks of it, however, Haystack does not. Any tips on how to get around this? Or should I keep indexing it myself? If so, should I be doing more than simply providing a text file of keywords?

3) Whoosh only returns a result if the search term matches a keyword exactly. If a user has 'mathematics' as a keyword and I search for 'math', I want that user to appear. I couldn't tell definitively whether Xapian or Solr supports this. Thoughts?

Thanks for any suggestions. I'm going to continue digging into this myself for the time being.

Unfortunately, I don't know enough to answer your other questions, but regarding point 3: Whoosh actually does support this.

You would have to use the autocomplete method of SearchQuerySet, which queries a field that your search index declares as an EdgeNgramField (or NgramField).

Detailed here: http://docs.haystacksearch.org/dev/autocomplete.html

I'm currently using Whoosh and getting partial matches this way myself.
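To show why this approach lets 'math' find 'mathematics', here is a minimal pure-Python sketch of edge n-gram indexing, the idea behind autocomplete-style fields. It is not Haystack or Whoosh code: the function names (edge_ngrams, build_index, search), the sample data, and the minimum and maximum gram sizes are all made up for illustration, and a real backend will tokenize and score differently.

```python
def edge_ngrams(word, min_size=2, max_size=15):
    """Return the prefixes of `word` from min_size up to max_size characters.

    The size limits are illustrative assumptions, not values read from any backend.
    """
    word = word.lower()
    upper = min(len(word), max_size)
    return [word[:n] for n in range(min_size, upper + 1)]


def build_index(docs):
    """Map each edge n-gram of each keyword to the ids of the documents that contain it."""
    index = {}
    for doc_id, keywords in docs.items():
        for keyword in keywords:
            for gram in edge_ngrams(keyword):
                index.setdefault(gram, set()).add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents with a keyword that starts with `query`."""
    return sorted(index.get(query.lower(), set()))


# Hypothetical resumes, already reduced to keywords.
resumes = {
    "alice": ["mathematics", "statistics"],
    "bob": ["marketing", "sales"],
}
index = build_index(resumes)

print(search(index, "math"))     # matches alice via the prefix of "mathematics"
print(search(index, "ma"))       # matches both alice and bob
print(search(index, "ematics"))  # no match: only prefixes are indexed
```

Because only prefixes are indexed, a query from the middle of a word ('ematics') finds nothing. Haystack offers NgramField alongside EdgeNgramField for cases where the match does not have to start at the beginning of the word.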
