Examine stripping out search words

I'm using Umbraco and I have Examine up and running; however, words are being stripped out of my query.

For example:

I am searching for "man on the moon" with the following code, where the variable "searchTerm" contains "man on the moon":

var Searcher = ExamineManager.Instance.SearchProviderCollection["MySearcher"];
var searchCriteria = Searcher.CreateSearchCriteria();

var query = searchCriteria.Field("Name", searchTerm).Compile();

However, when I debug, the generated query is:

{ SearchIndexType: , LuceneQuery: +Name:"man moon" }

Notice how it has removed the words "on the" from the searchTerm?

Presumably they are removed because they are treated as stop (reserved) words. However, this means I do not get the search results I expect.

How can I get around this?

Internally, the stop word list defined in the StopAnalyzer class is used by the StandardAnalyzer as part of the standard indexing process. The StopAnalyzer source ( http://lucenenet.apache.org/docs/3.0.3/d7/df5/_stop_analyzer_8cs_source.html#l00054 ) contains an overload that lets you supply a different set of stop words as an ISet parameter instead of the default ENGLISH_STOP_WORDS_SET (line 134).

I also read here ( http://webcache.googleusercontent.com/search?q=cache:sA-uyAC015UJ:our.umbraco.org/m%3Fmode%3Dtopic%26id%3D25600+&cd=2&hl=en&ct=clnk&gl=uk ) that you can get Examine to use an empty set of stop words by adding the following line to the Application_Start method in Global.asax:

Lucene.Net.Analysis.StopAnalyzer.ENGLISH_STOP_WORDS_SET = new System.Collections.Hashtable();

So with an empty set of stop words, your "man on the moon" should be back.
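
To make the effect concrete, here is a minimal sketch of stop word filtering in plain C#. It is not Lucene.Net or Examine code: the StopWordDemo class, its Analyze method, and the short SampleStopWords list are hypothetical names invented for this illustration, and the real analyzer tokenizes far more carefully. It only shows why a non-empty stop word set turns "man on the moon" into "man moon", and why an empty set leaves the phrase intact.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only; this is NOT the Lucene.Net implementation.
public static class StopWordDemo
{
    // A small sample of common English stop words (assumed for the demo,
    // not Lucene's full ENGLISH_STOP_WORDS_SET).
    public static readonly HashSet<string> SampleStopWords =
        new HashSet<string> { "a", "an", "in", "of", "on", "the", "to" };

    // Splits on spaces, lower-cases each token, and drops any token
    // that appears in the supplied stop word set.
    public static string Analyze(string text, ISet<string> stopWords)
    {
        var kept = text
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(token => token.ToLowerInvariant())
            .Where(token => !stopWords.Contains(token));
        return string.Join(" ", kept);
    }

    public static void Main()
    {
        // With stop words: "on" and "the" are removed.
        Console.WriteLine(Analyze("man on the moon", SampleStopWords));       // man moon

        // With an empty set: nothing is removed.
        Console.WriteLine(Analyze("man on the moon", new HashSet<string>())); // man on the moon
    }
}
```

The same reasoning applies to the index: the stop word set has to be the same at index time and at query time, so after changing it you would most likely need to rebuild the index before the full phrase can match.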

A slightly odd idea, but as an alternative, could you also add a StopAnalyzer to ExamineSettings.config to create an index of the documents containing only the stop words, and then AND its results with your StandardAnalyzer result set?
