Exact phrase search using Lucene?

I am using SpanTermQuery to search for an exact phrase in Lucene, but it doesn't seem to work. Here is my code.

Indexing

IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30), false, IndexWriter.MaxFieldLength.UNLIMITED);
Document doc = new Document();
// "contents" and "title" are tokenized by the analyzer; NOT_ANALYZED fields are indexed as single terms
doc.add(new Field("contents", sb.toString(), Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));
doc.add(new Field("imageid", imageDocument.getImageId(), Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.add(new Field("title", imageDocument.getTitle(), Field.Store.YES, Field.Index.ANALYZED));
doc.add(new Field("country", imageDocument.getCountry(), Field.Store.YES, Field.Index.NOT_ANALYZED));
writer.addDocument(doc);

Searching

String sentence = searchParameters.get("searchExactWord");
String[] words = sentence.split(" ");
String queryNoWord = "";
int i = 0;
SpanTermQuery [] clause = new SpanTermQuery[words.length];
for (String word : words)
{
    clause[i] = new SpanTermQuery(new Term("contents",word));
    i++;
}
SpanNearQuery query = new SpanNearQuery(clause, 0, true);
booleanQuery.add(query, BooleanClause.Occur.MUST);

Please guide me if I am doing it wrong.

Prateek

Try a PhraseQuery instead:

PhraseQuery query = new PhraseQuery();
String[] words = sentence.split(" ");
for (String word : words) {
    query.add(new Term("contents", word));
}
booleanQuery.add(query, BooleanClause.Occur.MUST);

Edit: I think you have a different problem. What other parts are there to your booleanQuery? Here's a full working example of searching for a phrase:

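import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriter.MaxFieldLength;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class LucenePhraseQuery {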
    public static void main(String[] args) throws Exception {
        // setup Lucene to use an in-memory index
        Directory directory = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
        MaxFieldLength mlf = MaxFieldLength.UNLIMITED;
        IndexWriter writer = new IndexWriter(directory, analyzer, true, mlf);

        // index a few documents
        writer.addDocument(createDocument("1", "foo bar baz"));
        writer.addDocument(createDocument("2", "red green blue"));
        writer.addDocument(createDocument("3", "test foo bar test"));
        writer.close();

        // search for documents that have "foo bar" in them
        String sentence = "foo bar";
        IndexSearcher searcher = new IndexSearcher(directory);
        PhraseQuery query = new PhraseQuery();
        String[] words = sentence.split(" ");
        for (String word : words) {
            query.add(new Term("contents", word));
        }

        // display search results
        TopDocs topDocs = searcher.search(query, 10);
        for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
            Document doc = searcher.doc(scoreDoc.doc);
            System.out.println(doc);
        }
    }

    private static Document createDocument(String id, String content) {
        Document doc = new Document();
        doc.add(new Field("id", id, Store.YES, Index.NOT_ANALYZED));
        doc.add(new Field("contents", content, Store.YES, Index.ANALYZED,
                Field.TermVector.WITH_POSITIONS_OFFSETS));
        return doc;
    }
}
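
One thing worth checking in code like this: terms passed directly to new Term(...) are not run through the analyzer, while StandardAnalyzer lowercases text at index time. The thread does not confirm that this is the asker's problem, so treat it as an assumption. The plain-Java sketch below (no Lucene; the class, the toy analyze method, and the position map are all hypothetical stand-ins) shows how a query split on spaces can miss a lowercased index, and how analyzing the query the same way as the document fixes that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class PhraseMatchSketch {
    // Toy stand-in for an analyzer (assumption): lowercase, split on non-alphanumerics.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Maps each token to the list of positions at which it occurs.
    static Map<String, List<Integer>> positions(List<String> tokens) {
        Map<String, List<Integer>> index = new HashMap<>();
        for (int pos = 0; pos < tokens.size(); pos++) {
            index.computeIfAbsent(tokens.get(pos), k -> new ArrayList<>()).add(pos);
        }
        return index;
    }

    // True if the terms occur at consecutive positions, in order.
    static boolean containsPhrase(Map<String, List<Integer>> index, List<String> terms) {
        if (terms.isEmpty()) {
            return false;
        }
        List<Integer> none = Collections.emptyList();
        for (int start : index.getOrDefault(terms.get(0), none)) {
            boolean ok = true;
            for (int i = 1; i < terms.size() && ok; i++) {
                ok = index.getOrDefault(terms.get(i), none).contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> doc = positions(analyze("Test Foo Bar test"));
        // Raw split keeps the capitals, so the terms miss the lowercased index: prints false
        System.out.println(containsPhrase(doc, Arrays.asList("Foo Bar".split(" "))));
        // Analyzing the query like the document makes the phrase match: prints true
        System.out.println(containsPhrase(doc, analyze("Foo Bar")));
    }
}
```

In real Lucene code the equivalent step would be to build the query terms from the same analyzer used at index time rather than from a bare split.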

Use the Lucene query builder and put double quotes around the search string. That works for exact phrase search.

Reference: http://www.lucenetutorial.com/lucene-query-builder.html

For version 4.6.0, indexing:

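IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_46, analyzer);
try {
    IndexWriter iwriter = new IndexWriter(mDir, config);
    // start from an empty index
    iwriter.deleteAll();
    iwriter.commit();
    Document doc = new Document();

    // TextField.TYPE_STORED: tokenized, indexed, and stored
    doc.add(new Field(myfieldname, text, TextField.TYPE_STORED));
    iwriter.addDocument(doc);
    iwriter.close();
} catch (IOException e) {
    // handle or rethrow as appropriate
    e.printStackTrace();
}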

Searching for an exact phrase (given in the variable keyword):

DirectoryReader ireader=DirectoryReader.open(mDir);
IndexSearcher isearcher=new IndexSearcher(ireader);
QueryParser parser = new QueryParser(Version.LUCENE_46,myfieldname,analyzer);
parser.setDefaultOperator(QueryParser.Operator.AND);
parser.setPhraseSlop(0);
Query query=parser.createPhraseQuery(myfieldname,keyword);
ScoreDoc[] hits=isearcher.search(query, null, 1000).scoreDocs;
nret=hits.length;
ireader.close();

Note the use of setPhraseSlop(0) and createPhraseQuery().
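
The question passes a slop of 0 to SpanNearQuery, and this answer calls setPhraseSlop(0). As a rough illustration of what a slop budget means for in-order terms, here is a plain-Java sketch (no Lucene; the class and method names are hypothetical) that counts the positions lying between the matched terms and accepts a match when that count is within the slop. It is a simplification for intuition, not Lucene's implementation.

```java
import java.util.Arrays;
import java.util.List;

public class SlopSketch {
    // True if the terms appear in order in doc with at most `slop`
    // non-matching positions between the first and last matched term.
    static boolean orderedNear(List<String> doc, List<String> terms, int slop) {
        if (terms.isEmpty()) {
            return false;
        }
        for (int start = 0; start < doc.size(); start++) {
            if (!doc.get(start).equals(terms.get(0))) {
                continue;
            }
            int prev = start;
            boolean found = true;
            for (int i = 1; i < terms.size() && found; i++) {
                // pick the earliest later occurrence of the next term
                int next = -1;
                for (int p = prev + 1; p < doc.size(); p++) {
                    if (doc.get(p).equals(terms.get(i))) {
                        next = p;
                        break;
                    }
                }
                if (next < 0) {
                    found = false;
                } else {
                    prev = next;
                }
            }
            // gap count = width of the matched span minus the number of terms
            if (found && (prev - start + 1) - terms.size() <= slop) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> phrase = Arrays.asList("foo", "bar");
        System.out.println(orderedNear(Arrays.asList("foo", "bar", "baz"), phrase, 0));
        System.out.println(orderedNear(Arrays.asList("foo", "x", "bar"), phrase, 0));
        System.out.println(orderedNear(Arrays.asList("foo", "x", "bar"), phrase, 1));
    }
}
```

With a slop of 0 only adjacent, in-order terms are accepted, which is the exact-phrase behaviour the thread is after.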
