
How to search content with an exact phrase using the Lucene API?

Enter phrase for search: Adil Shahi dynasty

  1. Adil Shahi dynasty
  2. Qutb Shahi dynasty
  3. Gohar Shahi templates

When I enter Adil Shahi dynasty, it returns many results. I'm using the Lucene API and want to match only content that contains the exact phrase. Here is my code:

public static void main(String[] args) throws Exception {
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
    PhraseQuery query = new PhraseQuery();
    Directory index = FSDirectory.open(new File("/ttlfiles/indexes/category_labels_en"));
    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
    String querystr = br.readLine();
    while (!querystr.equals("q")) {
        Query q = new QueryParser(Version.LUCENE_47, "spa", analyzer).parse(querystr);

        // 3. search
        int hitsPerPage = 10;
        IndexReader reader = DirectoryReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);
        TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // 4. display results
        System.out.println("Found " + hits.length + " hits.");
        for (int i = 0; i < hits.length; ++i) {
            int docId = hits[i].doc;
            Document d = searcher.doc(docId);
            System.out.println((i + 1) + ". " + d.get("spa"));
        } // end of for loop
        querystr = br.readLine();
    } // end of while loop
}
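
One possible minimal change to the code above, if you want to keep calling parse(): in Lucene's classic query syntax, text wrapped in double quotes is parsed as a phrase rather than as separate optional terms. The sketch below is plain Java and only prepares the string; quoteAsPhrase and the class name PhraseSyntax are hypothetical names, not part of Lucene, and the escaping covers only backslashes and embedded double quotes.

```java
// Hypothetical helper (not part of Lucene): turns raw user input into
// classic query syntax for a phrase.
class PhraseSyntax {

    // Escapes backslashes and double quotes, then wraps the text in double quotes.
    static String quoteAsPhrase(String input) {
        String escaped = input.trim()
                .replace("\\", "\\\\")   // escape backslashes first
                .replace("\"", "\\\"");  // then escape embedded double quotes
        return "\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // Prints the phrase with its surrounding double quotes.
        System.out.println(quoteAsPhrase("Adil Shahi dynasty"));
    }
}
```

Assuming the rest of the loop stays as it is, the parser call would then become new QueryParser(Version.LUCENE_47, "spa", analyzer).parse(quoteAsPhrase(querystr)).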

@Gimby: The user may have picked the wrong code for searching content via Lucene. You have to create the Lucene indexes first; only then can you search the content.

Here is code you can refer to for searching the content:

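import java.io.BufferedReader;
import java.io.File;
import java.io.InputStreamReader;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public static void main(String[] args) throws Exception {
    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
    Directory index = FSDirectory.open(new File("/media/New Volume/ttlindexes"));
    // Open the reader once and reuse it for every query.
    IndexReader reader = DirectoryReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);
    QueryParser parser = new QueryParser(Version.LUCENE_47, "spo", analyzer);
    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

    String querystr = br.readLine();
    // Stop on "q" or on end of input (readLine() returns null).
    while (querystr != null && !querystr.equals("q")) {
        // Build an exact-phrase query (slop 0) from the analyzed input.
        Query query = parser.createPhraseQuery("spo", querystr);
        if (query == null) { // the analyzer produced no terms, e.g. empty input
            System.out.println("Found 0 hits.");
        } else {
            // 3. search
            int hitsPerPage = 1000000;
            TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
            searcher.search(query, collector);
            ScoreDoc[] hits = collector.topDocs().scoreDocs;

            // 4. display results
            System.out.println("Found " + hits.length + " hits.");
            for (int i = 0; i < hits.length; ++i) {
                Document d = searcher.doc(hits[i].doc);
                System.out.println((i + 1) + ". " + d.get("spo"));
            }
        }
        querystr = br.readLine();
    }
    reader.close();
}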
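
To see why a phrase query narrows the results, here is a rough plain-Java sketch (not Lucene code) run against the three labels from the question. It assumes the analyzer simply lowercases and splits on whitespace, which is a simplification of StandardAnalyzer, and the class and method names are made up for illustration. The classic QueryParser treats unquoted terms as optional (OR) by default, so any label sharing one term with the query matches; a phrase query with slop 0 needs the terms next to each other, in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; names here are hypothetical.
class PhraseMatchSketch {

    // Rough stand-in for analysis: lowercase and split on whitespace.
    static List<String> tokens(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // OR semantics: the label matches if it shares at least one term with the query.
    static boolean matchesAnyTerm(String label, String query) {
        List<String> labelTokens = tokens(label);
        for (String term : tokens(query)) {
            if (labelTokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // Phrase semantics with slop 0: query terms must appear consecutively, in order.
    static boolean matchesPhrase(String label, String query) {
        List<String> queryTokens = tokens(query);
        if (queryTokens.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(tokens(label), queryTokens) >= 0;
    }

    public static void main(String[] args) {
        String[] labels = {"Adil Shahi dynasty", "Qutb Shahi dynasty", "Gohar Shahi templates"};
        String query = "Adil Shahi dynasty";
        for (String label : labels) {
            System.out.println(label
                    + " | any term: " + matchesAnyTerm(label, query)
                    + " | phrase: " + matchesPhrase(label, query));
        }
    }
}
```

All three labels match under "any term" because each contains "Shahi", while only the first label matches as a phrase.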

@Aadil: Thanks for the guidance. I used this, with a few changes, for indexing the TTL files of DBpedia. You can download the Turtle files from this link: http://wiki.dbpedia.org/Downloads39 and can get .
