简体   繁体   中英

how to search content with exact phrase using lucene API?

Enter phrase for search: Adil Shahi dynasty

  1. Adil Shahi dynasty
  2. Qutb Shahi dynasty
  3. Gohar Shahi templates

when I enter Adil Shahi dynasty it returns me many contents, I'm using lucene API and want to match the content with exact phrase code:for creating indexes

public static void main(String[] args) throws Exception{
     StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
     PhraseQuery query = new PhraseQuery();
    Directory index = FSDirectory.open(new File("/ttlfiles/indexes/category_labels_en"));
    BufferedReader br = new BufferedReader(
            new InputStreamReader(System.in));
    String querystr = br.readLine();
    while(!querystr.equals("q")){
    Query q = new QueryParser(Version.LUCENE_47, "spa", analyzer).parse(querystr);

    // 3. search
    int hitsPerPage = 10;
    IndexReader reader = DirectoryReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);
    TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
    searcher.search(q, collector);
    ScoreDoc[] hits = collector.topDocs().scoreDocs;

    // 4. display results
    System.out.println("Found " + hits.length + " hits.");
    for(int i=0;i<hits.length;++i) {
      int docId = hits[i].doc;
      Document d = searcher.doc(docId);
      System.out.println((i + 1) + ". " + d.get("spa"));
    }//end of for loop
    querystr = br.readLine();
    }//while's end
}

@Gimby: Might be the user has selected the wrong code to search the content via Lucene. You have to create the Lucene indexes first and then you will be able to search for the content.

Here is the code you can refer to for searching the content:

public static void main(String[] args) throws Exception{
     StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
     //PhraseQuery query = new PhraseQuery();
    Directory index = FSDirectory.open(new File("/media/New Volume/ttlindexes"));
    BufferedReader br = new BufferedReader(
            new InputStreamReader(System.in));
    String querystr = br.readLine();
    while(!querystr.equals("q")){
        QueryParser parser = new QueryParser(Version.LUCENE_47,"spo",analyzer);
        parser.setDefaultOperator(QueryParser.Operator.OR);
        //parser.setPhraseSlop(0);
        Query query=parser.createPhraseQuery("spo",querystr);
    //Query q = new QueryParser(Version.LUCENE_47, "spa", analyzer).parse(querystr);

    // 3. search
    int hitsPerPage = 1000000;
    IndexReader reader = DirectoryReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);
    TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
    searcher.search(query, collector);
    ScoreDoc[] hits = collector.topDocs().scoreDocs;

    // 4. display results
    System.out.println("Found " + hits.length + " hits.");
    for(int i=0;i<hits.length;++i) {
      int docId = hits[i].doc;
      Document d = searcher.doc(docId);
      System.out.println((i + 1) + ". " + d.get("spo"));
    }//end of for loop
    querystr = br.readLine();
    }//while's end
}

@Aadil : Thanks for guidance, I have used this after a bit changes for indexing the ttl files of dbpedia. You can download turtle files from this link http://wiki.dbpedia.org/Downloads39 and can get .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM