TermQuery does not give the same expected results as QueryParser - Lucene 7.4.0

Question

I am indexing 10 text documents using StandardAnalyzer.

public static void indexDoc(final IndexWriter writer, Path filePath, long timstamp)
    {
        try (InputStream iStream = Files.newInputStream(filePath))
        {
            Document doc = new Document();

            Field pathField = new StringField("path",filePath.toString(),Field.Store.YES);
            Field flagField = new TextField("ashish","i am stored",Field.Store.YES);
            LongPoint last_modi = new LongPoint("last_modified",timstamp);
            Field content = new TextField("content",new BufferedReader(new InputStreamReader(iStream,StandardCharsets.UTF_8)));

            doc.add(pathField);
            doc.add(last_modi);
            doc.add(content);
            doc.add(flagField);

            if(writer.getConfig().getOpenMode()==OpenMode.CREATE)
            {
                System.out.println("Adding "+filePath.toString());
                writer.addDocument(doc);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }



    }

Above is the code snippet used to index a document. For testing purposes, I am searching a field called 'ashish'.

When I use QueryParser, Lucene returns the search results as expected.

public static void main(String[] args) throws Exception
    {
        String index = "E:\\Lucene\\Index";
        String field = "ashish";
        int hitsPerPage = 10;

        IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(index)));
        IndexSearcher searcher = new IndexSearcher(reader);
        Analyzer analyzer = new StandardAnalyzer();

        QueryParser parser = new QueryParser(field, analyzer);

        String line = "i am stored";

        Query query = parser.parse(line);
      //  Query q = new TermQuery(new Term("ashish","i am stored"));
        System.out.println("Searching for: " + query.toString());



        TopDocs results = searcher.search(query, 5 * hitsPerPage);
        ScoreDoc[] hits = results.scoreDocs;

        int numTotalHits = Math.toIntExact(results.totalHits);
        System.out.println(numTotalHits + " total matching documents");

        for(int i=0;i<hits.length;i++) // hits holds at most 5 * hitsPerPage entries, which can be fewer than totalHits
        {
             Document doc = searcher.doc(hits[i].doc);
             String path = doc.get("path");
             String content = doc.get("ashish");
             System.out.println(path+"\n"+content);

        }



    }

The code above uses QueryParser to search the desired field, and it works properly. It hits all 10 documents, because I store this field for all 10 documents. All good here.

However, when I use the TermQuery API, I don't get the desired result. Here is the code change I made for TermQuery.

public static void main(String[] args) throws Exception
    {
        String index = "E:\\Lucene\\Index";
        String field = "ashish";
        int hitsPerPage = 10;

        IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(index)));
        IndexSearcher searcher = new IndexSearcher(reader);
        Analyzer analyzer = new StandardAnalyzer();

      //  QueryParser parser = new QueryParser(field, analyzer);

        String line = "i am stored";

       // Query query = parser.parse(line);
       Query q = new TermQuery(new Term("ashish","i am stored"));
        System.out.println("Searching for: " + q.toString());

        TopDocs results = searcher.search(q, 5 * hitsPerPage);
        ScoreDoc[] hits = results.scoreDocs;

        int numTotalHits = Math.toIntExact(results.totalHits);
        System.out.println(numTotalHits + " total matching documents");

        for(int i=0;i<hits.length;i++) // hits holds at most 5 * hitsPerPage entries, which can be fewer than totalHits
        {
             Document doc = searcher.doc(hits[i].doc);
             String path = doc.get("path");
             String content = doc.get("ashish");
             System.out.println(path+"\n"+content);
             System.out.println("----------------------------------------------------------------------------------");
        }



    }

I am also attaching a screenshot of the TermQuery API execution.

I did some research on Stack Overflow itself, for example "Lucene TermQuery and QueryParser", but did not find any practical solution; the Lucene version in those examples was also very old.

I would appreciate any help.

Thanks in advance!

Answer 1

I found the answer to my question in a linked post that explains how TermQuery works.

TermQuery searches for the entire string exactly as given, without analyzing it. This gives unexpected results because data is usually tokenized while indexing.

In the posted code, I was passing the entire search string to TermQuery, like this:
Query q = new TermQuery(new Term("ashish","i am stored"));
In this case, Lucene looks for the single term "i am stored" exactly as it is, which it will never find because the string was tokenized during indexing.
Instead, I tried searching for a single token:
Query q = new TermQuery(new Term("ashish","stored"));
This query gave me the expected results.
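
To make the mismatch concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class `ToyTermLookup` and its methods are hypothetical, and `analyze` is only a crude stand-in for an analyzer (lowercase, split on non-alphanumeric characters). It only illustrates why an exact lookup of the whole string misses while a lookup of one token hits.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only, NOT Lucene: an analyzed field is kept as separate terms.
class ToyTermLookup {
    // term -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Crude stand-in for an analyzer: lowercase, split on non-alphanumerics.
    public static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Index time: the text is analyzed, so only single tokens become terms.
    public void addDocument(int docId, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Term-query style lookup: the argument is used as-is, with no analysis.
    public Set<Integer> termLookup(String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyTermLookup index = new ToyTermLookup();
        for (int id = 0; id < 10; id++) {
            index.addDocument(id, "i am stored");
        }
        // No term equals the whole string, so nothing matches.
        System.out.println(index.termLookup("i am stored").size()); // prints 0
        // "stored" is one of the indexed terms, so all 10 documents match.
        System.out.println(index.termLookup("stored").size());      // prints 10
    }
}
```

The index in this sketch contains the terms "i", "am" and "stored", but never the three-word string, which mirrors the behavior described above.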

Thanks, Ashish

Answer 2

The real problem is that your query string is not being analyzed here. Use the same analyzer that was used while indexing the documents, and try the code below to analyze the query string before searching.

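// 'index' is the index directory path, as in the question's code.
IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(index)));
IndexSearcher searcher = new IndexSearcher(reader);

// Use the same analyzer that was used at index time.
Analyzer analyzer = new StandardAnalyzer();
QueryParser parser = new QueryParser("ashish", analyzer);
Query query = new TermQuery(new Term("ashish", "i am stored"));
// parse() runs the text through the analyzer, so the string is split into terms.
query = parser.parse(query.toString());
ScoreDoc[] hits = searcher.search(query, 5).scoreDocs;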
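
The idea of analyzing the query with the same analyzer can be sketched without Lucene as follows. This is a toy model under stated assumptions: `ToyAnalyzedSearch`, `analyze`, `index` and `search` are hypothetical names, the analyzer is a crude lowercase-and-split function, and the per-term matches are combined with OR, which is QueryParser's default operator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only, NOT Lucene: the same analysis is applied on both sides.
class ToyAnalyzedSearch {
    // Crude stand-in for an analyzer: lowercase, split on non-alphanumerics.
    public static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Index time: map each analyzed term to the positions of the documents containing it.
    public static Map<String, Set<Integer>> index(List<String> docs) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String term : analyze(docs.get(id))) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    // Query time: analyze the query the same way, then OR the per-term matches.
    public static Set<Integer> search(Map<String, Set<Integer>> postings, String query) {
        Set<Integer> hits = new TreeSet<>();
        for (String term : analyze(query)) {
            hits.addAll(postings.getOrDefault(term, Collections.emptySet()));
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> postings =
                index(Arrays.asList("i am stored", "something else"));
        // The query is split into "i", "am", "stored"; document 0 contains them.
        System.out.println(search(postings, "I am stored")); // prints [0]
    }
}
```

Because both sides go through the same `analyze` function, the query terms line up with the indexed terms, which is the point of reusing the index-time analyzer.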
