Lucene: how to index file names

Question

I'm new to Lucene and trying to learn the basics.

I have three files:

apache_empty.txt (empty file),
apache.txt (contains many 'apache' tokens),
other.txt (contains just one token, 'apache')

When I search for 'apache', the results contain only apache.txt and other.txt, but I also want to get apache_empty.txt, which has the searched word in its name.

This is how I add documents to the index:

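protected Document getDocument(File f) throws Exception 
{
  Document doc   = new Document();
  // File body: tokenized and indexed from the reader, not stored.
  Field contents = new Field("contents", new FileReader(f));
  // Directory and full path: stored and indexed as single, unanalyzed terms.
  Field parent   = new Field("parent",   f.getParent(), Field.Store.YES, Field.Index.NOT_ANALYZED);
  // File name: stored and run through the analyzer so it can be searched by token.
  Field filename = new Field("filename", f.getName(), Field.Store.YES, Field.Index.ANALYZED);
  Field fullpath = new Field("fullpath", f.getCanonicalPath(), Field.Store.YES, Field.Index.NOT_ANALYZED);
  // Rank matches on the file name above matches on other fields.
  filename.setBoost(2.0F);
  doc.add(contents);
  doc.add(parent);
  doc.add(filename);
  doc.add(fullpath);
  return doc;
}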

How can I get Lucene to index file names as well?

Answer 1

Use a wildcard: search for apache*, which also matches the file name apache_empty. For the complete syntax, see the Apache Lucene Query Parser documentation.

An alternative is to have the analyzer you use treat the underscore as a word separator.
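
As a rough illustration of the second suggestion, the sketch below gets a similar effect without writing a custom analyzer. It rewrites the file name so that underscores, dots and other punctuation become spaces before the name is handed to the analyzed filename field. The class FileNameText and its method searchable are hypothetical helpers written for this example. They are not part of Lucene and not something the answer itself prescribes.

```java
// Hypothetical helper (not a Lucene class): turns a file name into
// space-separated words so that any analyzer splits it into tokens.
final class FileNameText {

    // Replace every run of characters that are neither letters nor digits
    // (underscore, dot, dash, ...) with a single space, then trim the ends.
    static String searchable(String fileName) {
        return fileName.replaceAll("[^\\p{L}\\p{N}]+", " ").trim();
    }

    public static void main(String[] args) {
        // prints "apache empty txt"
        System.out.println(searchable("apache_empty.txt"));
    }
}
```

In getDocument, the filename field could then be built from FileNameText.searchable(f.getName()) instead of f.getName(). With either approach, the query probably still has to name the field, for example filename:apache, because QueryParser searches only its default field when the query does not specify one.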

Answer 1: score 6, accepted, 2012-09-26 11:16:08
