Extracting all fields from a Lucene 8 index
Given an index created with Lucene 8, but without knowledge of the fields used, how can I programmatically extract all the fields?

(I'm aware that the Luke browser can be used interactively, thanks to @andrewjames: Examples for using latest version of Lucene.) The scenario is that, during a development phase, I have to read indexes without prescribed schemas.

I'm using:
IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(index)));
IndexSearcher searcher = new IndexSearcher(reader);
The reader has methods such as:
reader.getDocCount(field);
but this requires knowing the fields in advance.

I understand that documents in the index may be indexed with different fields; I'm quite prepared to iterate over all documents and extract the fields on a regular basis (these indexes are not huge).

I'm using Lucene 8.5.*, so posts and tutorials based on earlier Lucene versions may not work.
You can access basic field info for the stored fields of each document as follows:
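import java.io.IOException;
import java.nio.file.Paths;
import java.util.List;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexableField;
import org.apache.lucene.index.MultiBits;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Bits;

public class IndexDataExplorer {

    private static final String INDEX_PATH = "/path/to/index/directory";

    public static void doSearch() throws IOException {
        // try-with-resources closes the reader when the loop is done
        try (IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(INDEX_PATH)))) {
            // liveDocs is null when the index contains no deletions
            Bits liveDocs = MultiBits.getLiveDocs(reader);
            // doc IDs run from 0 to maxDoc() - 1; numDocs() excludes deleted
            // documents, so using it as the bound can miss live documents
            for (int i = 0; i < reader.maxDoc(); i++) {
                if (liveDocs != null && !liveDocs.get(i)) {
                    continue; // skip deleted documents
                }
                // the returned Document contains stored fields only
                Document doc = reader.document(i);
                List<IndexableField> fields = doc.getFields();
                for (IndexableField field : fields) {
                    // use these to get field-related data:
                    //field.name();
                    //field.fieldType().toString();
                }
            }
        }
    }
}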
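
If the goal is only to list the field names, including fields that are indexed but not stored, a possible alternative is to read the index's merged field metadata instead of visiting every document. The sketch below assumes Lucene 8.x, where `FieldInfos.getMergedFieldInfos(IndexReader)` is available. The class name `IndexFieldLister` and the helper `listFields` are made up for illustration, and the index path is a placeholder.

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.FieldInfo;
import org.apache.lucene.index.FieldInfos;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical helper class, not part of Lucene.
public class IndexFieldLister {

    /** Maps each field name to a short description, merged across all segments. */
    static Map<String, String> listFields(IndexReader reader) {
        Map<String, String> result = new TreeMap<>(); // sorted by field name
        FieldInfos infos = FieldInfos.getMergedFieldInfos(reader);
        for (FieldInfo info : infos) {
            result.put(info.name,
                    "indexOptions=" + info.getIndexOptions()
                    + ", docValues=" + info.getDocValuesType());
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder path; pass the real index directory as the first argument.
        String indexPath = args.length > 0 ? args[0] : "/path/to/index/directory";
        try (Directory dir = FSDirectory.open(Paths.get(indexPath));
             IndexReader reader = DirectoryReader.open(dir)) {
            listFields(reader).forEach(
                    (name, description) -> System.out.println(name + ": " + description));
        }
    }
}
```

Two caveats apply to this approach. `FieldInfo` does not record whether a field is stored, so the document loop is still needed to see stored values. The metadata may also keep listing a field after every document that used it has been deleted, until the affected segments are merged away.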