Problems improving Lucene indexing performance by reusing Document and Field instances
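I am trying to reuse Document and Field instances to improve indexing performance. (Without reuse, indexing 1 million lines from a file took 20 seconds.)
However, once I reuse the instances, indexing takes far too long and never seems to finish.
Has anyone run into the same problem?
This is the existing code from before I tried to reuse instances. It creates a new Document and new Fields for every line in the file.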
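// Needs java.io.*, java.nio.charset.StandardCharsets and org.apache.lucene.document.*;
// `file` (java.io.File) and `writer` (IndexWriter) are assumed to be in scope.
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
    String line;
    while ((line = br.readLine()) != null) {
        String[] lineTokens = line.split("\\|");
        // Assumption: the first two pipe-separated tokens hold the field values.
        String field1Value = lineTokens[0];
        String field2Value = lineTokens[1];
        // A new Document and new Fields are created for every line.
        Document doc = new Document();
        Field field1 = new TextField("field1", field1Value, Field.Store.YES);
        doc.add(field1);
        Field field2 = new StringField("field2", field2Value, Field.Store.YES);
        doc.add(field2);
        writer.addDocument(doc);
    }
} catch (IOException e) {
    // Covers FileNotFoundException too; rethrow instead of swallowing it.
    throw new UncheckedIOException(e);
}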
After the change:
// Needs java.io.*, java.nio.charset.StandardCharsets and org.apache.lucene.document.*;
// `file` (java.io.File) and `writer` (IndexWriter) are assumed to be in scope.
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
    String line;
    // Created once, outside the loop, so that the instances can be reused.
    // The empty strings are placeholder initial values.
    Document doc = new Document();
    Field field1 = new TextField("field1", "", Field.Store.YES);
    Field field2 = new StringField("field2", "", Field.Store.YES);
    while ((line = br.readLine()) != null) {
        //String[] lineTokens = line.split("\\|");
        field1.setStringValue("field1Value");
        doc.add(field1);
        field2.setStringValue("field2Value");
        doc.add(field2);
        writer.addDocument(doc);
    }
} catch (IOException e) {
    throw new UncheckedIOException(e);
}
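You don't need to add the fields to the document on every iteration. In the changed code, each doc.add call appends another entry to the same document, so the document grows by two fields per line and every addDocument call has more to index than the one before. Add the fields once, then inside the loop only change the field values and write the updated document to the index, like this: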
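// Create the document and its fields once, and add each field exactly once.
// The empty strings are placeholder initial values.
Document doc = new Document();
Field field1 = new TextField("field1", "", Field.Store.YES);
doc.add(field1);
Field field2 = new StringField("field2", "", Field.Store.YES);
doc.add(field2);
while ((line = br.readLine()) != null) {
    // Per line: only update the values (the literals stand in for the
    // values parsed from `line`), then index the same document again.
    field1.setStringValue("field1Value");
    field2.setStringValue("field2Value");
    writer.addDocument(doc);
}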
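To see why the version that calls doc.add inside the loop appears to run forever, here is a rough plain-Java sketch of the bookkeeping. It is not Lucene code: SimpleDoc and SimpleField are hypothetical stand-ins that only mimic the fact that adding a field appends it to the document's field list, and the "writer" is modelled as simply counting how many fields it is handed per call. Under that assumption, N lines hand over 2 + 4 + ... + 2N = N(N+1) fields in total, which for 1 million lines is about 10^12, versus 2N when the fields are added once.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Plain-Java model of the two indexing loops. NOT Lucene code:
 * SimpleField and SimpleDoc are hypothetical stand-ins.
 */
public class FieldReuseDemo {

    /** Hypothetical stand-in for a reusable field with a mutable value. */
    static final class SimpleField {
        final String name;
        String value;
        SimpleField(String name, String value) { this.name = name; this.value = value; }
        void setStringValue(String v) { this.value = v; }
    }

    /** Hypothetical stand-in for a document: an append-only list of fields. */
    static final class SimpleDoc {
        final List<SimpleField> fields = new ArrayList<>();
        void add(SimpleField f) { fields.add(f); }
    }

    /** Models the slow version: add() is called on every iteration. */
    static long fieldsWrittenAddInsideLoop(int lines) {
        SimpleDoc doc = new SimpleDoc();
        SimpleField f1 = new SimpleField("field1", "");
        SimpleField f2 = new SimpleField("field2", "");
        long written = 0;
        for (int i = 0; i < lines; i++) {
            f1.setStringValue("a" + i);
            doc.add(f1);                  // the document keeps growing
            f2.setStringValue("b" + i);
            doc.add(f2);
            written += doc.fields.size(); // stand-in for writer.addDocument(doc)
        }
        return written;
    }

    /** Models the fixed version: fields are added once, only values change. */
    static long fieldsWrittenAddOnce(int lines) {
        SimpleDoc doc = new SimpleDoc();
        SimpleField f1 = new SimpleField("field1", "");
        SimpleField f2 = new SimpleField("field2", "");
        doc.add(f1);
        doc.add(f2);
        long written = 0;
        for (int i = 0; i < lines; i++) {
            f1.setStringValue("a" + i);
            f2.setStringValue("b" + i);
            written += doc.fields.size(); // always 2
        }
        return written;
    }

    public static void main(String[] args) {
        int lines = 1000;
        System.out.println("add inside loop: " + fieldsWrittenAddInsideLoop(lines));
        System.out.println("add once:        " + fieldsWrittenAddOnce(lines));
    }
}
```

With 1000 lines the first method counts 1,001,000 fields and the second 2,000. The real cost per field in Lucene depends on analysis and storage, so treat the numbers as an illustration of the growth rate rather than a timing estimate.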