简体   繁体   中英

Document and Field instance reuse in Lucene Indexing

I am trying to reuse the Document and Field instances to improve the performance (I have tried this for 1 million rows in the file without reusing the instances it was taking 20 seconds) .

But when I try to do that it takes too much time and it keeps running.

Can anyone faced the same problem before?

This is the existing code before trying to reuse the instances, for each line in the file I was creating new document and fields.

FileInputStream fis;
                try {

                    fis = new FileInputStream(file);
                    String filePath= file.getPath();
                    BufferedReader br = new BufferedReader(
                            new InputStreamReader(fis, StandardCharsets.UTF_8));

                    String line = null;
                    while ((line = br.readLine()) != null) {

                        String[] lineTokens = line.split("\\|");
                        Document doc = new Document();
                        Field field1 = new TextField("field1", field1Value, Field.Store.YES);
                        doc.add(field1);
                        Field field2 = new StringField("field2", field2Value,Field.Store.YES);
                        doc.add(field2);
                        writer.addDocument(doc);
                    }
                    br.close();
                } catch (FileNotFoundException fnfe) {

                } 

After changing

FileInputStream fis;
                try {

                    fis = new FileInputStream(file);
                    String filePath= file.getPath();
                    BufferedReader br = new BufferedReader(
                            new InputStreamReader(fis, StandardCharsets.UTF_8));

                    String line = null;
                    Document doc = new Document();
                    Field field1 = new TextField("field1", field1Value, Field.Store.YES);
                    Field field2 = new StringField("field2", field2Value,Field.Store.YES);
                    while ((line = br.readLine()) != null) {

                        //String[] lineTokens = line.split("\\|");

                        field1.setStringValue("field1Value");
                        doc.add(field1);

                        field2.setStringValue("field2Value");
                        doc.add(field2);
                        writer.addDocument(doc);
                    }
                    br.close();
                } catch (FileNotFoundException fnfe) {

                } 

You don't need to add the fields to the doc on every iteration. After you have added the fields once, all you need to do is change the field values, and then write the altered document to the index, like this:

Document doc = new Document();
Field field1 = new TextField("field1", field1Value, Field.Store.YES);
doc.add(field1);
Field field2 = new StringField("field2", field2Value,Field.Store.YES);
doc.add(field2);
while ((line = br.readLine()) != null) {
    field1.setStringValue("field1Value");
    field2.setStringValue("field2Value");

    writer.addDocument(doc);
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM