
Unable to import large data into Solr using DIH

I am trying to import a large data set from MySQL using DIH. The following is the dataSource configuration with batchSize="-1" for MySQL:

<dataSource batchSize="-1" driver="com.mysql.jdbc.Driver" .....  />

It fetches all 10 million records, but at the end the full import fails. I get the following exception in the log:

2017-03-14 07:27:04.429 ERROR (Thread-14) [   x:companyData] o.a.s.h.d.DataImporter Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.sql.SQLException: Operation not allowed after ResultSet closed
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:270)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:475)
    at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:458)
    at org.apache.solr.handler.dataimport.DataImporter$$Lambda$85/252359661.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.sql.SQLException: Operation not allowed after ResultSet closed
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:416)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
    ... 5 more

Any help regarding this would be appreciated.

The error you're facing does not concern Solr but the way you're accessing your database.

Look at your exception: java.sql.SQLException: Operation not allowed after ResultSet closed. With batchSize="-1" the MySQL driver streams the results row by row, so the ResultSet can end up closed underneath the import if the connection is disturbed or times out before all 10 million rows have been consumed.

I suggest changing the batchSize parameter to a different value, for example 1000.
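
For example, a minimal sketch of the dataSource element (the url, user and password attributes here are placeholders, keep your own values and any other attributes you already have):

<dataSource type="JdbcDataSource" batchSize="1000" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/mydb" user="user" password="pass" />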

The batchSize option controls how many rows are fetched from the database table at a time, in order to reduce memory usage (it is often used to prevent running out of memory when running the Data Import Handler). A lower batch size may be slower; the option is not meant to speed up the import process.
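
Under the hood, DIH's JdbcDataSource passes batchSize straight to the JDBC statement's fetch size, and batchSize="-1" becomes Integer.MIN_VALUE, which puts MySQL Connector/J into row-by-row streaming mode. The following is a rough plain-JDBC sketch of the two modes, with hypothetical connection details and table names used only for illustration:

import java.sql.*;

public class FetchSizeSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details for illustration only.
        String url = "jdbc:mysql://localhost:3306/mydb";
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             Statement stmt = conn.createStatement(
                     ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {

            // batchSize="-1"   -> stmt.setFetchSize(Integer.MIN_VALUE): Connector/J streams
            //                     rows one at a time; the ResultSet is lost if the connection
            //                     is disturbed or times out mid-read.
            // batchSize="1000" -> stmt.setFetchSize(1000): a bounded fetch size
            //                     (Connector/J may additionally need useCursorFetch=true on
            //                     the JDBC URL to honor it as true cursor-based fetching).
            stmt.setFetchSize(1000);

            try (ResultSet rs = stmt.executeQuery("SELECT id, name FROM my_table")) {
                while (rs.next()) {
                    // hand each row to the indexing code
                }
            }
        }
    }
}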
