
Solr - indexing large database

I'd like to index a database (MySQL) in Solr. The database has one table, but it has 50 columns and almost 4 million rows. It's around 1.5 GB.

I configured solrconfig.xml and solr-data-config.xml, and in schema.xml I've added:

<dynamicField name="*"  type="text_general"   multiValued="false" indexed="true"  stored="true" />

because all the fields are text.
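For context, a typical solr-data-config.xml for a one-table import like this looks roughly as follows (the driver class, JDBC URL, credentials, and table name below are placeholders, not the actual values):

<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/mydb"
              user="user"
              password="password"/>
  <document>
    <!-- one document per row; columns map onto the dynamicField rule above -->
    <entity name="row" query="SELECT * FROM mytable"/>
  </document>
</dataConfig>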

When I try to import the data, it takes a few seconds and nothing happens. I get:

Requests: 1, Fetched: 0, Skipped: 0, Processed: 0

There is an error in the logs:

java.sql.SQLException: Unexpected exception encountered during query.
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1094)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:997)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:983)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:928)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2866)
    at com.mysql.jdbc.ConnectionImpl.rollbackNoChecks(ConnectionImpl.java:5191)
    at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:5074)
    at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4667)
    at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1640)
    at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:484)
    at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:469)
    at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:288)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:277)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
    at com.mysql.jdbc.Buffer.readFieldLength(Buffer.java:289)
    at com.mysql.jdbc.Buffer.fastSkipLenString(Buffer.java:170)
    at com.mysql.jdbc.MysqlIO.unpackField(MysqlIO.java:708)
    at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:428)
    at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:3222)
    at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2393)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2816)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2820)
    ... 11 more

I've tried with a small database and it worked; every field was indexed. I have this problem only with the large database.

I changed ramBufferSizeMB and maxBufferedDocs in solrconfig.xml to 2GB and 4GB, but it doesn't help. I have no idea what's wrong.

Try a different batchSize setting.

From the FAQ:

DataImportHandler is designed to stream rows one-by-one. It passes a fetch size value (default: 500) to Statement#setFetchSize, which some drivers do not honor. For MySQL, add the batchSize property to the dataSource configuration with the value -1. This will pass Integer.MIN_VALUE to the driver as the fetch size and keep it from running out of memory for large tables.
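Concretely, that means adding batchSize="-1" to the dataSource element in your solr-data-config.xml. A minimal sketch (connection details are placeholders for your own values):

<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/mydb"
            user="user"
            password="password"
            batchSize="-1"/>

With batchSize="-1", DataImportHandler passes Integer.MIN_VALUE to setFetchSize, which MySQL Connector/J interprets as a request to stream the result set row by row instead of buffering all 4 million rows in memory at once.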
