
CSV import with DataImportHandler in Solr

I am trying to use Solr with DIH to index CSV files. I've patched my DIH library with patch SOLR-2549, mentioned on the Solr wiki (see http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1 ), so that I can import CSV files with LineEntityProcessor alone, without also using Transformers.

Unfortunately, I could not get the import to work, and I get the following stack trace:

INFO: [csv] webapp=/solr path=/dataimport params={command=full-import&optimize=false&clean=true&commit=true&verbose=true} status=0 QTime=33 {deleteByQuery=*:*} 0 33
7 nov. 2012 14:16:03 org.apache.solr.common.SolrException log
GRAVE: Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:273)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:413)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:326)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:234)
        ... 3 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:542)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:411)
        ... 5 more
Caused by: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.LineEntityProcessor.initDelimitedOrFixedWidth(LineEntityProcessor.java:142)
        at org.apache.solr.handler.dataimport.LineEntityProcessor.init(LineEntityProcessor.java:115)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:74)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:430)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:498)
        ... 6 more

I think it's related to my data configuration. This is my data-config.xml file:

<dataConfig>
    <dataSource name="dfs" type="FileDataSource"/>
    <document>
        <entity name="sourcefile"
                processor="FileListEntityProcessor"
                fileName="rocinter.csv"
                rootEntity="false"
                baseDir="/user/xxx/work/solr/example/example-DIH/solr/csv/inputfolder"
        >

            <entity name="entryline"
                    processor="LineEntityProcessor"
                    url="${sourcefile.fileAbsolutePath}"
                    rootEntity="true"
                    dataSource="fds"
                    separator=","
            >
            </entity>
        </entity>
    </document>
</dataConfig>

Could anybody help me understand this issue, or provide a clear config file that uses the patched LineEntityProcessor to import CSV files?

I finally got an answer from the user mailing list. It was actually a bug in the patch.

A newer version of the patch is attached to the JIRA issue.

See: SOLR-2549
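
For readers who cannot apply the patch, the stock route the question was trying to avoid is LineEntityProcessor combined with RegexTransformer. The following is a minimal sketch of that setup, not a tested configuration. It assumes a three-column file with no quoted or embedded commas, and the column names id, name and price are hypothetical placeholders. It also uses a single data source name, fds, in both the declaration and the entity that reads the file.

```xml
<dataConfig>
    <!-- The same name ("fds") is used here and in the inner entity below -->
    <dataSource name="fds" type="FileDataSource"/>
    <document>
        <!-- Outer entity only lists files; dataSource="null" follows the
             Solr wiki examples for FileListEntityProcessor -->
        <entity name="sourcefile"
                processor="FileListEntityProcessor"
                fileName="rocinter.csv"
                rootEntity="false"
                dataSource="null"
                baseDir="/user/xxx/work/solr/example/example-DIH/solr/csv/inputfolder">

            <!-- LineEntityProcessor emits each line in a column named rawLine;
                 RegexTransformer then splits it into named fields -->
            <entity name="entryline"
                    processor="LineEntityProcessor"
                    url="${sourcefile.fileAbsolutePath}"
                    rootEntity="true"
                    dataSource="fds"
                    transformer="RegexTransformer">
                <!-- Hypothetical columns: id, name, price (one regex group each) -->
                <field column="rawLine"
                       regex="^([^,]*),([^,]*),([^,]*)$"
                       groupNames="id,name,price"/>
            </entity>
        </entity>
    </document>
</dataConfig>
```

The fields named in groupNames would need matching definitions in the schema, and the regex would have to be adapted to the real number of columns in rocinter.csv.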

