
Solr delta import not working with TikaEntityProcessor

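I am trying to schedule a delta import using TikaEntityProcessor. The full import works fine, but the delta import updates nothing and reports no errors. The server log shows only the following, and I cannot figure out what is wrong: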

121151 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Starting delta collection.
121155 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Running ModifiedRowKey() for Entity: message
121156 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Completed ModifiedRowKey for Entity: message rows obtained : 0
121156 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Completed DeletedRowKey for Entity: message rows obtained : 0
121156 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Completed parentDeltaQuery for Entity: message
121156 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Running ModifiedRowKey() for Entity: messages
121157 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.JdbcDataSource  – Creating a connection for entity messages with URL: jdbc:oracle:thin:@//172.16.29.92:1521/d11gr21
121176 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.JdbcDataSource  – Time taken for getConnection(): 19
121182 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Completed ModifiedRowKey for Entity: messages rows obtained : 1
121182 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Completed DeletedRowKey for Entity: messages rows obtained : 0

My dataconfig.xml is as follows:

<document>
  <entity name="messages" pk="BLOB_PK" transformer='DateFormatTransformer'
      query="select * from BLOB_TEST"
      deltaImportQuery="select * from BLOB_TEST where BLOB_PK='${dataimporter.delta.id}'"
      deltaQuery="select BLOB_PK from BLOB_TEST where to_char(last_modified,'YYYY-MM-DD HH24:MI:SS') &gt; '${dataimporter.last_index_time}' "
      dataSource="db">
    <field column="BLOB_PK" name="id" />
    <field column="last_modified" dateTimeFormat="YYYY-MM-DD HH24:MI:SS" locale="en" />
    <entity
        name="message"
        dataSource="dastream"
        processor="TikaEntityProcessor"
        url="message"
        dataField="messages.MESSAGE"
        format="text">
      <field column="text" name="mxMsg" blob="true" />
    </entity>
  </entity>
</document>

When I run the delta import manually from the web client, the status shows the following:

"statusMessages": {
  "Total Requests made to DataSource": "4",
  "Total Rows Fetched": "3",
  "Total Documents Skipped": "0",
  "Delta Dump started": "2013-12-16 14:48:28",
  "Identifying Delta": "2013-12-16 14:48:28",
  "Deltas Obtained": "2013-12-16 14:48:28",
  "Building documents": "2013-12-16 14:48:28",
  "Total Changed Documents": "3",
  "Total Documents Processed": "0",
  "Time taken": "0:0:0.50"
}

I was able to get it working. I had to remove the following attribute from data-config.xml:

deltaImportQuery="select * from BLOB_TEST where BLOB_PK='${dataimporter.delta.id}'"

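I had not configured anything for ${dataimporter.delta.id}, so that is probably why nothing was indexed even though the correct number of changed rows was detected.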
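
A possible alternative to deleting the attribute, as an untested sketch: as far as I understand the Data Import Handler, the ${dataimporter.delta.*} variables are keyed by the column names that deltaQuery returns, not by the Solr field names. The deltaQuery here returns BLOB_PK, so ${dataimporter.delta.id} would resolve to nothing and the deltaImportQuery would match no rows. Assuming the Oracle driver reports the column as BLOB_PK (upper case), the attribute could reference that name instead. Everything else below is copied from the configuration in the question.

```xml
<!-- Sketch only, not verified against this setup.
     Assumption: the delta variable is keyed by the deltaQuery column name
     (BLOB_PK), not by the Solr field name (id). -->
<entity name="messages" pk="BLOB_PK" transformer="DateFormatTransformer"
    query="select * from BLOB_TEST"
    deltaImportQuery="select * from BLOB_TEST where BLOB_PK='${dataimporter.delta.BLOB_PK}'"
    deltaQuery="select BLOB_PK from BLOB_TEST where to_char(last_modified,'YYYY-MM-DD HH24:MI:SS') &gt; '${dataimporter.last_index_time}'"
    dataSource="db">
  <!-- field mappings and the nested TikaEntityProcessor entity stay as in the question -->
</entity>
```

If this behaves as assumed, "Total Documents Processed" in the delta-import status should match "Total Changed Documents" instead of staying at 0.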

