Solr delta-import erases index

I'm having trouble with Solr delta-import from a MySQL database. A full import works without problems. When I run a delta-import, it imports the changed records as expected, but it wipes out the rest of the index, so only the updated records remain. There are no errors in the log. Am I missing something in my configuration? I'm running Solr 5.4 on an Ubuntu server and using the admin UI.

<dataConfig>
    <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/ibnet" user="xxxx" password="xxxxx" />
    <document>
    <entity name="profile" pk="profile.id" query="
        SELECT 
            profile.id AS id,
            profile.profile_status AS profile_status,
            //
            // Other fields
            //
            linkedProfile.org_name AS linked_org_name,
            linkedProfile.org_city AS linked_org_city,
            linkedProfile.org_st_prov_reg AS linked_org_st_prov_reg,
            linkedProfile.org_country AS linked_org_country
        FROM profile AS profile
        LEFT JOIN profile AS linkedProfile ON linkedProfile.id = profile.linked_id" 
        deltaImportQuery="
            SELECT 
                profile.id AS id,
                profile.profile_status AS profile_status,
                //
                // Other fields
                //
                linkedProfile.org_name AS linked_org_name,
                linkedProfile.org_city AS linked_org_city,
                linkedProfile.org_st_prov_reg AS linked_org_st_prov_reg,
                linkedProfile.org_country AS linked_org_country
            FROM profile AS profile
            LEFT JOIN profile AS linkedProfile ON linkedProfile.id = profile.linked_id
            WHERE profile.id = '${dih.delta.id}'"
        deltaQuery="SELECT profile.id FROM profile WHERE last_modified > '${dih.last_index_time}'"
        onError="skip" >
    </entity>
    </document>
</dataConfig>

EDIT: I've changed dih.delta.id to dataimporter.delta.id, and did the same for last_index_time, but the results are the same.

Here is the response:

{
  "responseHeader": {
    "status": 0,
    "QTime": 0
  },
  "initArgs": [
    "defaults",
    [
      "config",
      "data-config.xml"
    ]
  ],
  "command": "status",
  "status": "idle",
  "importResponse": "",
  "statusMessages": {
    "Total Requests made to DataSource": "4",
    "Total Rows Fetched": "6",
    "Total Documents Processed": "3",
    "Total Documents Skipped": "0",
    "Delta Dump started": "2016-05-01 02:38:03",
    "Identifying Delta": "2016-05-01 02:38:03",
    "Deltas Obtained": "2016-05-01 02:38:03",
    "Building documents": "2016-05-01 02:38:03",
    "Total Changed Documents": "3",
    "": "Indexing completed. Added/Updated: 3 documents. Deleted 0 documents.",
    "Committed": "2016-05-01 02:38:03",
    "Time taken": "0:0:0.317"
  }
}

In Solr admin -> your core -> Dataimport, there is a Clean option. If it is checked, Solr clears the index before importing, for both full-import and delta-import.
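If you trigger the import over HTTP instead of through the admin UI, the same setting can be passed explicitly as a request parameter. Here is a minimal sketch that only builds the request URL. The host, port, core name (`mycore`) and the `/dataimport` handler path are assumptions, so adjust them to match your own solrconfig.xml.

```python
from urllib.parse import urlencode


def build_dih_url(base_url, core, command="delta-import", clean=False, commit=True):
    """Build a DataImportHandler request URL with an explicit clean flag.

    base_url and core are placeholders; the /dataimport path assumes the
    handler is registered under that name in solrconfig.xml.
    """
    params = {
        "command": command,
        # clean=false keeps the existing documents in the index
        "clean": str(clean).lower(),
        "commit": str(commit).lower(),
        "wt": "json",
    }
    return f"{base_url}/{core}/dataimport?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local Solr instance and core name
    print(build_dih_url("http://localhost:8983/solr", "mycore"))
```

Opening the resulting URL in a browser or with any HTTP client should start a delta-import that leaves the rest of the index in place.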

Another tip: Solr DIH always uses UTC for the import timestamp, so what is your time zone? Convert the datetime columns in your database to UTC before comparing them to dih.last_index_time.
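To illustrate what that conversion means, here is a small sketch that shifts a local timestamp to UTC in the `yyyy-MM-dd HH:mm:ss` layout shown in the response above. The UTC offset is a made-up example value, and a fixed offset ignores daylight saving time, so treat this as an illustration only. In practice you would more likely do the conversion inside deltaQuery itself, for example with MySQL's CONVERT_TZ function.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout matching the values in the DIH status response
DIH_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_utc_string(local_text, utc_offset_hours):
    """Convert a local timestamp string to a UTC string in the same layout.

    utc_offset_hours is the database server's offset from UTC
    (an assumption here; a fixed offset does not handle DST changes).
    """
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local_dt = datetime.strptime(local_text, DIH_FORMAT).replace(tzinfo=local_zone)
    return local_dt.astimezone(timezone.utc).strftime(DIH_FORMAT)


if __name__ == "__main__":
    # A row modified at 21:38:03 local time on a server running at UTC-5
    print(to_utc_string("2016-04-30 21:38:03", -5))
```

If the database stores local times and the offset is not accounted for, rows modified shortly before an import can fall on the wrong side of the comparison and be missed or re-imported.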
