
Solr DIH delta-import with compound primary keys?

My Solr data source is a SQL database where the primary key is compound (i.e. it consists of two fields).

This is fine for my main DIH query: I just concatenate the fields and that becomes my Solr primary key. However, it's unclear from the documentation how I'd write a delta-import query to support this.

The documentation suggests I need two queries - one to find the primary key of the changed rows, and another to then actually retrieve the individual documents corresponding to each of those keys. There's no example showing this for compound keys though.

Ideally I don't want those two separate queries at all. It would put less load on the database if they were simply combined, so that the only difference between query and deltaQuery is the WHERE clause that filters on last_changed.

So, if my main query is:

SELECT key1 || key2 as pk FROM table

What would the relevant deltaQuery (and/or deltaImportQuery) look like?

I tried just adding the WHERE clause, but after the query ran I got a warning about the missing deltaImportQuery and then a null-pointer exception.

query="SELECT key1 || key2 as id, ...other fields FROM table"

deltaImportQuery="SELECT key1 || key2 as id, ... other fields
                  FROM table
                  where key1 = '${dataimporter.delta.key1}'
                  and key2 = '${dataimporter.delta.key2}'"

deltaQuery="SELECT key1 || key2 as id, key1, key2
            FROM table
            WHERE lastUpdated > '${dataimporter.last_index_time}'"

This assumes key1 and key2 are text columns. The single quotes around ${dataimporter.delta.key2} would not be needed if key2 were numeric, for example.
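
For context, here is a rough sketch of how those three attributes might sit together in one entity in the DIH config. The entity name "item", the pk="id" setting and the column other_field are placeholders of mine, not from the question. The sketch assumes pk should name a column that deltaQuery returns (here the concatenated id).

```xml
<!-- Sketch only. "item", pk="id" and other_field are hypothetical placeholders.
     table, key1, key2 and lastUpdated are taken from the queries above. -->
<entity name="item" pk="id"
        query="SELECT key1 || key2 AS id, other_field FROM table"
        deltaQuery="SELECT key1 || key2 AS id, key1, key2
                    FROM table
                    WHERE lastUpdated &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT key1 || key2 AS id, other_field
                          FROM table
                          WHERE key1 = '${dataimporter.delta.key1}'
                            AND key2 = '${dataimporter.delta.key2}'">
  <!-- Map the concatenated key onto the Solr unique key field (assumed to be "id") -->
  <field column="id" name="id"/>
  <field column="other_field" name="other_field"/>
</entity>
```

The deltaQuery returns key1 and key2 alongside the concatenated id so that each part is available to the deltaImportQuery as ${dataimporter.delta.key1} and ${dataimporter.delta.key2}. Some databases return column names in upper case, in which case the variable names may need to match that casing.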

Set your deltaQuery to "select 1", which will trigger the deltaImportQuery, then write your deltaImportQuery with '${dataimporter.last_index_time}' in the WHERE clause.

So:

deltaQuery="select 1"
deltaImportQuery="select * from a_table where lastUpdated > '${dataimporter.last_index_time}'"

A delta-import uses two queries. The first (deltaQuery) determines what to index: for example, it can select the IDs of the rows that need indexing. The second (deltaImportQuery) fetches the data for those IDs. Here is an example; I hope it helps:

<entity name="address" pk="address_id"
        query="SELECT * FROM address a"
        deltaImportQuery="SELECT * FROM address a where a.address_id = ${dataimporter.delta.id}"
        deltaQuery="select address_id as id from address where address_id=101010">

The important part of deltaImportQuery is ${dataimporter.delta.id}: this is how the id returned by deltaQuery is passed into deltaImportQuery.

