Solr DIH with SQL

I am trying to set up Solr to import documents from my SQL database. I found this example, which should go in the data-config:

<dataConfig>
  <dataSource driver="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/wikipedia" user="wikipedia" password="secret" />
  <document>
    <entity name="page" query="SELECT page_id, page_title from page">
      <field column="page_id" name="id" />
      <field column="page_title" name="name" />
      <entity name="revision" query="select rev_id from revision where rev_page=${page.page_id}">
        <entity name="pagecontent" query="select old_text from pagecontent where old_id=${revision.rev_id}">
          <field column="old_text" name="text" />
        </entity>
      </entity>
   </entity>
  </document>
</dataConfig>

In my case, my schema looks like this:

CREATE TABLE country (
    id integer NOT NULL PRIMARY KEY AUTO_INCREMENT,
    name varchar(255) NOT NULL
)
;

CREATE TABLE location (
    id integer NOT NULL PRIMARY KEY AUTO_INCREMENT,
    name varchar(255) NOT NULL,
    coordinate varchar(255) NOT NULL,
    country_id integer NOT NULL REFERENCES country (id)
)
;

CREATE TABLE item (
    id integer NOT NULL PRIMARY KEY AUTO_INCREMENT,
    title varchar(60) NOT NULL,
    description varchar(900) NOT NULL,
    date datetime NOT NULL,
    source varchar(255) NOT NULL,
    link varchar(255) NOT NULL,
    location_id integer NOT NULL REFERENCES location (id)
)
;

If I want to import the following fields into Solr:

id
title
description
date
source
link
location(name)
location(co-ordinates)

Can someone please help me adapt the example data-config to my data? What confuses me is when to use "entity" and when to use "field column".

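You can do it in two ways. In both, an entity corresponds to one query (a table, view, or join), and each field element maps one column of that query's result to a Solr field.

Create a single SQL query with a join between item and location: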

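<document name="items">
    <!-- One query joins each item to its location, so every row carries all the fields to index. -->
    <!-- location_name is an illustrative Solr field name; use whatever your schema defines. -->
    <entity name="item" query="SELECT i.id, i.title, i.description, i.date, i.source, i.link, l.name AS location_name, l.coordinate FROM item i JOIN location l ON i.location_id = l.id">
        <field column="id" name="id" />
        <field column="title" name="title" />
        <field column="description" name="description" />
        <field column="date" name="date" />
        <field column="source" name="source" />
        <field column="link" name="link" />
        <field column="location_name" name="location_name" />
        <field column="coordinate" name="coordinate" />
    </entity>
</document>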
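
For context, the <document> element sits inside a data-config.xml that also declares the JDBC connection. A minimal sketch, assuming a MySQL database (the AUTO_INCREMENT in the schema suggests MySQL). The database name mydb, the user solr, and the password are placeholders, and the driver class depends on the Connector/J version in use:

<dataConfig>
  <!-- Assumed connection details: replace the URL, user and password with your own.
       Older Connector/J releases use com.mysql.jdbc.Driver as the driver class. -->
  <dataSource type="JdbcDataSource" driver="com.mysql.cj.jdbc.Driver" url="jdbc:mysql://localhost:3306/mydb" user="solr" password="secret" />
  <document name="items">
    <!-- Shortened to two columns; the remaining fields follow the same pattern. -->
    <entity name="item" query="SELECT id, title FROM item">
      <field column="id" name="id" />
      <field column="title" name="title" />
    </entity>
  </document>
</dataConfig>

Every Solr field named in a name attribute must also be defined in the Solr schema, otherwise the import will reject the document.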

Use a sub-entity that looks up the location for each item:

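<document name="items">
    <!-- The parent entity reads items only; location_id is selected so the sub-entity can refer to it. -->
    <entity name="item" query="SELECT id, title, description, date, source, link, location_id FROM item">
        <field column="id" name="id" />
        <field column="title" name="title" />
        <field column="description" name="description" />
        <field column="date" name="date" />
        <field column="source" name="source" />
        <field column="link" name="link" />

        <!-- Runs once per item row, so it issues one extra query per item. -->
        <!-- location_name is an illustrative Solr field name; use whatever your schema defines. -->
        <entity name="location" query="SELECT name AS location_name, coordinate FROM location WHERE id = ${item.location_id}">
            <field column="location_name" name="location_name" />
            <field column="coordinate" name="coordinate" />
        </entity>

    </entity>
</document>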
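
Either variant only takes effect once Solr knows about the config file. A minimal sketch of the registration in solrconfig.xml, assuming the file is called data-config.xml and sits in the core's conf directory, and that the DataImportHandler jar and the JDBC driver jar are already on Solr's classpath:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <!-- Assumed file name, resolved relative to the core's conf directory. -->
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

With the handler registered, a full import can be started by requesting the handler with command=full-import, for example http://localhost:8983/solr/mycore/dataimport?command=full-import, where mycore is a placeholder core name.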
