Solr Schema Design

I have some questions regarding Solr schema design. I'm setting up a search engine for a product catalogue website, and my table relationships are as follows.

  • Product Belongs to Merchant
  • Product Belongs to Brand
  • Product has and belongs to many Categories
  • Category has many Sub Categories
  • Sub Category has many Types
  • Type has many Sub Types

So far my schema.xml looks like this.

<field name="product_id" type="string" indexed="true" stored="true" required="true" /> 
<field name="name" type="string" indexed="true" stored="true"/>
<field name="merchant" type="string" indexed="true" stored="true"/>
<field name="merchant_id" type="string" indexed="true" stored="true"/>
<field name="brand" type="string" indexed="true" stored="true"/>
<field name="brand_id" type="string" indexed="true" stored="true"/>
<field name="categories" type="string" multiValued="true" indexed="true" stored="true"/>
<field name="sub_categories" type="string" multiValued="true" indexed="true" stored="true"/>
<field name="types" type="string" multiValued="true" indexed="true" stored="true"/>
<field name="sub_types" type="string" multiValued="true" indexed="true" stored="true"/>
<field name="price" type="float" indexed="true" stored="true"/>
<field name="description" type="text" indexed="true" stored="true"/>
<field name="image" type="text" indexed="true" stored="true"/>

<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

<uniqueKey>product_id</uniqueKey>

<defaultSearchField>text</defaultSearchField>

<solrQueryParser defaultOperator="OR"/>

<copyField source="name" dest="text"/>
<copyField source="merchant" dest="text"/>
<copyField source="brand" dest="text"/>
<copyField source="categories" dest="text"/>
<copyField source="sub_categories" dest="text"/>
<copyField source="types" dest="text"/>
<copyField source="sub_types" dest="text"/>

So my questions now:

1) Is the schema correct?

2) Let's assume I need to find products for Category XYZ. My senior programmer doesn't like querying Solr by category name; instead he wants to use the category ID. He suggests storing CategoryID_CategoryName (e.g. 1001_Category XYZ) and sending only the ID from the web front end (on the assumption that names containing white space don't work properly).

So to find the products I would have to do a partial match on categories and extract the category ID from the string (i.e. fetch 1001 from 1001_Category XYZ). Or what if I keep the names in the categories field and set up another field for category_ids? That seems a better option to me.

Or is there a Solr multivalued field type that can store CategoryID and CategoryName together?

Let me know your thoughts, thanks.

Answers to your questions.

  1. Maybe - it depends on how you plan to structure your queries, what you intend to search, and what you intend to retrieve in search results. Your schema stores and indexes every field, which can be quite inefficient. Index what you intend to query; store what you intend to retrieve or display. If you are looking for optimizations, review the data types used in the schema and keep each field as close to its source type as you can.
  2. Querying by category ID - your programmer is correct, you want to query by category ID. Your idea of storing IDs and names in separate fields is sound as well. Presuming your ID-based fields are integers or longs in the source data, define them with integer or long field types rather than as strings.
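
As a rough illustration of both points, the fragment below sketches how the affected fields might look. It assumes that a fieldType named int is defined in your schema.xml and that your IDs are numeric in the source tables. The category_ids field name comes from the question; leaving image unindexed is an assumption about how that field is used, not something the question states.

```xml
<!-- Sketch only; adjust to your own queries and display needs. -->

<!-- IDs as numeric fields (assumes an "int" fieldType exists in this schema). -->
<field name="merchant_id" type="int" indexed="true" stored="true"/>
<field name="brand_id" type="int" indexed="true" stored="true"/>

<!-- Category IDs for filtering, kept separate from the display names. -->
<field name="category_ids" type="int" multiValued="true" indexed="true" stored="true"/>
<field name="categories" type="string" multiValued="true" indexed="true" stored="true"/>

<!-- Assumed to be displayed but never searched: stored, not indexed. -->
<field name="image" type="string" indexed="false" stored="true"/>
```

With a field like this, a category lookup could be expressed as a filter query such as fq=category_ids:1001, so neither the front end nor Solr has to parse an ID out of a combined string.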
