I'm new to Solr, and I don't know if this is the best way to do it:
I have some products that are classified into several categories. The categories are organized in a hierarchical structure like
- Electronics
- Computer
- Apple
- iPads
- Macbooks
- Samsung
- Notebooks
- Photo
- Fashion
- Women
- Men
- Shirts
Every product can have multiple categories. For example, a product could be in Electronics > Computer > Apple > Macbooks and Electronics > Computer > Notebooks. Listing products of Electronics should return all underlying products, including all subcategories. Listing products in Electronics > Computer should only return products from that subcategory.
My shop is in Rails and it uses sunspot as a DSL for Solr. In sunspot, I have a field called category_names, which has multiple: true and stored: true. In this field, I store multiple categories, from root to the deepest subcategory, that are stored in Solr like this:
<arr name="category_names_sms">
<str>Electronics</str>
<str>Electronics#Computer</str>
<str>Electronics#Computer#Notebooks</str>
<str>Electronics#Computer#Apple</str>
<str>Electronics#Computer#Apple#Macbooks</str>
</arr>
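The values above are simply every root-anchored prefix of each category chain. As a minimal sketch in Python, assuming the # separator from the example (the function names path_prefixes and category_names are hypothetical and not part of sunspot or Solr), they could be generated like this:

```python
def path_prefixes(chain, sep="#"):
    """Return every root-anchored prefix of a category chain, joined by sep."""
    return [sep.join(chain[:depth]) for depth in range(1, len(chain) + 1)]


def category_names(chains, sep="#"):
    """Merge the prefixes of several chains, dropping duplicates but keeping order."""
    seen = {}
    for chain in chains:
        for prefix in path_prefixes(chain, sep):
            seen.setdefault(prefix, None)
    return list(seen)


# The two chains from the example product in the question.
values = category_names([
    ["Electronics", "Computer", "Notebooks"],
    ["Electronics", "Computer", "Apple", "Macbooks"],
])
for value in values:
    print(value)
```

Storing every prefix is what lets a single filter on Electronics match products in any of its subcategories.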
When I want to retrieve all categories as a facet search, I just call Solr with facet=true&facet.field=category_names, and it returns something like
<lst name="facet_counts">
<lst name="facet_queries"/>
<lst name="facet_fields">
<lst name="category_names_sms">
<int name="Electronics">2831</int>
<int name="Electronics#Computer">1988</int>
<int name="Electronics#Computer#Apple">543</int>
...
</lst>
</lst>
</lst>
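To display these flat facet values as a tree, the client has to split each value on the separator. A rough Python sketch of that step, assuming the facet counts have already been parsed into a plain dict (the build_tree helper and its node layout are hypothetical):

```python
def build_tree(facet_counts, sep="#"):
    """Turn {"A#B": n, ...} into nested {"A": {"count": n, "children": {...}}}."""
    root = {}
    for path, count in facet_counts.items():
        children = root
        node = None
        for name in path.split(sep):
            # Create missing intermediate nodes with a placeholder count of 0.
            node = children.setdefault(name, {"count": 0, "children": {}})
            children = node["children"]
        node["count"] = count
    return root


# Counts copied from the facet response shown above.
counts = {
    "Electronics": 2831,
    "Electronics#Computer": 1988,
    "Electronics#Computer#Apple": 543,
}
tree = build_tree(counts)
computer = tree["Electronics"]["children"]["Computer"]
print(computer["count"], sorted(computer["children"]))
```

A node keeps the placeholder count of 0 only if Solr did not return a facet value for that exact path, which should not happen when every prefix is indexed.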
When I want to fetch only products from a certain category, I call Solr with fq=category_names:Electronics and it returns all the products from that category. And because every product also contains the path to the root category, I also get products from the subcategories.
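One practical detail when filtering on a deeper path: the # separator starts a URL fragment, so it has to be percent-encoded in the request. A small sketch using only the Python standard library; the category_query helper is hypothetical, q=*:* is an assumed match-all query, and the field name is the one used in the question (the name actually indexed may differ, as the stored output suggests):

```python
from urllib.parse import parse_qs, urlencode


def escape_phrase(value):
    """Escape backslashes and double quotes so the value can sit inside a quoted phrase."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def category_query(category_path, field="category_names"):
    """Build the query string for listing and faceting one category path."""
    params = [
        ("q", "*:*"),  # assumed match-all main query
        ("fq", f'{field}:"{escape_phrase(category_path)}"'),
        ("facet", "true"),
        ("facet.field", field),
    ]
    return urlencode(params)


query_string = category_query("Electronics#Computer")
print(query_string)
# Decoding shows the filter Solr would receive.
print(parse_qs(query_string)["fq"][0])
```

Quoting the value keeps the whole path together as one term, which matters once category names contain spaces.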
I've read some articles about pivot faceting and hierarchical faceting, and I'm a little confused about whether I'm using the Solr features correctly. My questions are:
... the # hashtag to split and parse the categories on the client side, and that's a point I don't like. Thanks a lot, I hope someone can point me in the right direction.
I can see the structure you have is quite complex; I would suggest not going that way with Solr.
Although Solr 4.0+ offers limited join functionality, that is not its strong point. Have a look at this article (especially the part "Hierarchy and Relations makes Solr sad"): http://bibwild.wordpress.com/2011/01/24/thinking-like-solr-its-not-an-rdbms/
and this one for help on how to denormalize your database to work best in Solr: http://mysolr.com/tips/denormalized-data-structure/
I also don't like that solution.
What will you do when a category name changes? You'll have to reindex all products in that category. I think it is better to do one DB query.
Solr supports pivot facets, so you can use those.
If the number of category levels is unlimited, you should use a dynamic field:
<field name="categories" type="int" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="category_*" type="int" indexed="true" stored="true" multiValued="true"/>
If you want to fetch products only from Electronics (for example, its id is 20 and its level is 1):
fq=categories:20&fq={!tag=no_subcat}NOT category_2:[* TO *]
And you can build facets for the child and grandchild categories of Electronics:
facet.pivot={!ex=no_subcat}category_2,category_3
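To make the scheme concrete, here is a rough Python sketch of how a document and the request parameters could be assembled. It assumes that category_<level> holds the ids of the product's categories at that level and that categories holds all of them; the helper names and every id other than 20 are made up for illustration:

```python
def category_fields(chains):
    """Build the category fields for one product from root-to-leaf id chains."""
    doc = {"categories": []}
    for chain in chains:
        for level, cat_id in enumerate(chain, start=1):
            if cat_id not in doc["categories"]:
                doc["categories"].append(cat_id)
            level_ids = doc.setdefault(f"category_{level}", [])
            if cat_id not in level_ids:
                level_ids.append(cat_id)
    return doc


def pivot_params(category_id, level, depth=2):
    """Parameters for one category: direct products plus pivot facets below it."""
    pivot_fields = ",".join(
        f"category_{n}" for n in range(level + 1, level + 1 + depth)
    )
    return [
        ("fq", f"categories:{category_id}"),
        # Tagged so the pivot facet can exclude this filter again.
        ("fq", f"{{!tag=no_subcat}}NOT category_{level + 1}:[* TO *]"),
        ("facet", "true"),
        ("facet.pivot", f"{{!ex=no_subcat}}{pivot_fields}"),
    ]


# Hypothetical ids: 20 = Electronics, 31 = Computer, 47 and 52 = two leaf categories.
doc = category_fields([[20, 31, 47], [20, 31, 52]])
print(doc)
for name, value in pivot_params(20, level=1):
    print(f"{name}={value}")
```

Because the fields store ids rather than names, renaming a category does not require reindexing; the names are looked up separately by id.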
I've never used Ruby.