
SOLR multi-valued fields

The Scenario:

I have the following (simplified) database table:

ID   ProductName          ProductCategory   Colour   Price
----------------------------------------------------------
1    BatmanTShirt         T-Shirt           Black    22
2    BatmanTShirt         T-Shirt           Blue     20
3    SupermanTShirt       T-Shirt           Blue     19
4    SpidermanTrousers    Trousers          Red      28
5    SpidermanTrousers    Trousers          Black    30

My Wish:

In the SOLR index, I would like this data to be mapped in a normalized way so that only 3 SOLR documents (as shown below) are created instead of 5.

<doc1>
  <ID>1</ID>
  <ProductName>BatmanTShirt</ProductName>
  <ProductCategory>T-Shirt</ProductCategory>
  <OtherDetails>{ {1, Black, 22}, {2, Blue, 20} }</OtherDetails>
</doc1>
<doc2>
  <ID>3</ID>
  <ProductName>SupermanTShirt</ProductName>
  <ProductCategory>T-Shirt</ProductCategory>
  <OtherDetails>{ {3, Blue, 19} }</OtherDetails>
</doc2>
<doc3>
  <ID>4</ID>
  <ProductName>SpidermanTrousers</ProductName>
  <ProductCategory>Trousers</ProductCategory>
  <OtherDetails>{ {4, Red, 28}, {5, Black, 30} }</OtherDetails>
</doc3>

Some Notes:

  • <ID> will contain the minimum ID from the group.
  • <OtherDetails> will contain the unique ID plus the other details that are left out when grouping. This would be a multi-valued field whose data type is a List holding another List of details {ID, Colour, Price}.
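To make the target shape concrete, here is a minimal Python sketch of the grouping described in the notes above. It is only an illustration, not a SOLR feature: the function name build_documents and the "ID|Colour|Price" string encoding are assumptions. The encoding is used because a SOLR multi-valued field holds a flat list of values rather than a nested list, so each detail triple is flattened into one string.

```python
# Rows as they appear in the database table from the question.
rows = [
    (1, "BatmanTShirt", "T-Shirt", "Black", 22),
    (2, "BatmanTShirt", "T-Shirt", "Blue", 20),
    (3, "SupermanTShirt", "T-Shirt", "Blue", 19),
    (4, "SpidermanTrousers", "Trousers", "Red", 28),
    (5, "SpidermanTrousers", "Trousers", "Black", 30),
]


def build_documents(rows):
    """Collapse rows that share ProductName and ProductCategory into one document.

    Hypothetical helper: the field names follow the question, while the
    "ID|Colour|Price" encoding of OtherDetails is an assumption.
    """
    grouped = {}
    for row_id, name, category, colour, price in rows:
        doc = grouped.setdefault(
            (name, category),
            {
                "ID": row_id,
                "ProductName": name,
                "ProductCategory": category,
                "OtherDetails": [],
            },
        )
        # <ID> holds the minimum ID of the group.
        doc["ID"] = min(doc["ID"], row_id)
        # One flat string per original row, so the field stays multi-valued.
        doc["OtherDetails"].append(f"{row_id}|{colour}|{price}")
    return sorted(grouped.values(), key=lambda d: d["ID"])


docs = build_documents(rows)
for doc in docs:
    print(doc)
```

Each resulting dictionary corresponds to one of the three documents sketched above, and could be sent to the index with whatever client is already in use.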

Question:

Does anyone know how this can be done?

PS

The reason for this 'grouping' is that I want to facet on ProductCategory. If I facet on ProductCategory now, the counts generated are:

T-Shirt (3)
Trousers (2)

What I want is to facet on ProductCategory while ignoring the Colour and Price data, so that I get only 2 T-Shirts (one Batman, one Superman) and only 1 Trousers (Spiderman's). Therefore what I want to show is this:

T-Shirt (2)
Trousers (1)
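The difference between the two sets of counts can be reproduced outside SOLR. The following is a small Python sketch under the assumption that a "product" is identified by its (ProductName, ProductCategory) pair; the variable names are illustrative only.

```python
from collections import Counter

# (ProductName, ProductCategory) for each of the five database rows.
rows = [
    ("BatmanTShirt", "T-Shirt"),
    ("BatmanTShirt", "T-Shirt"),
    ("SupermanTShirt", "T-Shirt"),
    ("SpidermanTrousers", "Trousers"),
    ("SpidermanTrousers", "Trousers"),
]

# Plain faceting: every row (one SOLR document per row) is counted.
per_row_counts = Counter(category for _, category in rows)

# Post-group faceting: each distinct product is counted once per category.
per_product_counts = Counter(category for _, category in set(rows))

print(dict(per_row_counts))      # {'T-Shirt': 3, 'Trousers': 2}
print(sorted(per_product_counts.items()))  # [('T-Shirt', 2), ('Trousers', 1)]
```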

I did some research and found out that this feature (called Post-Group Faceting or Matrix Counts) is currently a work in progress, as noted in this SOLR patch. So I want a temporary workaround, since this may take a while to finish.

The patch works fine for single-valued fields, so using this patch together with grouping is the best way to go.

Just index the data as it is in the database; that way you don't need multi-valued fields.
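As a rough sketch of what such a grouped, faceted request might look like once every row is indexed as its own document, the Python snippet below only builds the query URL and sends nothing. The host, port and core name (products) are hypothetical. The group.facet parameter is the one offered by later SOLR releases that ship result grouping; whether a build with the patch applied exposes exactly this parameter is an assumption.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust host, port and core name to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/products/select"

params = [
    ("q", "*:*"),
    # Collapse the five row-level documents into one group per product.
    ("group", "true"),
    ("group.field", "ProductName"),
    # Facet on the category field.
    ("facet", "true"),
    ("facet.field", "ProductCategory"),
    # Assumption: count facets per group rather than per document.
    ("group.facet", "true"),
]

query_url = SOLR_SELECT_URL + "?" + urlencode(params)
print(query_url)
```

With grouping on the single-valued ProductName field, the facet counts should then reflect products rather than rows, which is the T-Shirt (2) / Trousers (1) result asked for in the question.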

You can download the latest code with TortoiseSVN and apply the patch. Building the WAR (or JARs) is very easy in Eclipse: start a new project with the code you just downloaded and run the ant scripts in the build.xml files in the root and solr directories.
