
Apache Solr Schema Configuration

Original post 2014-06-23 19:02:47 3 1 solr

So I am pretty new to Apache Solr and have a situation I do not know how to handle. I am from an OO programming background, so first let me explain the object relationships:

Take an object called Movie that has two text fields, title and description. A movie can be associated with tags by a user. These tags are particular to the user and are not visible to other users.

So an example Movie could have something like this:

"Movie Title", "Description of the Movie" User1Tags: "tag1", "tag2" User2Tags: "action", "somethingElse" “电影标题”，“电影说明” User1Tags：“ tag1”，“ tag2” User2Tags：“ action”，“ somethingElse”

I need to design a schema/Solr query so that when user1 is searching for movies, if they type "action", the movie above will not show up. This is because user2, not user1, has associated "action" with "Movie Title".

Things I have considered:

1) Filter queries - these do not seem to work, because once the index per movie is built, I do not see how to avoid having every user's tags tied to the movie's index.

2) A separate core for movie-to-tag associations, and just doing two queries per search. I know I can do it this way, but making another core seems excessive to me.

Are there other options I am missing? Or is there a way to implement 1? Or is the simplest option just option 2, and that is how people who know what they are doing with Solr would do it?

1 Solution

How many users?

If not many, then you can have dynamic fields tag_user1, tag_user2 and modify the eDisMax field list to match or not match against them, e.g. by using a field name alias.
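As a rough sketch of the per-user field idea, the request parameters could be built on the client like this. It only assembles the parameters and does not talk to Solr. The field names title and description come from the question; the tag_* dynamic field pattern, the alias name tags, and the helper build_user_params are assumptions made for illustration.

```python
import re
from urllib.parse import urlencode

# Assumed schema: "title" and "description" are text fields, and a dynamic
# field pattern "tag_*" exists so that each user gets a field like tag_user1.
BASE_FIELDS = ("title", "description")

# Only allow simple ids, since the id becomes part of a field name.
SAFE_USER_ID = re.compile(r"^[A-Za-z0-9]+$")


def build_user_params(user_id, text):
    """Build eDisMax parameters that search only this user's tag field.

    Hypothetical helper: it restricts the query fields (qf) to the shared
    fields plus the requesting user's own tag field, so tags stored in
    another user's field are never matched.
    """
    if not SAFE_USER_ID.match(user_id):
        raise ValueError("unexpected characters in user id: %r" % user_id)
    user_field = "tag_" + user_id
    return {
        "defType": "edismax",
        "q": text,
        # Search the shared fields plus this user's tag field only.
        "qf": " ".join(BASE_FIELDS + (user_field,)),
        # Per-field alias: a query on "tags:..." is mapped to the user's field.
        "f.tags.qf": user_field,
    }


if __name__ == "__main__":
    # Prints the query string that would be appended to a /select request.
    print(urlencode(build_user_params("user1", "action")))
```

With these parameters a search by user1 for "action" never looks at tag_user2, so the example movie would match only through its title or description.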

The other option is to prefix the values with the user ID. So the tags field would have: user1_tag1, user1_tag2, user2_action, user2_somethingElse. Then you need a custom filter in the query chain that prefixes your search tokens with the user of the request, so only that user's prefixed values would match.
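A minimal sketch of the prefixing idea follows. Note that it does the prefixing on the client rather than in a custom Solr filter, which is a simpler stand-in for what is described above. It assumes a multi-valued, non-tokenized field named tags (for example a string type), so the prefix stays attached to the value; the helper names are hypothetical.

```python
def prefix_tags(user_id, tags):
    """Return the tags with the owning user's id prepended, e.g. user1_tag1."""
    return [user_id + "_" + tag for tag in tags]


def build_index_doc(movie_id, title, description, tags_by_user):
    """Build the document to index, with every user's tags in one field."""
    tags = []
    for user_id, user_tags in tags_by_user.items():
        tags.extend(prefix_tags(user_id, user_tags))
    return {
        "id": movie_id,
        "title": title,
        "description": description,
        "tags": tags,  # assumed multi-valued, non-tokenized field
    }


def quote_term(term):
    """Wrap a term in double quotes, escaping backslashes and quotes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_tag_clause(user_id, text):
    """Prefix each search token with the requesting user's id."""
    tokens = text.split()
    if not tokens:
        raise ValueError("empty search text")
    prefixed = prefix_tags(user_id, tokens)
    return "tags:(" + " OR ".join(quote_term(t) for t in prefixed) + ")"


if __name__ == "__main__":
    doc = build_index_doc(
        "m1",
        "Movie Title",
        "Description of the Movie",
        {"user1": ["tag1", "tag2"], "user2": ["action", "somethingElse"]},
    )
    print(doc["tags"])
    print(build_tag_clause("user1", "action"))
```

Because user1's search for "action" becomes a lookup for user1_action, it cannot match the user2_action value stored on the example movie.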