
Merge 2 different languages into a single SOLR index using suffix - then how to query

I know that there are similar questions about Solr, and I have read them all. Some give insights, but none offers a solution for exactly what I am trying to do.

  1. I have a table events that contains the columns eventid, name, and description, in English.
  2. I have a table esp_events that contains the columns eventID, name, and description, in Spanish.

Right now we only index the English version, so I want to add the Spanish version to the Solr index as well. As the eventid is identical in both tables, I don't want to include it in the indexing portion, but obviously we will need it to pull the data from both tables using the same eventid.

So my questions are:

  1. How do I define the data to be indexed (name, name_esp, description, description_esp)?
  2. Do I need to define a table that the data is sourced from, and if so, how is that done?
  3. How do I tell the PHP application to request that the search be done against the English or the Spanish version of the fields being searched?

I did not set up the original config for Solr, so I would appreciate you letting me know which files need to be modified to get this all to work, e.g. solrconfig.xml and schema.xml, plus any I am not aware of.

I am also open to a completely different solution from the one I outlined, as long as it's not too complex.

Thanks.

This is usually implemented by having a separate version of each field in the schema for each language, such as name_en, name_es, description_en, description_es, etc. (as you write).
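As a rough illustration of the per-language fields, the sketch below builds a Solr Schema API payload that would add them. It assumes a Solr version with a managed schema and the Schema API enabled, and it assumes the field types text_en and text_es exist (they ship with Solr's sample configsets, but your schema may differ). With a classic schema.xml you would write the equivalent field elements by hand instead.

```python
import json

# Assumed field types: "text_en" and "text_es" come from Solr's sample
# configsets; substitute whatever analyzers your schema actually defines.
LANGUAGE_FIELD_TYPES = {"en": "text_en", "es": "text_es"}
BASE_FIELDS = ("name", "description")


def build_add_field_commands(base_fields=BASE_FIELDS, types=LANGUAGE_FIELD_TYPES):
    """Return a Schema API payload with one field per (base field, language)."""
    commands = []
    for base in base_fields:
        for lang, field_type in types.items():
            commands.append({
                "name": f"{base}_{lang}",  # e.g. name_en, name_es
                "type": field_type,
                "indexed": True,
                "stored": True,
            })
    return {"add-field": commands}


if __name__ == "__main__":
    # The JSON printed here is what you would POST to the core's /schema endpoint.
    print(json.dumps(build_add_field_commands(), indent=2))
```

The suffix convention keeps each language's text under its own analyzer, so stemming and stop words can differ between English and Spanish.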

If you're using DIH (the DataImportHandler), you can perform a join in the query (or use a nested entity) to retrieve the fields from the alternative language table as well.
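The join itself might look like the query in this sketch, which runs it against a throwaway in-memory SQLite database so the result shape is easy to see. The sample rows are invented, and mapping eventid to a Solr id field is an assumption about how the unique key is set up. In DIH the same SELECT would go into the entity's query attribute.

```python
import sqlite3

# One row per event, with both languages aliased to suffixed Solr field names.
# LEFT JOIN keeps events that have no Spanish translation yet.
JOIN_QUERY = """
SELECT e.eventid     AS id,
       e.name        AS name_en,
       e.description AS description_en,
       s.name        AS name_es,
       s.description AS description_es
FROM events e
LEFT JOIN esp_events s ON s.eventID = e.eventid
ORDER BY e.eventid
"""


def fetch_solr_docs(conn):
    """Run the join and return one dict per event, keyed by Solr field name."""
    conn.row_factory = sqlite3.Row
    return [dict(row) for row in conn.execute(JOIN_QUERY)]


def demo_connection():
    """Build an in-memory database with made-up sample rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE events (eventid INTEGER PRIMARY KEY, name TEXT, description TEXT);
        CREATE TABLE esp_events (eventID INTEGER PRIMARY KEY, name TEXT, description TEXT);
        INSERT INTO events VALUES (1, 'Concert', 'Live music'), (2, 'Fair', 'Local fair');
        INSERT INTO esp_events VALUES (1, 'Concierto', 'Música en vivo');
    """)
    return conn


if __name__ == "__main__":
    for doc in fetch_solr_docs(demo_connection()):
        print(doc)
```

Each resulting row becomes a single Solr document, so one eventid carries both language versions.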

If you know which language you're querying in, you can use qf (query fields, available with the dismax and edismax query parsers) to tell Solr which fields to search: name_es and description_es if the search is in Spanish, name_en and description_en if it's in English.
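A minimal sketch of the request side follows. It is written in Python rather than PHP, but the request parameters are the same from any client. The base URL and the core name events are hypothetical, and the sketch assumes the edismax query parser is available. Note that qf takes a space-separated field list.

```python
from urllib.parse import urlencode

# Fields to search for each language, following the suffix convention.
QUERY_FIELDS = {
    "en": ("name_en", "description_en"),
    "es": ("name_es", "description_es"),
}


def build_search_params(user_query, language):
    """Return Solr request parameters that restrict the search to one language."""
    if language not in QUERY_FIELDS:
        raise ValueError(f"unsupported language: {language!r}")
    return {
        "q": user_query,
        "defType": "edismax",                      # qf needs dismax/edismax
        "qf": " ".join(QUERY_FIELDS[language]),    # space-separated field list
        "wt": "json",
    }


def build_search_url(base_url, user_query, language):
    """Build the full /select URL; base_url is a hypothetical core address."""
    return f"{base_url}/select?{urlencode(build_search_params(user_query, language))}"


if __name__ == "__main__":
    print(build_search_url("http://localhost:8983/solr/events", "fiesta", "es"))
```

The application only has to pass the user's language along with the search text; the index itself does not change between languages.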

There is also a feature in more recent versions of Solr (3.5 and up) for explicit Language Detection.
