How to define document-level and data-level fields for a Solr schema

Question

I have a simple file named test.csv with the following data:

id,author,title
1,sanjay,ABC
2,vijay,XYZ

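I want to index this file in Solr and pass it a unique ID, say id=1, so that I can later query this document as a whole (meaning all of its values, the equivalent of SELECT * FROM tablename). Likewise, I want to index many such files, each with its own file-level ID, e.g. id=2, id=3, and so on.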

In my schema.xml, id is a field:

 <field name="id" type="string" indexed="true" stored="true" />

and

 <!-- Field to use to determine and enforce document uniqueness.
  Unless this field is marked with required="false", it will be a required field
 -->
 <uniqueKey>id</uniqueKey>

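In the case where id is not present in the file but I want to pass id as a parameter for document-level uniqueness, I get the following error message: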

 [root@****ltest1 garyTestDocs]# curl  http://localhost:8983/solr/update/csv?id='SL1' --data-binary @sample.csv -H    'Content-type:text/plain; charset=utf-8'
 <html>
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
 <title>Error 400 [doc=null] missing required field: ref</title> 
 </head>
 <body><h2>HTTP ERROR 400</h2> 
 <p>Problem accessing /solr/update/csv. Reason:
 <pre>    [doc=null] missing required field: id</pre></p><hr /><i><small>Powered by  Jetty://</small></i><br/>                                                
 <br/>                                                
 <br/>                                                
 <br/>                                                
 <br/>                                                
 <br/>                                                
 <br/>                                                

 </body>
 </html>

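So essentially there are two cases: indexing the sample file above with an id column inside the file, and indexing it without an id column. In both cases I need to pass a document-level unique ID, i.e. id='1' or id='2'.

Could you explain your answer for both cases, with the curl syntax and schema.xml (just the required fields)?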

Answer 1

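In Solr, think of schema.xml as a database table. To keep rows unique, a table has a primary key column, typically an id column holding unique values. When you index a document in Solr (in my case a CSV file), the file contains columns, and the id column must be unique and must not have empty rows. There are many ways to generate unique strings; for the sake of example I used the format file_name_1, file_name_2, ... (the file name followed by a running sequence 1, 2, 3, ...). This is the only way to specify record uniqueness in Solr. You cannot have document-level uniqueness, meaning you cannot supply the unique key at index time. So the uniqueKey tag in schema.xml simply names the column in your documents that holds the unique values.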
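The id format described above can be generated before indexing. The following is only a minimal sketch under stated assumptions: the function name `add_id_column` and the prefix `test` are hypothetical, the input is assumed to have a header row and no id column, and fields are assumed not to contain embedded newlines.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr): prepend an "id" column to a CSV
# read from stdin. The header row gets "id"; each non-empty data row gets
# PREFIX_1, PREFIX_2, ... in order of appearance.
add_id_column() {
    awk -v prefix="$1" '
        NR == 1 { print "id," $0; next }          # header line
        NF      { print prefix "_" (++n) "," $0 } # data line; blank lines are skipped
    '
}

# Demo with a two-row file that has no id column.
printf 'author,title\nsanjay,ABC\nvijay,XYZ\n' | add_id_column test
# prints:
# id,author,title
# test_1,sanjay,ABC
# test_2,vijay,XYZ
```

Redirecting the output to a new file gives a CSV whose first column can serve as the uniqueKey, and that file can then be posted with the same kind of curl command used in this answer.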

The command to index the CSV file is as follows:

 curl http://:8983/solr/update/csv --data-binary @Sample.csv -H 'Content-type:text/plain; charset=utf-8'

schema.xml will have an id column:

 <field name="id" type="string" indexed="true" stored="true" />

some columns from my document:

 <field name="author" type="text" indexed="true" stored="true" />
 <field name="title" type="text" indexed="true" stored="true" />


 <uniqueKey>id</uniqueKey>
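Because id is the uniqueKey, a later document with the same id replaces the earlier one, and a row without an id is rejected with the "missing required field" error shown in the question. A pre-check along these lines may help catch both before posting. It is a rough sketch only: `check_ids` is a hypothetical name, the id is assumed to be the first column, and the naive comma split does not handle quoted fields.

```shell
#!/bin/sh
# Hypothetical pre-check (not part of Solr): read a CSV with a header row
# from stdin and verify that the first column of every data row is
# non-empty and unique. Prints offending lines; returns 1 if any are found.
check_ids() {
    awk -F, '
        NR == 1    { next }                                   # skip header
        $1 == ""   { print "line " NR ": empty id"; bad = 1; next }
        seen[$1]++ { print "line " NR ": duplicate id " $1; bad = 1 }
        END        { exit bad + 0 }
    '
}

# Demo on the sample data from the question.
printf 'id,author,title\n1,sanjay,ABC\n2,vijay,XYZ\n' | check_ids && echo "ids ok"
```

Running the check as a separate step keeps the indexing command itself unchanged.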

At index time I did not use a document-level unique ID. So I hope I have answered my own question!

0 Accepted 2013-05-15 16:42:21