Solr/Carrot2 Integration

I have multiple text files. For each one I created an XML file to index the document in Solr, as below:

<add>
  <doc>
    <person>data </person>
    <organization>data here </organization>
    <content>Some spanish text here</content>
  </doc>
</add>

Schema used for indexing:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="person" type="string" indexed="true" stored="true" required="true" multiValued="true" />
<field name="orgnization" type="string" indexed="true" stored="true" required="true" multiValued="true" />
<field name="content" type="text_es" indexed="true" stored="true" multiValued="true" />
<field name="location" type="string" indexed="true" stored="true" required="true" multiValued="true" />

Now I am trying to integrate Carrot2 clustering. For that I followed this guide: http://carrot2.github.io/solr-integration-strategies/carrot2-3.8.0/index.html

My problem is that the clustering query returns only one cluster, as below:

<arr name="clusters">
  <lst>
    <arr name="labels">
      <str>Other Topics</str>
    </arr>
    <double name="score">0.0</double>
    <bool name="other-topics">true</bool>
    <arr name="docs">
      <str>#.txt</str>
      <str>abci-britanicos-pizzerias-201312120250.txt</str>
      <str>abci-arqueologos-israelis-descubren-primer-201312111303.txt</str>
      <str>abci-autoridad-fiscal-pensiones-201312111956.txt</str>
      <str>abci-buenas-razones-para-cambiar-201312110933.txt</str>
      <str>abci-audio-asamblea-aserpinto-201312112139.txt</str>
      <
    </arr>
  </lst>
</arr>

I should get more clusters. My corpus contains 60 text documents.

In order for search results clustering to work in Solr, the title and content fields you pass for clustering must be stored. The declaration in the Solr schema could look like this:

<field name="content" type="text" indexed="true" stored="true" />
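
The clustering component also needs to be told which stored fields to read. Here is a rough sketch of a request handler for solrconfig.xml, not a verified fix for this setup. It assumes the schema from the question (no title field, so `content` is used for both title and snippet) and a search component already registered under the name `clustering`. The handler name `/clustering` and the `rows` value are illustrative choices.

```xml
<requestHandler name="/clustering" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="clustering">true</bool>
    <bool name="clustering.results">true</bool>
    <!-- Assumption: the schema has no title field, so content is used for both -->
    <str name="carrot.title">content</str>
    <str name="carrot.snippet">content</str>
    <!-- Assumption: request enough rows that all 60 documents can be clustered -->
    <int name="rows">100</int>
  </lst>
  <arr name="last-components">
    <!-- Must match the name of the registered clustering search component -->
    <str>clustering</str>
  </arr>
</requestHandler>
```

If `carrot.title` and `carrot.snippet` point at fields that are missing or not stored, the algorithm gets no text to work with, which could explain everything landing in a single "Other Topics" cluster.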

In addition to what Stanislaw said about fields being stored, please provide the query you used for clustering and, ideally, the full schema used to index your data.

If you have a mere 60 documents in your index and the query matches only a small subset of them, there will be nothing to cluster on.
