Carrot2 cluster on multiple arguments

Question

Hello,

I am working on a search engine that uses Solr and Carrot2.

Everything works, but Carrot2 does something I can't understand. I want to get results from Solr and cluster them using Carrot2. The integration works, but Carrot2 clusters on just one of my attributes: the one that matched the query, and no other. For example:

Data

name: Peter town: London hobby: golf, skiing

name: Arthur town: Berlin hobby: golf, biking

name: Paris town: London hobby: golf, hiking

Searching: golf

Clusters look like: skiing, biking, hiking

...but not London.

That wouldn't surprise me by itself, but when I use the Carrot2 Clustering Workbench, it does cluster on the other attributes.

At first I tried exporting the configuration from the Workbench into solrconfig.xml, but it changed nothing. Solr picks up the settings, but none of them has any effect on this issue.

Can anyone help me or explain it?

Answer 1

You need to put the names of fields to cluster on in your solrconfig.xml. To replicate the configuration that worked for you in Carrot2 Clustering Workbench, put these in your clustering request handler (or provide in the query URL):

<!-- In Workbench this is "Title field name" -->
<str name="carrot.title">name</str>

<!-- In Workbench this is "Summary field name" -->
<str name="carrot.snippet">features</str>
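
For context, here is a rough sketch of where those two parameters might sit in solrconfig.xml. It is modeled on the clustering handler from the Solr example configuration, not on the asker's actual setup: the handler name "/clustering", the engine name "default", and a search component registered as "clustering" are all assumptions, and the field names should be replaced with fields from your own schema.

```xml
<!-- Sketch only: handler name, engine name, and component name are assumed -->
<requestHandler name="/clustering" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Turn clustering on for requests to this handler -->
    <bool name="clustering">true</bool>
    <str name="clustering.engine">default</str>
    <bool name="clustering.results">true</bool>

    <!-- Fields Carrot2 reads as the document title and summary -->
    <str name="carrot.title">name</str>
    <str name="carrot.snippet">features</str>
  </lst>
  <!-- Assumes a searchComponent named "clustering" is defined elsewhere -->
  <arr name="last-components">
    <str>clustering</str>
  </arr>
</requestHandler>
```

The same two parameters can instead be appended to the query URL, for example `&carrot.title=name&carrot.snippet=features`, which is handy for trying out different fields without editing the config.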

In general, Carrot2 works best with natural / unstructured text, such as search results, document abstracts or content. If your fields contain strings denoting some structured data, the clusters will likely be far from what you're expecting (and from what a dedicated clustering algorithm could produce).

1 answer

solution1
0 ACCEPTED 2011-07-14 10:15:41