Solr - how to sort by frequency of values on a specific field?
I'm working with Java and SolrJ in Eclipse. How can I sort the results of a SolrQuery by how often each value occurs in a certain field?
For example, when I search for the top n articles (docType=0) of a particular author, I want to sort the query results by the frequency of the values in the journal_facet field (type string).
If a certain author X has written six articles, a0 and a1 in journal J0, a2, a3 and a4 in J1, and a5 in J2, the order has to be a2, a3, a4, a0, a1, a5, and I want to show the results in the following way:
<doc>
<arr name="author">
<str>X</str>
</arr>
<str name="title">a2</str>
<str name="journal">J1</str>
</doc>
<doc>
<arr name="author">
<str>X</str>
</arr>
<str name="title">a3</str>
<str name="journal">J1</str>
</doc>
<doc>
<arr name="author">
<str>X</str>
</arr>
<str name="title">a4</str>
<str name="journal">J1</str>
</doc>
<doc>
<arr name="author">
<str>X</str>
</arr>
<str name="title">a0</str>
<str name="journal">J0</str>
</doc>
<doc>
<arr name="author">
<str>X</str>
</arr>
<str name="title">a1</str>
<str name="journal">J0</str>
</doc>
<doc>
<arr name="author">
<str>X</str>
</arr>
<str name="title">a5</str>
<str name="journal">J2</str>
</doc>
My query is:
SolrServer solrServer = new HttpSolrServer(urlString);
SolrQuery query = new SolrQuery();
query.set("q", "docType:0");
query.set("fq", "author:X");
query.set("fl", "author, title, journal");
query.setRows(n);
...
QueryResponse response = solrServer.query(query);
SolrDocumentList results = response.getResults();
and in my Solr schema.xml there are the following fields and types:
<types>
...
<fieldType name="text_title" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory" />
<charFilter class="solr.HTMLStripCharFilterFactory" />
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
stemEnglishPossessive="1" preserveOriginal="1" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
<filter class="solr.KStemFilterFactory" />
<filter class="solr.RemoveDuplicatesTokenFilterFactory" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory" />
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
stemEnglishPossessive="1" preserveOriginal="1" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.KStemFilterFactory" />
<filter class="solr.RemoveDuplicatesTokenFilterFactory" />
</analyzer>
</fieldType>
<fieldType name="text_name" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory" />
<charFilter class="solr.HTMLStripCharFilterFactory" />
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="1" splitOnCaseChange="1" />
<filter class="solr.LowerCaseFilterFactory" />
<!-- n-grams useful for prefix search -->
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
<!-- <filter class="solr.RemoveDuplicatesTokenFilterFactory"/> -->
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory" />
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" />
<filter class="solr.LowerCaseFilterFactory" />
<!-- <filter class="solr.RemoveDuplicatesTokenFilterFactory"/> -->
</analyzer>
</fieldType>
</types>
<fields>
<field name="docType" type="tint" indexed="true" stored="true"
multiValued="false" required="true" />
<field name="key" type="string" indexed="true" stored="true"
multiValued="false" required="true" />
<field name="mdate" type="date" indexed="true" stored="true"
multiValued="false" required="true" />
...
<field name="author" type="text_name" indexed="true" stored="true"
multiValued="true" />
...
<field name="journal" type="text_title" indexed="true" stored="true"
multiValued="false" />
<field name="title" type="text_title" indexed="true" stored="true"
multiValued="false" />
...
<field name="journal_facet" type="string" indexed="true" stored="true"
multiValued="false" />
...
<copyField dest="journal_facet" source="journal" />
...
</fields>
Thanks a lot for your help.
What about writing a custom function query and sorting by it:
http://localhost:8983/solr/select?q=*:*&sort=dist(2, point1, point2) desc
If you facet your results, just use facet.sort to get the facets sorted by frequency:
https://wiki.apache.org/solr/SimpleFacetParameters#facet.sort
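Note that facet.sort orders the list of facet values, not the documents themselves, so the documents still have to be reordered on the client. The following is a minimal sketch of that step in plain Java, assuming the per-journal counts have already been read from the facet response into a map. The class JournalFrequencySort, the Article holder and the method sortByFacetCount are hypothetical names used only for illustration; they are not part of SolrJ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of SolrJ: reorders documents by facet count.
class JournalFrequencySort {

    // Hypothetical minimal stand-in for a Solr document (title + journal only).
    static final class Article {
        final String title;
        final String journal;

        Article(String title, String journal) {
            this.title = title;
            this.journal = journal;
        }
    }

    // Returns a new list ordered by journal frequency (highest first).
    // Ties between journals are broken by journal name so that articles of
    // the same journal stay together; within a journal, titles are ascending.
    static List<Article> sortByFacetCount(List<Article> docs,
                                          Map<String, Long> facetCounts) {
        List<Article> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator
                // Negate the count to get descending order; journals missing
                // from the facet map are treated as count 0.
                .comparingLong((Article a) -> -facetCounts.getOrDefault(a.journal, 0L))
                .thenComparing(a -> a.journal)
                .thenComparing(a -> a.title));
        return sorted;
    }
}
```

With the six articles from the question and the counts J1=3, J0=2, J2=1, this returns a2, a3, a4, a0, a1, a5. In SolrJ the counts would presumably come from enabling faceting on the query (query.setFacet(true), query.addFacetField("journal_facet")) and reading response.getFacetField("journal_facet").getValues(). The rows limit is applied before this reordering, so it may be necessary to fetch all of the author's articles and truncate the list to n after sorting.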