
Solr facet equivalent of group by?

If I have some data like this:

{"field1":"x", "field2":".."}
{"field1":"x", "field2":".."}
{"field1":"y", "field2":".."}
{"field1":"y", "field2":".."}
{"field1":"y", "field2":".."}

Using a simple group=true&group.field=field1&group.limit=0 I get results like this:

{
  "responseHeader":{..},
  "grouped":{
    "field1":{
      "matches": 5,
      "groups": [
        {"groupValue": "x", "doclist":{"numFound": 2, ...}},
        {"groupValue": "y", "doclist":{"numFound": 3, ...}}
      ]
    }
  }
}

From this I know the number of documents found for each groupValue (numFound). The problem is that I need to sort the resulting groups by that count in descending order, which is not possible with either sort parameter: a simple sort=numFound throws an exception saying the field numFound does not exist, and group.sort only sorts the documents inside each group.

Is there an equivalent of this using facets where I can sort the results by count?

You can try:

http://localhost:8983/solr/your_core/select?facet.field=field1&facet.sort=count&facet.limit=-1&facet=on&indent=on&q=*:*&rows=0&start=0&wt=json

The result will be something like:

{
  "responseHeader":{
    "status":0,
    "QTime":17,
    "params":{
      "q":"*:*",
      "facet.field":"field1",
      "indent":"on",
      "start":"0",
      "rows":"0",
      "facet":"on",
      "wt":"json"}},
  "response":{"numFound":225364,"start":0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "field1":[
        "x",113550,
        "y",111814]},
    "facet_ranges":{},
    "facet_intervals":{},
    "facet_heatmaps":{}
  }
}

Just tested with Solr 6.3.0.

For more information, see the faceting section of the Solr documentation.
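
Note that facet_fields comes back as a flat list of alternating values and counts, so a client usually has to pair them up. A minimal Python sketch of that, assuming the response has already been parsed from JSON into a dict; the helper names facet_url and facet_pairs are made up for illustration and are not part of Solr:

```python
from urllib.parse import urlencode


def facet_url(base, field, limit=-1):
    """Build a select URL that facets on `field`, most frequent values first."""
    params = {
        "q": "*:*",
        "rows": 0,                # no documents needed, only the facet counts
        "facet": "on",
        "facet.field": field,
        "facet.sort": "count",    # order facet values by descending count
        "facet.limit": limit,     # -1 asks for all values
        "wt": "json",
    }
    return base + "/select?" + urlencode(params)


def facet_pairs(response, field):
    """Turn the flat [value, count, value, count, ...] list into (value, count) tuples."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


# Trimmed copy of the response shown above.
sample = {"facet_counts": {"facet_fields": {"field1": ["x", 113550, "y", 111814]}}}
print(facet_pairs(sample, "field1"))  # [('x', 113550), ('y', 111814)]
```

The pairs keep the order Solr returned, so with facet.sort=count they are already sorted from the largest group to the smallest.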

If you also want the number of distinct facet values in the same request, you can use the Solr Stats Component (this works as long as the field is of a numeric, string, or date type).
Keep in mind, though, that this can add server performance and memory overhead.

Running a query like:

http://localhost:8983/solr/your_core/select?facet.field=field1&facet.sort=count&facet.limit=10&facet=true&indent=on&q=*:*&rows=0&start=0&wt=json&stats=true&stats.field={!cardinality=true}field1

The response is something like:

{
  "responseHeader":{
    "status":0,
    "QTime":614,
    "params":{
      "facet.limit":"10",
      "q":"*:*",
      "facet.field":"field1",
      "indent":"on",
      "stats":"true",
      "start":"0",
      "rows":"0",
      "facet":"true",
      "wt":"json",
      "facet.sort":"count",
      "stats.field":"{!cardinality=true}field1"}},
  "response":{"numFound":2336315,"start":0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "field1":[
        "Value1",708116,
        "Value2",607088,
        "Value3",493949,
        "Value4",314433,
        "Value5",104478,
        "Value6",41099,
        "Value7",28879,
        "Value8",18767,
        "Value9",9308,
        "Value10",4545]},
    "facet_ranges":{},
    "facet_intervals":{},
    "facet_heatmaps":{}},
  "stats":{
    "stats_fields":{
      "field1":{
        "cardinality":27}}}}

For more information about stats, see the Stats Component section of the Solr documentation.
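
When sending this query from code, the {!cardinality=true} local parameter contains characters that should be URL-encoded. A small Python sketch, assuming the same core URL as above; stats_facet_url and distinct_values are hypothetical helper names, and the cardinality Solr reports is an estimate rather than an exact count:

```python
from urllib.parse import urlencode


def stats_facet_url(base, field, limit=10):
    """Build a URL that returns the top facet values plus the field's cardinality."""
    params = {
        "q": "*:*",
        "rows": 0,
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.sort": "count",
        "facet.limit": limit,
        "stats": "true",
        # Local param asking the Stats Component for the distinct-value estimate.
        "stats.field": "{!cardinality=true}" + field,
    }
    return base + "/select?" + urlencode(params)  # urlencode escapes { ! = }


def distinct_values(response, field):
    """Read the cardinality for `field` out of a parsed JSON response."""
    return response["stats"]["stats_fields"][field]["cardinality"]


# Trimmed copy of the stats part of the response shown above.
sample = {"stats": {"stats_fields": {"field1": {"cardinality": 27}}}}
print(distinct_values(sample, "field1"))  # 27
```

Here 27 is the number of distinct values in field1, while facet.limit=10 caps the facet list at the ten most frequent ones.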

