How to use countDistinct to get distinct values grouped by … without using json.facet

Using Solr 5.2.1, I'm trying to do something that in SQL would look like:

SELECT COUNT(DISTINCT(SESSION_ID)), COUNTRY FROM LOG
GROUP BY COUNTRY

The following answer would work, but it uses json.facet, and I would like to create a Banana panel for this query without having to rewrite the query and filter services.

This is what I have so far:

stats.countDistinct=true stats.distinctValues=true/false

JSON response:

  "responseHeader":{
    "status":0,
    "QTime":3,
    "params":{
      "q":"*:*",
      "stats.countDistinct":"true",
      "indent":"true",
      "stats":"true",
      "stats.facet":"country_s",
      "fq":"serverUtc_dt:[2015-09-01T07:59:00.000Z TO 2015-09-01T07:59:01.000Z]",
      "rows":"0",
      "wt":"json",
      "stats.distinctValues":"false",
      "stats.field":"sessionid_s"}},

It does not matter whether distinctValues is true or false; no countDistinct value is provided in the result.

The following:

stats.calcdistinct=true

JSON response:

  "responseHeader":{
    "status":0,
    "QTime":7,
    "params":{
      "q":"*:*",
      "stats.calcdistinct":"true",
      "indent":"true",
      "stats":"true",
      "stats.facet":"country_s",
      "fq":"serverUtc_dt:[2015-09-01T07:59:00.000Z TO 2015-09-01T07:59:01.000Z]",
      "rows":"0",
      "wt":"json",
      "stats.distinctValues":"false",
      "stats.field":"sessionid_s"}},

This seems to do what I want, but it adds hundreds of thousands of distinctValues to the result.

According to the documentation, calcdistinct sets both countDistinct and distinctValues to true, but replacing calcdistinct with countDistinct=true and distinctValues=true does not produce the same result.

Is there a way to get the count distinct without getting the hundreds of thousands of distinct values as well?

Can this be done without using json.facet?

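You have to use the stats.field param to solve this; distinctValues and countDistinct can't be used directly as stats.* request parameters. Instead, they are set as local params on the field, for example {!distinctValues=false}.

In my case, I only needed the distinct count of the primary domains: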

"params":{
      "q":"*:*",
      "stats.calcdistinct":"true",
      "indent":"true",
      "stats":"true",
      "rows":"0",
      "wt":"json",
      "stats.field":["{!key=c_primary_domain}c_primary_domain",
        "{!distinctValues=false}c_primary_domain"]}},

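As a rough sketch, the parameters from the answer above could be assembled into a request URL like this. The base URL and core name (`logs`) and the helper name `build_stats_query` are hypothetical, not from the original post. The optional `stats.facet` parameter is taken from the question; whether it combines with these local params as hoped on Solr 5.2.1 is an assumption that would need to be checked against a live index.

```python
from urllib.parse import urlencode, parse_qs


def build_stats_query(field, group_by=None,
                      base_url="http://localhost:8983/solr/logs/select"):
    """Build a Solr select URL using the stats parameters from the answer.

    base_url is a placeholder (hypothetical host and core name).
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("wt", "json"),
        ("indent", "true"),
        ("stats", "true"),
        ("stats.calcdistinct", "true"),
        # Same two stats.field entries as in the answer's params block.
        ("stats.field", "{!key=%s}%s" % (field, field)),
        ("stats.field", "{!distinctValues=false}%s" % field),
    ]
    if group_by is not None:
        # stats.facet as used in the question; combining it with the
        # local params above is an untested assumption.
        params.append(("stats.facet", group_by))
    # urlencode percent-encodes the braces, "!" and "=" in the local params.
    return base_url + "?" + urlencode(params)


url = build_stats_query("c_primary_domain")
print(url)
```

Repeating `stats.field` as separate key/value pairs mirrors the list form shown in the echoed `params` block. Calling `build_stats_query("sessionid_s", group_by="country_s")` would add the `stats.facet` parameter from the question.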